xc-llm-ascend

Author	SHA1	Message	Date
meihanc	592cfb6a6f	[CI] Add Triton Ascend in CI (#4921) Add triton-ascend in UT and e2e - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>	2025-12-23 12:47:35 +08:00
zhenwenqi2024	4721e4f53f	[bugfix] asyncscheduler bug fix (#4968) ### What this PR does / why we need it? vllm-ascend now uses AsyncGPUModelRunnerOutput; the AsyncNPUModelRunnerOutput used before is outdated, so this PR fixes it. - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: zhenwenqi2024 <zhenwenqi_2022@qq.com>	2025-12-13 17:04:54 +08:00
Ronald	1a7a34c5ec	add e2e test for mtp async_scheduling (#4826) ### What this PR does / why we need it? Add an e2e test for MTP async scheduling. ### Does this PR introduce _any_ user-facing change? No. - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: Ronald1995 <ronaldautomobile@163.com>	2025-12-10 11:30:22 +08:00
Ronald	3480094d7c	support async mtp (#4511) ### What this PR does / why we need it? This PR adds async_scheduling support for MTP, following vLLM PR https://github.com/vllm-project/vllm/pull/24799, and fixes some synchronization problems in vllm-ascend. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: Ronald1995 <ronaldautomobile@163.com> Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-12-06 17:15:57 +08:00