xc-llm-ascend

Author	SHA1	Message	Date
wangxiyuan	835b4c8f1d	Drop torchair (#4814 ) aclgraph is stable and fast now. Let's drop torchair graph mode now. TODO: some logic to adapt torchair should be cleaned up as well. We'll do it in the following PR. - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com> Co-authored-by: Mengqing Cao <cmq0113@163.com>	2025-12-10 09:20:40 +08:00
wangxiaoteng888	a77045f355	[P/D][main]Offline the llmdatadist connector related parts of the code and files. (#4780 ) ### What this PR does / why we need it? As support for the mooncake connector is now available, the llmdatadist connector is no longer being maintained, so the llmdatadist-related files need to be retired. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By ci - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com> Signed-off-by: liziyu <liziyu16@huawei.com> Co-authored-by: liziyu <liziyu16@huawei.com>	2025-12-09 22:36:43 +08:00
Li Wang	4813cefc58	[CI] Setup github proxy for self_hosted runners (#4841 ) ### What this PR does / why we need it? Setup github proxy for self_hosted runners --------- Signed-off-by: wangli <wangli858794774@gmail.com>	2025-12-09 20:35:43 +08:00
Li Wang	9038865261	[CI] Optimize CI time (#4821 ) ### What this PR does / why we need it? Considering that long queues severely impact the developer experience, we have decided to make the following changes: 1. Changes will use the self_hosted runner 2. e2e-2card will use the A3 node. - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: wangli <wangli858794774@gmail.com> Co-authored-by: Mengqing Cao <cmq0113@163.com>	2025-12-09 16:09:37 +08:00
dependabot[bot]	3c3c9a5386	Bump actions/checkout from 6.0.0 to 6.0.1 (#4772 ) Bumps [actions/checkout](https://github.com/actions/checkout) from 6.0.0 to 6.0.1. - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-12-08 19:15:40 +08:00
Li Wang	4b016b98a2	[CI] Fix unit test fault `no space left` (#4728 ) ### What this PR does / why we need it? Using an ARM-based github_hosted node to temporarily resolve `no space left` issues when installing vllm in UT. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: wangli <wangli858794774@gmail.com> Co-authored-by: Yikun Jiang <yikunkero@gmail.com>	2025-12-05 17:21:30 +08:00
wangxiyuan	ea54388e19	Drop ascend scheduler (#4623 ) It's safe to drop the ascend scheduler now. The related tests and docs have already been removed. - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-12-05 09:03:45 +08:00
Li Wang	752a55473c	[Misc] Upgrade vllm commit to 2025_12_04 (#4690 ) ### What this PR does / why we need it? As the title shows, upgrade the vllm commit hash to `ad32e3e` - vLLM version: v0.12.0 --------- Signed-off-by: wangli <wangli858794774@gmail.com> Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-12-04 22:31:45 +08:00
wangxiyuan	da84eb2f40	Remove ascend scheduler ut (#4684 ) ### What this PR does / why we need it? 1. Remove ascend scheduler ut 2. Remove models ut 3. Move mla to ops 4. Skip the failed ut - vLLM version: v0.12.0 Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-12-04 14:10:28 +08:00
wangxiyuan	3f4c0ea0a0	upgrade vLLM to 0.12.0 tag (#4647 ) Upgrade vLLM to v0.12.0 tag - vLLM version: 86e178f7c4d8c3b0eaf3c8e3f810a83f63b90e24 - vLLM main: `86e178f7c4` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-12-03 23:43:05 +08:00
Icey	6ac5730640	[CI] Fix ut ci: no space on the device (#4662 ) ### What this PR does / why we need it? The current ut ci is encountering a disk space shortage issue, see https://github.com/vllm-project/vllm-ascend/actions/runs/19884417749/job/56990694594?pr=4168 ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? CI passed with new added/existing test. - vLLM version: 86e178f7c4d8c3b0eaf3c8e3f810a83f63b90e24 - vLLM main: `86e178f7c4` --------- Signed-off-by: wxsIcey <1790571317@qq.com>	2025-12-03 17:35:06 +08:00
wangxiyuan	7f2673ea2d	upgrade vLLM to main (#4608 ) 1. fix https://github.com/vllm-project/vllm/pull/28542 The model structure modifications that affect us are: - Qwen2.5-VL (some patches still exist) - Qwen2-VL - Qwen2 - DeepSeek series - Qwen-moe series 2. fix https://github.com/vllm-project/vllm/pull/29121 the output token type changed from np to `list[list[int]]` 3. fix https://github.com/vllm-project/vllm/pull/29262 the `xformers` backend for multimodal has been deprecated 4. fix https://github.com/vllm-project/vllm/pull/29342 5. fix https://github.com/vllm-project/vllm/pull/28579 6. fix https://github.com/vllm-project/vllm/pull/28718 7. fix https://github.com/vllm-project/vllm/issues/28665 8. fix https://github.com/vllm-project/vllm/pull/26847 vllm introduced `optimization-level`, some default configs have changed, and the param `--enforce-eager` has been deprecated 9. fix http://github.com/vllm-project/vllm/pull/29223 the sampler now returns a tuple. 10. fix https://github.com/vllm-project/vllm/pull/29471 we'll remove the related patch to avoid this kind of error. Co-authored-by: hfadzxy <starmoon_zhang@163.com> Co-authored-by: wangli <wangli858794774@gmail.com> - vLLM version: v0.11.2 --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com> Signed-off-by: wangli <wangli858794774@gmail.com> Signed-off-by: hfadzxy <starmoon_zhang@163.com> Co-authored-by: wangli <wangli858794774@gmail.com> Co-authored-by: hfadzxy <starmoon_zhang@163.com>	2025-12-02 22:10:52 +08:00
dependabot[bot]	e18e3067a7	Bump actions/checkout from 4.3.1 to 6.0.0 (#4592 ) Bumps [actions/checkout](https://github.com/actions/checkout) from 4.3.1 to 6.0.0. - vLLM version: v0.11.2 Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-12-02 11:59:25 +08:00
SILONG ZENG	ab37a7d5ae	[main]Upgrade cann to 8.3rc2 (#4350 ) ### What this PR does / why we need it? Upgrade cann to 8.3rc2 ### Does this PR introduce _any_ user-facing change? Yes, docker image will use 8.3.RC2 - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2 --------- Signed-off-by: MrZ20 <2609716663@qq.com>	2025-11-28 14:06:01 +08:00
zzzzwwjj	1fd56b1106	chip type judgement code optimization (#4485 ) ### What this PR does / why we need it? Behavior by environment: - `SOC_VERSION` set, CPU environment: check whether `SOC_VERSION` is in the dict `soc_to_device`; if not, raise an error that the current chip type is not supported. - `SOC_VERSION` set, NPU environment: log a warning when `SOC_VERSION` differs from the chip type reported by `npu-smi`; otherwise the same as the CPU environment. - `SOC_VERSION` not set, CPU environment: raise an error that `SOC_VERSION` is required when compiling in a CPU environment. - `SOC_VERSION` not set, NPU environment: use the chip type from `npu-smi` to compile vllm-ascend. ### Does this PR introduce _any_ user-facing change? The env `SOC_VERSION` must now be set when compiling in a CPU environment. ### How was this patch tested? - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2 Signed-off-by: zzzzwwjj <1183291235@qq.com>	2025-11-27 17:18:49 +08:00
wangxiyuan	a91e76cd84	[CI] clean up ci (#4452 ) 1. Run 4-card test only when single and 2-card test passed 2. rename file to make it more clear 3. remove useless pd workflow, it has been managed by nightly test already. - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2 Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-11-26 14:07:56 +08:00

16 Commits
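
The chip-type decision described in commit 1fd56b1106 (#4485) can be sketched roughly as below. This is an illustration based only on the commit message, not the code from that PR: the function name `resolve_soc_version`, its parameters, and the exception types are hypothetical, and `detected_chip=None` is an assumed way of representing a CPU environment where `npu-smi` reports nothing. Only the names `SOC_VERSION`, `soc_to_device`, and `npu-smi` come from the commit message.

```python
import warnings
from typing import Mapping, Optional


def resolve_soc_version(
    env_soc: Optional[str],
    detected_chip: Optional[str],
    soc_to_device: Mapping[str, str],
) -> str:
    """Pick the chip type to compile for (hypothetical helper).

    env_soc:        value of the SOC_VERSION env variable, or None if unset.
    detected_chip:  chip type reported by npu-smi, or None in a CPU environment
                    (an assumption made for this sketch).
    soc_to_device:  stand-in for the dict of supported chip types.
    """
    if env_soc:
        # SOC_VERSION is set: it must be a known chip type in either environment.
        if env_soc not in soc_to_device:
            raise ValueError(f"unsupported chip type: {env_soc}")
        # NPU environment only: warn if it disagrees with what npu-smi reports.
        if detected_chip is not None and env_soc != detected_chip:
            warnings.warn(
                f"SOC_VERSION={env_soc} differs from detected chip {detected_chip}"
            )
        return env_soc

    # SOC_VERSION is not set.
    if detected_chip is None:
        # CPU environment: nothing to fall back on.
        raise RuntimeError(
            "SOC_VERSION must be set when compiling in a CPU environment"
        )
    # NPU environment: fall back to the chip type from npu-smi.
    return detected_chip
```

In a real build script the first argument would presumably come from `os.environ.get("SOC_VERSION")`, and the second from parsing `npu-smi` output, which is not shown here.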