xc-llm-ascend

Author	SHA1	Message	Date
Yikun Jiang	96089b5155	Add vLLM 0.11.0 release hourly job (#3215 ) ### What this PR does / why we need it? Add vLLM 0.11.0 release hourly job to monitor release branch changes ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? CI passed - vLLM version: v0.10.2 - vLLM main: https://github.com/vllm-project/vllm/commit/releases/v0.11.0 Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-09-27 23:15:41 +08:00
wangxiyuan	e9359bd8fa	[CI] Pin vLLM to releases/v0.11.0 (#3211 ) ### What this PR does / why we need it? - Pin vLLM commit to releases/v0.11.0 branch. - Fix the breaking change introduced by vLLM commit `d4d9899860` ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? - vLLM version: v0.10.2 - vLLM main: `17b4c6685c` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-09-27 10:41:48 +08:00
yupeng	9caf6fbaf5	[Bugfix][LoRA] Fix LoRA bug after supporting Qwen3-Next (#3044 ) ### What this PR does / why we need it? The LoRA e2e test uses the ilama-3.2-1B model, which uses the transformers.py model files. Its self-attention layer names end with "\.attn", not "\.self_attn". Some other models, such as baichuan.py and bert.py, also have attention layer names ending with "*.attn". ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? pytest -sv tests/e2e/singlecard/test_ilama_lora.py pytest -sv tests/e2e/multicard/test_ilama_lora_tp2.py - vLLM version: v0.10.2 - vLLM main: `17b4c6685c` --------- Signed-off-by: paulyu12 <507435917@qq.com>	2025-09-26 11:12:45 +08:00
wangxiyuan	2930e4a6bd	[CI] Upgrade vllm to newest commit (#3182 ) ### What this PR does / why we need it? Upgrade vLLM to newest commit - Fix aclgraph not working, caused by `24fab45d96` - Fix PoolerOutput import error, caused by `755ed7b05b` - Fix the aclgraph weight load error, keeping it consistent with the torchair fix. `4492e3a554` ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? All tests should pass - vLLM version: v0.10.2 - vLLM main: `52d0cb8458` --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-09-26 06:18:15 +08:00
wangxiyuan	f7a3815bff	[CI] Do not drop ready label when PR is merge conflict (#3173 ) ### What this PR does / why we need it? The `ready` label is now used to trigger the full e2e test. If a PR is ready and then hits a merge conflict, there is no need to drop the ready label. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Just a github action change. No need for function test. - vLLM version: v0.10.2 - vLLM main: `52d0cb8458` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-09-25 18:45:19 +08:00
wangxiyuan	ac1c2cd9ac	[CI] Upgrade vllm version - 0925 (#3167 ) Upgrade vLLM to newest commit. 1. Remove the useless func get_state_cls, it has been removed from vLLM already. `e6750d0b18` 2. Fix ut broken by `6160ba4151` - vLLM version: v0.10.2 - vLLM main: `b1068903fd` --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-09-25 14:20:10 +08:00
wangxiyuan	a055183821	[CI] Upgrade vLLM version (#3139 ) Upgrade vLLM version to the newest commit. - Fix the breaking change introduced by `969b4da3a6` - Add a patch as a quick fix for torchair `de94289a98` - fix the ut error introduced by `de94289a98` Close: https://github.com/vllm-project/vllm-ascend/issues/3138 - vLLM version: v0.10.2 - vLLM main: `f225ea7dd9` --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com> Signed-off-by: MengqingCao <cmq0113@163.com> Co-authored-by: MengqingCao <cmq0113@163.com>	2025-09-25 07:36:51 +08:00
Li Wang	96eb1ed408	[CI] Bump vLLM commit hash to 0923(f225ea7) (#3110 ) ### What this PR does / why we need it? Bump vLLM commit hash to `f225ea7dd9` ### How was this patch tested? - vLLM version: v0.10.2 - vLLM main: `5aeb925452` --------- Signed-off-by: wangli <wangli858794774@gmail.com>	2025-09-23 14:13:25 +08:00
Li Wang	02f89d166f	[CI] Update vllm version to 20250922(5aeb925) (#3091 ) ### What this PR does / why we need it? This PR bumps the vLLM commit hash to `5aeb925452` and fixes issues: 1. https://github.com/vllm-project/vllm/pull/25345 has removed v0 metadata 2. https://github.com/vllm-project/vllm/pull/25332 3. https://github.com/vllm-project/vllm/pull/25334 4. https://github.com/vllm-project/vllm/pull/23558. Note that this vLLM commit updates the model registration logic to check that every registered model has the `vllm.model_executor.models` path, which breaks our custom registration of the deepseek_v3 model (it doesn't exist in the vLLM model path), so I moved the deepseek_v3 model registration to deepseek_v2 as a temporary workaround ### How was this patch tested? - vLLM version: v0.10.2 - vLLM main: `9607d5eb44` --------- Signed-off-by: wangli <wangli858794774@gmail.com>	2025-09-22 22:18:13 +08:00
Mengqing Cao	f39bd309b6	[Hybrid KV] Follow up UniformTypeKVCacheSpecs (#3070 ) ### What this PR does / why we need it? Follow up `UniformTypeKVCacheSpecs` changes introduced by https://github.com/vllm-project/vllm/pull/25101, which supports different hidden sizes in uniform type kvcache specs. This also fixes the CI issue about `TypeError: AttentionGroup.__init__() missing 1 required positional argument: 'kv_cache_spec'` ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? Tests passed with existing e2e tests. - vLLM version: v0.10.2 - vLLM main: `c60e6137f0` --------- Signed-off-by: MengqingCao <cmq0113@163.com>	2025-09-22 15:02:41 +08:00
dependabot[bot]	37a0b3f25e	Bump actions/labeler from 5 to 6 (#3086 ) Bumps [actions/labeler](https://github.com/actions/labeler) from 5 to 6. - vLLM version: v0.10.2 - vLLM main: `c60e6137f0` Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-09-22 14:07:37 +08:00
wangxiyuan	88d24cce8b	[CI] Enable main based lint check and light ci matrix (#3079 ) ### What this PR does / why we need it? Followup on https://github.com/vllm-project/vllm-ascend/pull/3064 1. should limit vllm version to the same hash with mypy 2. fix the vllm version bug for e2e light test. ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? CI passed - vLLM version: v0.10.2 - vLLM main: `c60e6137f0` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-09-22 10:37:53 +08:00
Yikun Jiang	693f547ccf	Refactor ci to reuse base workflow and re-enable ut coverage (#3064 ) ### What this PR does / why we need it? 1. Refactor ci to reuse base workflow and enable main 2 hours trigger job: - Extract e2e test into _e2e_test.yaml - Reuse _e2e_test in light / full job - Enable main 2 hours trigger job 2. Rename e2e test to ascend test to make sure action display label 3. Re-enable ut coverage, which had been failing since `5bcb4c1528` and was disabled in `6d8bc38c7b` ### Does this PR introduce _any_ user-facing change? Only developer behavior changes: - Every job trigger full test with vllm release and hash - Run full job per 2 hours with vllm main - e2e light test (30 mins): `lint` (6mins) ---> ut (10mins) ---> `v0.10.2 + main / 4 jobs` (15mins) - e2e full test (1.5h): `ready label` ---> `v0.10.2 + main / 4 jobs`, about 1.5h - schedule test: 2hours ---> `v0.10.2 + main / 4 jobs`, about 1.5h ### How was this patch tested? CI passed - vLLM version: v0.10.2 - vLLM main: `c60e6137f0` Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-09-21 13:27:08 +08:00
Yikun Jiang	b8b68b3dfe	[CI] Upgrade vLLM to 20250920 (c60e613) and address config break (#3067 ) ### What this PR does / why we need it? Bump main to `c60e6137f0` - Updated imports in `vllm.config` to `vllm.config.model`(`aed16879a9`) https://github.com/vllm-project/vllm/pull/25252 - Refactored `vllm_ascend/sample/sampler.py` to use string values for `logprobs_mode` instead of the `LogprobsMode` enum, simplifying logprobs mode handling and improving compatibility with recent vLLM changes (`aed16879a9`) https://github.com/vllm-project/vllm/pull/25252 ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? CI passed - vLLM version: v0.10.2 - vLLM main: `6d8246aaff` --------- Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-09-21 09:49:17 +08:00
Li Wang	12bcbd02bb	[CI] Upgrade vLLM to 20250919 (6d8246aa) and fix some broken issue (#2907 ) ### What this PR does / why we need it? 1. This pr bump vllm commit to `6d8246aaff` 2. fix upstream changes https://github.com/vllm-project/vllm/pull/24548 abort multi-modal kwargs, make vllm main and `v0.10.2` both adaptable 3. fix metadata_builder changes introduced by https://github.com/vllm-project/vllm/pull/23693 4. fix `structured_outputs_config` changes introduced by https://github.com/vllm-project/vllm/pull/22772 5. fix `moe_config` changes introduced by https://github.com/vllm-project/vllm/pull/22537 Co-authored-by: MengqingCao <cmq0113@163.com> Co-authored-by: Yikun Jiang <yikunkero@gmail.com> - vLLM version: v0.10.2 - vLLM main: `c60e6137f0` --------- Signed-off-by: wangli <wangli858794774@gmail.com> Signed-off-by: MengqingCao <cmq0113@163.com> Co-authored-by: MengqingCao <cmq0113@163.com>	2025-09-20 17:37:57 +08:00
Icey	acb46f303f	Fix VocabParallelEmbedding UT (#2722 ) ### What this PR does / why we need it? Fix VocabParallelEmbedding UT ### How was this patch tested? CI passed with new added/existing test. - vLLM version: main - vLLM main: `f592b3174b` --------- Signed-off-by: Icey <1790571317@qq.com>	2025-09-18 19:54:01 +08:00
dependabot[bot]	3e60aa5483	Bump actions/setup-python from 5.4.0 to 6.0.0 (#2926 ) Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5.4.0 to 6.0.0. - vLLM version: v0.10.2 - vLLM main: `3f3313981c` Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-09-16 14:15:10 +08:00
wangxiyuan	048bfd5553	[Release] Add release note for v0.10.2rc1 (#2921 ) Add release note for v0.10.2rc1 - vLLM version: v0.10.2 - vLLM main: `b834b4cbf1` --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-09-16 01:20:05 +08:00
wangxiyuan	c556038ef0	[New model] Qwen3-next support (#2917 ) ### What this PR does / why we need it? Add Qwen3-next support. ### Does this PR introduce _any_ user-facing change? Yes, users can use Qwen3 next. Related doc: https://github.com/vllm-project/vllm-ascend/pull/2916 the tutorial will be ready in [here](https://vllm-ascend.readthedocs.io/en/latest/tutorials/multi_npu_qwen3_next.html) ### How was this patch tested? Doc CI passed Related: https://github.com/vllm-project/vllm-ascend/issues/2884 Co-Authored-By: Angazenn <supperccell@163.com> Co-Authored-By: zzzzwwjj <1183291235@qq.com> Co-Authored-By: MengqingCao <cmq0113@163.com> Co-Authored-By: linfeng-yuan <1102311262@qq.com> Co-Authored-By: hust17yixuan <303660421@qq.com> Co-Authored-By: SunnyLee219 <3294305115@qq.com> Co-Authored-By: maoxx241 <maoxx241@umn.edu> - vLLM version: v0.10.2 - vLLM main: `b834b4cbf1` --------- Signed-off-by: MengqingCao <cmq0113@163.com> Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com> Signed-off-by: Angazenn <supperccell@163.com> Signed-off-by: Your Name <you@example.com> Signed-off-by: zzzzwwjj <1183291235@qq.com> Signed-off-by: linfeng-yuan <1102311262@qq.com> Signed-off-by: hust17yixuan <303660421@qq.com> Co-authored-by: MengqingCao <cmq0113@163.com> Co-authored-by: Angazenn <supperccell@163.com> Co-authored-by: Your Name <you@example.com> Co-authored-by: zzzzwwjj <1183291235@qq.com> Co-authored-by: linfeng-yuan <1102311262@qq.com> Co-authored-by: hust17yixuan <303660421@qq.com>	2025-09-16 01:17:42 +08:00
Yikun Jiang	0747a6e68c	Bump vLLM version to v0.10.2 (#2914 ) ### What this PR does / why we need it? Bump vLLM version to v0.10.2 ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? CI passed - vLLM version: v0.10.2rc3 - vLLM main: `15b8fef453` Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-09-14 06:57:59 +08:00
Yikun Jiang	f97a64ba7f	Bump vLLM version to v0.10.2rc3 (#2911 ) ### What this PR does / why we need it? Bump vLLM version to v0.10.2rc3 https://github.com/vllm-project/vllm/compare/v0.10.2rc2...v0.10.2rc3 ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? CI passed - vLLM version: v0.10.2rc2 - vLLM main: `15b8fef453` Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-09-13 19:15:48 +08:00
Yikun Jiang	8ece6956e7	Revert "Upgrade CANN version to 8.3.rc1.alpha001 (#2903 )" (#2909 ) ### What this PR does / why we need it? This reverts commit `339fceb89c`. ### Does this PR introduce _any_ user-facing change? Yes, use 8.2rc1 image by default ### How was this patch tested? CI passed - vLLM version: v0.10.2rc2 - vLLM main: `cfa3234a5b` Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-09-13 16:21:54 +08:00
Yikun Jiang	d2250c80b5	Enable push trigger for image job (#2906 ) ### What this PR does / why we need it? Enable push trigger for image job ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? CI passed Followup on https://github.com/vllm-project/vllm-ascend/pull/2864 - vLLM version: v0.10.2rc2 - vLLM main: `89e08d6d18` Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-09-13 12:31:36 +08:00
Yikun Jiang	339fceb89c	Upgrade CANN version to 8.3.rc1.alpha001 (#2903 ) ### What this PR does / why we need it? Upgrade CANN version to 8.3.rc1.alpha001 ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - vLLM version: v0.10.2rc2 - vLLM main: `89e08d6d18` Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-09-13 12:10:21 +08:00
Yikun Jiang	138e932630	Bump vLLM version to v0.10.2rc2 (#2902 ) ### What this PR does / why we need it? Upgrade vLLM version to 0.10.2rc2 ### Does this PR introduce _any_ user-facing change? Yes, image will use 0.10.2rc2 vLLM ### How was this patch tested? - vLLM version: main - vLLM main: `f17c075884` Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-09-13 11:39:48 +08:00
Yikun Jiang	6d8bc38c7b	Enable label-based image test and use free runner to run lint (#2864 ) ### What this PR does / why we need it? - Enable label-based image test and use free runner to run lint - soft revert `26f388ba08` ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - vLLM version: main - vLLM main: `404c85ca72` Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-09-12 10:49:42 +08:00
rjg-lyh	0005479b9c	[main] mlp weight prefetch in Qwen Dense Models (#2816 ) ### What this PR does / why we need it? This PR prefetches the weights of mlp layers in Qwen Dense Models, mainly to optimize performance in the Decode phase. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? CI passed with new added/existing test. - vLLM version: main - vLLM main: `a1213fae5f` Signed-off-by: rjg-lyh <1318825571@qq.com> Co-authored-by: Shuming19 <313093131@qq.com>	2025-09-11 21:20:09 +08:00
liziyu	5691104249	LLMdatadist connector adapt the distributed KV aggregation (#2718 ) ### What this PR does / why we need it? LLMdatadist connector adapt the distributed KV aggregation for the main branch. Change the P node from returning "finish sending" only when TP0 responds to returning "finish sending" as soon as each NPU receives it. The D node will send a finish receive signal to the corresponding tp rank of the P node. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? gsm8k test 2*A3 1P 1D P: dp2 tp8 D:dp 4 tp4 P: dp2 tp8 D:dp 2 tp8 - vLLM version: main - vLLM main: `cc99baf14d` Signed-off-by: liziyu <liziyu16@huawei.com>	2025-09-11 11:37:41 +08:00
zhangxinyuehfad	e75b568011	[CI] Update pre_commit runner (#2850 ) ### What this PR does / why we need it? Update pre_commit runner ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: main - vLLM main: `0ae43dbf8c` Signed-off-by: hfadzxy <starmoon_zhang@163.com>	2025-09-10 20:23:25 +08:00
Mengqing Cao	edf1f600ad	[CI] Remove compatibility maintenance for vllm v0.10.1 and v0.10.1.1 (#2840 ) ### What this PR does / why we need it? Remove compatibility maintenance for vllm v0.10.1 and v0.10.1.1 ### Does this PR introduce _any_ user-facing change? branch main of vllm-ascend will not be compatible with vllm v0.10.1 and v0.10.1.1 ### How was this patch tested? CI passed with existing test. - vLLM version: v0.10.1.1 - vLLM main: `6fb2788163` --------- Signed-off-by: MengqingCao <cmq0113@163.com>	2025-09-10 08:43:10 +08:00
wangxiyuan	5bcb4c1528	[CI] Reduce CI time (#2801 ) 1. Only run light e2e test before the PR is `ready` to reduce CI time. 2. Run full test once the PR is labeled `ready` and `ready for test` 3. Run lint job on a self-hosted CPU container to avoid long waits. - vLLM version: v0.10.1.1 - vLLM main: `6910b56da2` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-09-09 10:52:14 +08:00
dependabot[bot]	cd88f89267	Bump actions/github-script from 7 to 8 (#2803 ) Bumps [actions/github-script](https://github.com/actions/github-script) from 7 to 8. - vLLM version: v0.10.1.1 - vLLM main: `8c892b1831` Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-09-08 14:53:26 +08:00
Yikun Jiang	a58b43b72c	Remove git .extraheader and fetch all commits in /vllm-workspace/vllm-ascend (#2746 ) ### What this PR does / why we need it? Remove git .extraheader and fetch all commits in /vllm-workspace/vllm-ascend ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? CI passed Closes: https://github.com/vllm-project/vllm-ascend/issues/2735 - vLLM version: v0.10.1.1 - vLLM main: `51d5e9be7d` Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-09-05 09:45:11 +08:00
Mengqing Cao	984bd7c13a	[Bugfix][APC] Fix accuracy issue on prefix caching with AscendScheduler (#2714 ) ### What this PR does / why we need it? Fix accuracy issue on prefix caching with AscendScheduler ### How was this patch tested? CI passed with `test_prefix_cache_with_ascend_scheduler` - vLLM version: v0.10.1.1 - vLLM main: `6997a25ac6` --------- Signed-off-by: MengqingCao <cmq0113@163.com>	2025-09-04 08:22:46 +08:00
wangxiyuan	41b028aa5f	[Doc] add v0.9.1 release note (#2646 ) Add release note for 0.9.1 - vLLM version: v0.10.1.1 - vLLM main: `8bd5844989` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-09-03 18:04:27 +08:00
wangxiyuan	c03321781a	[CI] skip unstable UT (#2716 ) See #2687. We noticed that test_platform and test_vocab_parallel_embedding are unstable, so let's skip them for now. Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-09-03 15:53:50 +08:00
Li Wang	bece793be6	[CI] Disable per-PR triggering for A3 (#2710 ) ### What this PR does / why we need it? Disable per-PR triggering for A3 for now, we trigger the dist test in the label `dist-test` rather than ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.10.1.1 - vLLM main: `136d853e65` Signed-off-by: wangli <wangli858794774@gmail.com>	2025-09-03 11:52:34 +08:00
wangxiyuan	24d4dad7b2	[CI] Enable MTP torchair e2e test (#2705 ) enable MTP torchair e2e test - vLLM version: v0.10.1.1 - vLLM main: `ce30dca5c4` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-09-03 08:57:43 +08:00
wangxiyuan	0829b4873f	[CI] recover e2e test (#2688 ) 1. recover the skipped test. 2. remove pangu eager mode test, it's tested by torchair mode already. 3. skip pangu test until the bug is fixed. - vLLM version: v0.10.1.1 - vLLM main: `56d04089ef` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-09-02 18:49:17 +08:00
yupeng	9f1e054fe3	[Bugfix][LoRA][Operator] Fix LoRA custom operators accuracy issue (#2672 ) ### What this PR does / why we need it? Fix the LoRA accuracy issue introduced by the custom AscendC operators "bgmv_shrink, sgmv_shrink, bgmv_expand, sgmv_expand". The bug details are: - In the kernel function, if you want to call the GlobalTensor.GetSize method, you must first pass the second parameter, bufferSize, when calling GlobalTensor.SetGlobalBuffer. - Otherwise, the GlobalTensor.GetSize method will return a random value. - You can refer to [this doc](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/81RC1alpha002/apiref/ascendcopapi/atlasascendc_api_07_00024.html). ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? pytest -sv tests/e2e/singlecard/test_ilama_lora.py pytest -sv tests/e2e/multicard/test_ilama_lora_tp2.py - vLLM version: v0.10.1.1 - vLLM main: `a344a5aa0a` --------- Signed-off-by: paulyu12 <paulyu0307@gmail.com> Signed-off-by: paulyu12 <507435917@qq.com> Co-authored-by: paulyu12 <paulyu0307@gmail.com>	2025-09-02 11:46:59 +08:00
wangxiyuan	fef18b60bc	Refactor e2e CI (#2276 ) Refactor E2E CI to make it clear and faster 1. remove some useless e2e test 2. remove some useless function 3. Make sure all tests run with VLLMRunner to avoid oom error 4. Make sure all ops tests end with torch.empty_cache to avoid oom error 5. run the test one by one to avoid resource limit error - vLLM version: v0.10.1.1 - vLLM main: `a344a5aa0a` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-09-02 09:02:22 +08:00
weichen	3a5fc5ee01	[Refactor][MoE] remove redundant code after refactoring fused_moe (#2612 ) ### What this PR does / why we need it? There is a lot of redundant code related to moe here, and the structure is not very clear. We did the following: placed the relatively independent code related to apply_mlp into a separate file; removed the environment variables for alltoall_buffer and alltoall_seq; removed the code related to alltoall_buffer and alltoall_seq, retaining the sole TokenDispatcher inheritance class. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? e2e&ut - vLLM version: v0.10.1.1 - vLLM main: `4071c76cf3` --------- Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com> Signed-off-by: weijinqian_v1 <weijinqian@huawei.com> Co-authored-by: weijinqian0 <12153182+weijinqian0@users.noreply.github.com>	2025-08-30 22:28:50 +08:00
zhangxinyuehfad	e7ad4a64f4	[CI] Add e2e ci test for A3 (#2573 ) ### What this PR does / why we need it? Add e2e ci test for A3 ### How was this patch tested? - vLLM version: v0.10.1.1 - vLLM main: `11a7fafaa8` Signed-off-by: hfadzxy <starmoon_zhang@163.com>	2025-08-29 09:33:42 +08:00
Mengqing Cao	21b5727f9a	[CI] Upgrade vllm in accuracy and performance CI (#2527 ) ### What this PR does / why we need it? Upgrade vllm in accuracy and performance CI ### How was this patch tested? CI passed with existing test. - vLLM version: v0.10.1.1 - vLLM main: `5c4b6e66fe` Signed-off-by: MengqingCao <cmq0113@163.com>	2025-08-26 08:49:49 +08:00
Icey	f796e6280b	[CustomOp] Register RotaryEmbedding instead of overwrite forward (#2385 ) ### What this PR does / why we need it? Register RotaryEmbedding instead of overwrite forward ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? CI passed with new added/existing test. - vLLM version: v0.10.0 - vLLM main: `808d2e9aa0` --------- Signed-off-by: Icey <1790571317@qq.com> Signed-off-by: wxsIcey <1790571317@qq.com>	2025-08-25 09:32:35 +08:00
Mengqing Cao	b0403f8d8a	[CI] fix ci (#2464 ) ### What this PR does / why we need it? 1. use actions/checkout@v5 instead of v4 2. remove dbo test case because there is an issue with it and it will be refactored later 3. make vllm-ascend compatible with vllm v0.10.1.1 and add CI for it 4. fix sampler api changes introduced by https://github.com/vllm-project/vllm/pull/22387 6. fix qwen3 moe config changes introduced by https://github.com/vllm-project/vllm/pull/20562 7. fix kvcache block changes introduced by https://github.com/vllm-project/vllm/pull/23262 ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? CI passed with existing test. - vLLM version: v0.10.0 - vLLM main: `0c6e40bbaa` --------- Signed-off-by: MengqingCao <cmq0113@163.com>	2025-08-22 07:30:48 +08:00
Yikun Jiang	67a222c383	[Doc] Add feature branch policy (#2432 ) ### What this PR does / why we need it? This patch adds the feature branch policy. After this patch, maintainers are allowed to create a feature branch. Feature branches are used for collaboration and must include an RFC link, merge plan and mentor info. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? CI passed - vLLM version: v0.10.0 - vLLM main: `7be5d113d8` Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-08-21 10:37:21 +08:00
Mengqing Cao	3a384492e1	[CI] add lint block before running e2e (#2447 ) ### What this PR does / why we need it? add lint block before running e2e. follow up https://github.com/vllm-project/vllm-ascend/pull/2445 ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? N/A Signed-off-by: MengqingCao <cmq0113@163.com>	2025-08-20 09:53:23 +08:00
Mengqing Cao	1327f9be1c	Fix some ci issue and refactor modelrunner (#2445 ) ### What this PR does / why we need it? Fix some ci issue and refactor modelrunner ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? CI passed with existing test. - vLLM version: v0.10.0 - vLLM main: `4d9c61993a` --------- Signed-off-by: wangli <wangli858794774@gmail.com> Signed-off-by: MengqingCao <cmq0113@163.com> Signed-off-by: weiguihua2 <weiguihua2@huawei.com> Co-authored-by: wangli <wangli858794774@gmail.com> Co-authored-by: weiguihua2 <weiguihua2@huawei.com>	2025-08-20 09:01:04 +08:00
dependabot[bot]	8fb50a4248	Bump actions/checkout from 4 to 5 (#2420 ) Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5. - vLLM version: v0.10.0 - vLLM main: `5f5664b3e4` Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-19 08:54:56 +08:00


230 Commits