xc-llm-ascend

Author	SHA1	Message	Date
SILONG ZENG	62ea664aa7	[Lint]Style: Convert `test/` to ruff format(Batch #5 ) (#6747 ) ### What this PR does / why we need it? \| File Path \| \| :--- \| \| `tests/e2e/singlecard/compile/backend.py` \| \| `tests/e2e/singlecard/compile/test_graphex_norm_quant_fusion.py` \| \| `tests/e2e/singlecard/compile/test_graphex_qknorm_rope_fusion.py` \| \| `tests/e2e/singlecard/compile/test_norm_quant_fusion.py` \| \| `tests/e2e/singlecard/model_runner_v2/test_basic.py` \| \| `tests/e2e/singlecard/test_aclgraph_accuracy.py` \| \| `tests/e2e/singlecard/test_aclgraph_batch_invariant.py` \| \| `tests/e2e/singlecard/test_aclgraph_mem.py` \| \| `tests/e2e/singlecard/test_async_scheduling.py` \| \| `tests/e2e/singlecard/test_auto_fit_max_mode_len.py` \| \| `tests/e2e/singlecard/test_batch_invariant.py` \| \| `tests/e2e/singlecard/test_camem.py` \| \| `tests/e2e/singlecard/test_completion_with_prompt_embeds.py` \| \| `tests/e2e/singlecard/test_cpu_offloading.py` \| \| `tests/e2e/singlecard/test_guided_decoding.py` \| \| `tests/e2e/singlecard/test_ilama_lora.py` \| \| `tests/e2e/singlecard/test_llama32_lora.py` \| \| `tests/e2e/singlecard/test_models.py` \| \| `tests/e2e/singlecard/test_multistream_overlap_shared_expert.py` \| \| `tests/e2e/singlecard/test_quantization.py` \| \| `tests/e2e/singlecard/test_qwen3_multi_loras.py` \| \| `tests/e2e/singlecard/test_sampler.py` \| \| `tests/e2e/singlecard/test_vlm.py` \| \| `tests/e2e/singlecard/test_xlite.py` \| \| `tests/e2e/singlecard/utils.py` \| ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.15.0 - vLLM main: `9562912cea` --------- Signed-off-by: MrZ20 <2609716663@qq.com>	2026-02-24 15:50:00 +08:00
zhangyiming	45c5bcd962	[E2E] Optimize the E2E test time. (#5294 ) ### What this PR does / why we need it? Add cudagraph_capture_sizes for E2E CI test. - vLLM version: release/v0.13.0 - vLLM main: `ad32e3e19c` Signed-off-by: menogrey <1299267905@qq.com>	2025-12-26 14:17:50 +08:00
zhangxinyuehfad	8ae7fca947	[CI] refactor e2e ci test (#5246 ) ### What this PR does / why we need it? Refactor e2e ci test: 1. tests/e2e/singlecard/pooling/test_embedding.py: remove the eager parameter and rename test case 2. tests/e2e/singlecard/pooling/test_scoring.py: Rename test cases 3. tests/e2e/singlecard/pooling/test_classification.py: Rename test case 4. tests/e2e/singlecard/test_quantization.py: remove the eager parameter, change model to vllm-ascend/Qwen2.5-0.6B-W8A8, and rename test case 5. tests/e2e/multicard/test_shared_expert_dp.py: Rename test cases 6. tests/e2e/singlecard/test_sampler.py: Rename test cases 7. tests/e2e/singlecard/test_aclgraph_accuracy.py: Rename test cases 8. tests/e2e/multicard/test_offline_inference_distributed.py: Rename test cases and remove the eager parameter 9. tests/e2e/multicard/long_sequence/test_accuracy.py: Rename test cases and remove the eager parameter 10. tests/e2e/multicard/long_sequence/test_basic.py: Rename test cases and remove the eager parameter 11. tests/e2e/multicard/test_expert_parallel.py: remove the eager parameter 12. tests/e2e/multicard/test_full_graph_mode.py: remove the eager parameter 13. tests/e2e/multicard/test_ilama_lora_tp2.py: remove the eager parameter 14. tests/e2e/singlecard/spec_decode_v1/test_v1_mtp_correctness.py: remove the eager parameter 15. tests/e2e/singlecard/spec_decode_v1/test_v1_spec_decode.py: remove the eager parameter 16. tests/e2e/singlecard/test_aclgraph_accuracy.py: remove the eager parameter 17. tests/e2e/singlecard/test_camem.py: remove the eager parameter 18. tests/e2e/singlecard/test_ilama_lora.py: remove the eager parameter 19. tests/e2e/singlecard/test_multistream_overlap_shared_expert.py: remove the eager parameter 20. tests/e2e/singlecard/test_vlm.py: remove the eager parameter 21. tests/e2e/singlecard/test_xli: remove the eager parameter ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: release/v0.13.0 - vLLM main: `ad32e3e19c` Signed-off-by: hfadzxy <starmoon_zhang@163.com>	2025-12-23 18:42:35 +08:00
Li Wang	5d1f6daef6	[CI] Mock spawn for vlm tests (#5279 ) ### What this PR does / why we need it? Using `spawn` in continuous testing scenarios ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: release/v0.13.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: wangli <wangli858794774@gmail.com>	2025-12-23 18:35:06 +08:00
Li Wang	9a79cbaecb	[ModelRunner] Add hunyuan-vl basic support (#5151 ) ### What this PR does / why we need it? This patch adds handling of `XDRotaryEmbedding` in modelrunner to support `hunyuan-vl` ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? CI passed with added/existing tests Closes: https://github.com/vllm-project/vllm-ascend/issues/4992 - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: wangli <wangli858794774@gmail.com>	2025-12-23 10:46:54 +08:00
shaopeng-666	39bdd4cfaa	fix profile run for vl model (#5136 ) ### What this PR does / why we need it? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: 李少鹏 <lishaopeng21@huawei.com>	2025-12-17 23:51:31 +08:00
zhangyiming	66b0781840	[E2E] Refactor the e2e testcases. (#4789 ) ### What this PR does / why we need it? Refactor the e2e testcases. - tests/e2e/multicard/test_weight_loader.py: Remove the unused code. - tests/e2e/singlecard/multi-modal/test_internvl.py: Move to accuracy test. - tests/e2e/singlecard/test_aclgraph.py: Rename the file. - tests/e2e/singlecard/test_embedding_aclgraph.py : Combine with tests/e2e/singlecard/test_bge_model.py - tests/e2e/singlecard/test_completion_with_prompt_embeds.py: Delete eager mode and modify model to Qwen3-0.6B - tests/e2e/singlecard/test_quantization.py: Modify model to Qwen3-0.6B-W8A8 - tests/e2e/singlecard/test_vlm.py: Modify model to Qwen3-VL-8B - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: menogrey <1299267905@qq.com>	2025-12-11 10:15:00 +08:00
wangxiyuan	27b09ca9b9	[CI] drop ascend scheduler test (#4582 ) Let's drop the ascend scheduler test first to ensure all functions work without it. - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2 Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-12-01 20:33:50 +08:00
Mengqing Cao	517fd9272d	Revert "drop ascend scheduler" (#4580 ) Reverts vllm-project/vllm-ascend#4498 - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2	2025-11-29 22:20:48 +08:00
wangxiyuan	f10acddb78	drop ascend scheduler (#4498 ) The Ascend scheduler was originally added for the non-chunked-prefill case, because the NPU ops didn't work well with chunked prefill. Now that the ops work better with chunked prefill, it's time to remove the Ascend scheduler and use the vLLM default scheduler. - vLLM version: v0.11.2 --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-11-29 16:18:34 +08:00
whx	d5609e2c48	[BugFix] Comment out newly added vlm e2e. (#3736 ) This PR comments out the newly added vlm e2e test for the ascend scheduler scenario because it gets stuck when running in multi-batch. It needs to be added back once that issue is resolved. - vLLM version: v0.11.0rc3 - vLLM main: `17c540a993` Signed-off-by: whx-sjtu <2952154980@qq.com>	2025-10-25 10:34:59 +08:00
whx	e33751ef8b	[BugFix][Core] Fix a bug running multi-modal with ascend_scheduler (#3675 ) This PR fixes a bug in running multi-modal models with AscendScheduler. The bug was introduced by PR #2372, which used the same parameter names as vLLM but with different default values. For now, the fix changes the default values of these two parameters to align with vLLM. - vLLM version: v0.11.0rc3 - vLLM main: `17c540a993` Signed-off-by: hw_whx <wanghexiang7@huawei.com> Co-authored-by: hw_whx <wanghexiang7@huawei.com>	2025-10-25 09:41:33 +08:00
lilinsiman	1b424fb7f1	ACLgraph enable: Test cases revisions for all features (#3388 ) ### What this PR does / why we need it? This PR revises the test cases for various features in the repository so that they run with aclgraph enabled. ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? ut - vLLM version: v0.11.0rc3 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0 Signed-off-by: lilinsiman <lilinsiman@gmail.com>	2025-10-17 17:15:19 +08:00
wangxiyuan	c73dd8fecb	[CI] Fix CI by addressing max_split_size_mb config (#3258 ) ### What this PR does / why we need it? Fix CI by addressing max_split_size_mb config ### Does this PR introduce _any_ user-facing change? No, test only ### How was this patch tested? Full CI passed, especially the eagle one - vLLM version: v0.10.2 - vLLM main: https://github.com/vllm-project/vllm/commit/releases/v0.11.0 Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-09-29 14:05:12 +08:00
wangxiyuan	382c29f3e1	[BugFix] Fix world size bug in model_runner (#2915 ) - Fix world size bug in model_runner to make sure ep>16 runs with MC2 - enable e2e test for vl Co-Authored-By: whx-sjtu <2952154980@qq.com> Co-Authored-By: Icey <1790571317@qq.com> - vLLM version: v0.10.2 - vLLM main: `3e903b6cb4` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-09-14 12:20:25 +08:00
wangxiyuan	fef18b60bc	Refactor e2e CI (#2276 ) Refactor E2E CI to make it clearer and faster 1. remove some useless e2e tests 2. remove some useless functions 3. Make sure all tests run with VLLMRunner to avoid oom errors 4. Make sure all ops tests end with torch.empty_cache to avoid oom errors 5. run the tests one by one to avoid resource limit errors - vLLM version: v0.10.1.1 - vLLM main: `a344a5aa0a` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-09-02 09:02:22 +08:00

16 Commits