xc-llm-ascend

Author	SHA1	Message	Date
SILONG ZENG	62ea664aa7	[Lint]Style: Convert `test/` to ruff format (Batch #5) (#6747) ### What this PR does / why we need it? File paths: `tests/e2e/singlecard/compile/backend.py`, `tests/e2e/singlecard/compile/test_graphex_norm_quant_fusion.py`, `tests/e2e/singlecard/compile/test_graphex_qknorm_rope_fusion.py`, `tests/e2e/singlecard/compile/test_norm_quant_fusion.py`, `tests/e2e/singlecard/model_runner_v2/test_basic.py`, `tests/e2e/singlecard/test_aclgraph_accuracy.py`, `tests/e2e/singlecard/test_aclgraph_batch_invariant.py`, `tests/e2e/singlecard/test_aclgraph_mem.py`, `tests/e2e/singlecard/test_async_scheduling.py`, `tests/e2e/singlecard/test_auto_fit_max_mode_len.py`, `tests/e2e/singlecard/test_batch_invariant.py`, `tests/e2e/singlecard/test_camem.py`, `tests/e2e/singlecard/test_completion_with_prompt_embeds.py`, `tests/e2e/singlecard/test_cpu_offloading.py`, `tests/e2e/singlecard/test_guided_decoding.py`, `tests/e2e/singlecard/test_ilama_lora.py`, `tests/e2e/singlecard/test_llama32_lora.py`, `tests/e2e/singlecard/test_models.py`, `tests/e2e/singlecard/test_multistream_overlap_shared_expert.py`, `tests/e2e/singlecard/test_quantization.py`, `tests/e2e/singlecard/test_qwen3_multi_loras.py`, `tests/e2e/singlecard/test_sampler.py`, `tests/e2e/singlecard/test_vlm.py`, `tests/e2e/singlecard/test_xlite.py`, `tests/e2e/singlecard/utils.py` ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.15.0 - vLLM main: `9562912cea` --------- Signed-off-by: MrZ20 <2609716663@qq.com>	2026-02-24 15:50:00 +08:00
zhangyiming	bd4fb871c6	[CI] Add skipped testcases. (#5254) ### What this PR does / why we need it? Some E2E test cases are not in our CI workflow; this PR adds them back. - vLLM version: release/v0.13.0 - vLLM main: `ad32e3e19c` Signed-off-by: menogrey <1299267905@qq.com>	2025-12-24 10:41:32 +08:00
zhangxinyuehfad	8ae7fca947	[CI] Refactor e2e CI test (#5246) ### What this PR does / why we need it? Refactor e2e CI test: 1. tests/e2e/singlecard/pooling/test_embedding.py: remove the eager parameter and rename test case 2. tests/e2e/singlecard/pooling/test_scoring.py: rename test cases 3. tests/e2e/singlecard/pooling/test_classification.py: rename test case 4. tests/e2e/singlecard/test_quantization.py: remove the eager parameter, change the model to vllm-ascend/Qwen2.5-0.6B-W8A8, and rename test case 5. tests/e2e/multicard/test_shared_expert_dp.py: rename test cases 6. tests/e2e/singlecard/test_sampler.py: rename test cases 7. tests/e2e/singlecard/test_aclgraph_accuracy.py: rename test cases 8. tests/e2e/multicard/test_offline_inference_distributed.py: rename test cases and remove the eager parameter 9. tests/e2e/multicard/long_sequence/test_accuracy.py: rename test cases and remove the eager parameter 10. tests/e2e/multicard/long_sequence/test_basic.py: rename test cases and remove the eager parameter 11. tests/e2e/multicard/test_expert_parallel.py: remove the eager parameter 12. tests/e2e/multicard/test_full_graph_mode.py: remove the eager parameter 13. tests/e2e/multicard/test_ilama_lora_tp2.py: remove the eager parameter 14. tests/e2e/singlecard/spec_decode_v1/test_v1_mtp_correctness.py: remove the eager parameter 15. tests/e2e/singlecard/spec_decode_v1/test_v1_spec_decode.py: remove the eager parameter 16. tests/e2e/singlecard/test_aclgraph_accuracy.py: remove the eager parameter 17. tests/e2e/singlecard/test_camem.py: remove the eager parameter 18. tests/e2e/singlecard/test_ilama_lora.py: remove the eager parameter 19. tests/e2e/singlecard/test_multistream_overlap_shared_expert.py: remove the eager parameter 20. tests/e2e/singlecard/test_vlm.py: remove the eager parameter 21. tests/e2e/singlecard/test_xli: remove the eager parameter ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: release/v0.13.0 - vLLM main: `ad32e3e19c` Signed-off-by: hfadzxy <starmoon_zhang@163.com>	2025-12-23 18:42:35 +08:00
whx	1b270a64bd	[MoE][Multistream] Avoid performing communication in extra stream. (#3582) This PR moves the communication operation of shared experts out of the extra stream because I found that it might cause rtMemcpy-related errors when running shared-expert multistream with aclgraph. Furthermore, I use a global variable as the extra stream object to avoid allocating a stream for each layer in full-graph mode. - vLLM version: v0.11.0rc3 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0 Signed-off-by: whx-sjtu <2952154980@qq.com>	2025-10-24 10:44:38 +08:00
whx	0a526768f5	[Feature] Support moe multi-stream for aclgraph. (#2946) This PR puts the calculation of shared experts into a separate stream, overlapping it with the routing experts. - vLLM version: v0.10.2 - vLLM main: `fbd6523ac0` --------- Signed-off-by: whx-sjtu <2952154980@qq.com>	2025-09-19 11:06:45 +08:00

5 Commits