Upgrade vllm commit hash to 1216 (#5053)

### What this PR does / why we need it?

Upstream vLLM PR #30212 (https://github.com/vllm-project/vllm/pull/30212) refactored the attention backend selection interface. This PR adapts vllm-ascend's `get_attn_backend_cls` to align with the new upstream interface, ensuring compatibility and reducing maintenance overhead.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Co-author: [leo-pony] [nengjunma@outlook.com](mailto:nengjunma@outlook.com)

- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

---------

Signed-off-by: zxwang <1476209578@qq.com>
Signed-off-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
2025-12-17 08:48:36 +08:00
parent eb4c08f05d
commit b1a853b0f6
5 changed files with 17 additions and 27 deletions
--- a/tests/e2e/multicard/test_offline_inference_distributed.py
+++ b/tests/e2e/multicard/test_offline_inference_distributed.py
@@ -113,11 +113,9 @@ def test_sp_for_qwen3_moe() -> None:
                    dtype="auto",
                    tensor_parallel_size=2,
                    distributed_executor_backend="mp",
-                    compilation_config={
-                        "pass_config": {
-                            "enable_sequence_parallelism": True
-                        }
-                    },
+                    compilation_config={"pass_config": {
+                        "enable_sp": True
+                    }},
                    enable_expert_parallel=True,
                    enforce_eager=True) as vllm_model:
        vllm_model.generate(example_prompts, sampling_params)
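
The hunk above renames the `pass_config` key `enable_sequence_parallelism` to `enable_sp`. For callers that still build configs with the old key, a small shim along the following lines might ease the transition. This is only a sketch: `normalize_pass_config` and `RENAMED_PASS_CONFIG_KEYS` are hypothetical names that exist in neither vLLM nor vllm-ascend, and the mapping contains only the one rename visible in this diff.

```python
from typing import Any

# Hypothetical mapping for illustration: old pass_config key -> new key.
# Only the pair below is taken from the diff; nothing else is assumed.
RENAMED_PASS_CONFIG_KEYS = {"enable_sequence_parallelism": "enable_sp"}


def normalize_pass_config(compilation_config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of compilation_config with old pass_config keys renamed."""
    normalized = dict(compilation_config)
    if "pass_config" not in normalized:
        # Nothing to rename; return the shallow copy unchanged.
        return normalized

    pass_config = dict(normalized["pass_config"])
    for old_key, new_key in RENAMED_PASS_CONFIG_KEYS.items():
        if old_key in pass_config:
            value = pass_config.pop(old_key)
            # If the new-style key is already set explicitly, it wins.
            pass_config.setdefault(new_key, value)
    normalized["pass_config"] = pass_config
    return normalized


old_style = {"pass_config": {"enable_sequence_parallelism": True}}
new_style = normalize_pass_config(old_style)
```

The helper copies both dictionaries before modifying them, so the caller's original config is left untouched, and the result could then be passed as `compilation_config` in the same way the test does.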