[Bugfix] Fix mc2 operator error in aclgraph + ep<16 scenario (#2609)
### What this PR does / why we need it?
1. quickfix mc2 operator error in aclgraph + ep<16 scenario to recover
CI, will be refactorred in the future
2. disable aclgraph when testing w8a8
### How was this patch tested?
CI passed with existing test.
- vLLM version: v0.10.1.1
- vLLM main:
95089607fa
Signed-off-by: MengqingCao <cmq0113@163.com>
This commit is contained in:
@@ -55,6 +55,7 @@ def test_models_distributed_Qwen3_MOE_TP2_WITH_EP():
|
||||
tensor_parallel_size=2,
|
||||
enable_expert_parallel=True,
|
||||
distributed_executor_backend="mp",
|
||||
enforce_eager=False,
|
||||
) as vllm_model:
|
||||
vllm_model.generate_greedy(example_prompts, max_tokens)
|
||||
|
||||
@@ -71,7 +72,7 @@ def test_models_distributed_Qwen3_MOE_W8A8():
|
||||
dtype=dtype,
|
||||
tensor_parallel_size=2,
|
||||
quantization="ascend",
|
||||
enforce_eager=False,
|
||||
enforce_eager=True,
|
||||
) as vllm_model:
|
||||
vllm_model.generate_greedy(example_prompts, max_tokens)
|
||||
|
||||
|
||||
Reference in New Issue
Block a user