[Bugfix] Support Qwen3-MOE on aclgraph mode (#1381)

### What this PR does / why we need it?
Fix the shape of the `npu_moe_init_routing` input parameters to support
aclgraph mode on Qwen3-MoE.

In addition to this PR, resolving the `gatherv3` error might be
necessary. See the related PRs:
https://github.com/vllm-project/vllm-ascend/pull/1297
https://github.com/vllm-project/vllm-ascend/pull/1446

Thanks to @yiz-liu for providing the idea.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Tested on Qwen3-30B-A3B

Closes: https://github.com/vllm-project/vllm-ascend/issues/1368

---------

Signed-off-by: ApsarasX <apsarax@outlook.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Yizhou Liu <liu_yizhou@outlook.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
ApsarasX
2025-07-06 15:29:36 +08:00
committed by GitHub
parent 14373f65d7
commit c58accc15e
3 changed files with 21 additions and 3 deletions

@@ -29,7 +29,7 @@ from vllm import LLM, SamplingParams
 from tests.conftest import VllmRunner
 from tests.model_utils import check_outputs_equal
-MODELS = ["Qwen/Qwen2.5-0.5B-Instruct"]
+MODELS = ["Qwen/Qwen2.5-0.5B-Instruct", "vllm-ascend/Qwen3-30B-A3B-Puring"]
 @pytest.mark.skipif(os.getenv("VLLM_USE_V1") == "0",