[Bugfix]Support Qwen3-MOE on aclgraph mode in sizes capture and add new ut (#2511)

What this PR does / why we need it?
This PR fixes the capture-size and stream errors that occur when running
the Qwen3-30B MoE model in ACL graph (aclgraph) mode.
It also adds a new unit test (UT).

Does this PR introduce any user-facing change?
No.

How was this patch tested?
Unit tests (UT).

- vLLM version: v0.10.1.1
- vLLM main: 6fad29b11b

Signed-off-by: lilinsiman <lilinsiman@gmail.com>
lilinsiman
2025-08-26 12:39:21 +08:00
committed by GitHub
parent b3fdd78a6b
commit cfe77e83ae
3 changed files with 80 additions and 7 deletions

@@ -255,6 +255,9 @@ class TestUtils(TestBase):
             parallel_config=test_parallel_config,
         )
         utils.update_aclgraph_sizes(test_vllm_config)
+        os.environ['HCCL_OP_EXPANSION_MODE'] = 'AIV'
+        utils.update_aclgraph_sizes(test_vllm_config)
+        del os.environ['HCCL_OP_EXPANSION_MODE']
         self.assertEqual(
             147,
             len(test_vllm_config.compilation_config.cudagraph_capture_sizes))
@@ -267,6 +270,9 @@ class TestUtils(TestBase):
             parallel_config=test_parallel_config,
         )
         utils.update_aclgraph_sizes(test_vllm_config)
+        os.environ['HCCL_OP_EXPANSION_MODE'] = 'AIV'
+        utils.update_aclgraph_sizes(test_vllm_config)
+        del os.environ['HCCL_OP_EXPANSION_MODE']
         self.assertEqual(
             3,
             len(test_vllm_config.compilation_config.cudagraph_capture_sizes))
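
Review note: both hunks set `HCCL_OP_EXPANSION_MODE` and then remove it with `del`. If the call in between raises, the variable stays set for later tests. A possible alternative, sketched below, scopes the variable with `unittest.mock.patch.dict`. This is only a sketch: `read_expansion_mode` and `call_with_expansion_mode` are hypothetical names that are not part of this commit, and it assumes the code under test reads the variable from `os.environ` at call time.

```python
import os
from unittest import mock

ENV_VAR = "HCCL_OP_EXPANSION_MODE"


def read_expansion_mode():
    """Hypothetical stand-in for code under test that reads the variable."""
    return os.environ.get(ENV_VAR)


def call_with_expansion_mode(fn, mode="AIV"):
    """Run fn() with ENV_VAR set to mode, then restore the previous state."""
    # patch.dict restores os.environ on exit, even if fn() raises.
    with mock.patch.dict(os.environ, {ENV_VAR: mode}):
        return fn()
```

In the test above, the second call could then be written as `call_with_expansion_mode(lambda: utils.update_aclgraph_sizes(test_vllm_config))`, which would also preserve any value the variable had before the test ran.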