[BugFix] Fix world size bug in model_runner (#2915)

- Fix the world size check in model_runner so that runs with EP >= 16 use MC2
- Enable the e2e test for VL

Co-Authored-By: whx-sjtu <2952154980@qq.com>
Co-Authored-By: Icey <1790571317@qq.com>
- vLLM version: v0.10.2
- vLLM main: 3e903b6cb4

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
wangxiyuan
2025-09-14 12:20:25 +08:00
committed by GitHub
parent c5a502fd2e
commit 382c29f3e1
3 changed files with 13 additions and 8 deletions

@@ -1539,7 +1539,7 @@ class NPUModelRunner(LoRAModelRunnerMixin):
         if not self.parallel_config.enable_expert_parallel:
             moe_comm_method = "allgather"
         elif soc_version in {AscendSocVersion.A2}:
-            if num_tokens <= self.mc2_tokens_capacity and self.parallel_config.world_size >= 16:
+            if num_tokens <= self.mc2_tokens_capacity and self.parallel_config.world_size_across_dp >= 16:
                 moe_comm_method = "mc2"
             else:
                 moe_comm_method = "allgather"
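
The effect of the changed condition in the hunk above can be illustrated with a minimal, self-contained sketch. It is not the code in model_runner: `ParallelConfigSketch` and `select_moe_comm_method_a2` are hypothetical names, and the sketch assumes that `world_size` counts ranks within one data-parallel replica (TP x PP) while `world_size_across_dp` also multiplies in the data-parallel size. Only the A2 branch visible in the hunk is modeled.

```python
from dataclasses import dataclass


@dataclass
class ParallelConfigSketch:
    """Hypothetical stand-in for the parallel config; not the real class."""
    tensor_parallel_size: int = 1
    pipeline_parallel_size: int = 1
    data_parallel_size: int = 1
    enable_expert_parallel: bool = False

    @property
    def world_size(self) -> int:
        # Assumption: ranks within a single data-parallel replica (TP x PP).
        return self.tensor_parallel_size * self.pipeline_parallel_size

    @property
    def world_size_across_dp(self) -> int:
        # Assumption: ranks across all data-parallel replicas (TP x PP x DP).
        return self.world_size * self.data_parallel_size


def select_moe_comm_method_a2(cfg: ParallelConfigSketch,
                              num_tokens: int,
                              mc2_tokens_capacity: int,
                              use_dp_aware_size: bool = True) -> str:
    """Mirror the A2 branch of the hunk; other SoC versions are not modeled."""
    if not cfg.enable_expert_parallel:
        return "allgather"
    # use_dp_aware_size=False reproduces the old check on world_size.
    size = cfg.world_size_across_dp if use_dp_aware_size else cfg.world_size
    if num_tokens <= mc2_tokens_capacity and size >= 16:
        return "mc2"
    return "allgather"


# TP=4 with DP=4 gives 16 ranks in total, but only 4 per DP replica.
cfg = ParallelConfigSketch(tensor_parallel_size=4,
                           data_parallel_size=4,
                           enable_expert_parallel=True)
new_choice = select_moe_comm_method_a2(cfg, num_tokens=128, mc2_tokens_capacity=512)
old_choice = select_moe_comm_method_a2(cfg, num_tokens=128, mc2_tokens_capacity=512,
                                       use_dp_aware_size=False)
```

Under these assumptions, a TP=4, DP=4 deployment has 16 ranks overall, so the new check selects "mc2", whereas the old check saw only 4 ranks and fell back to "allgather".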