[BugFix] Fix world size bug in model_runner (#2915)
- Fix the world size check in model_runner (use world_size_across_dp instead of world_size) so that runs with EP size >= 16 select MC2
- Enable e2e test for VL
Co-Authored-By: whx-sjtu <2952154980@qq.com>
Co-Authored-By: Icey <1790571317@qq.com>
- vLLM version: v0.10.2
- vLLM main: 3e903b6cb4
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
@@ -1539,7 +1539,7 @@ class NPUModelRunner(LoRAModelRunnerMixin):
         if not self.parallel_config.enable_expert_parallel:
             moe_comm_method = "allgather"
         elif soc_version in {AscendSocVersion.A2}:
-            if num_tokens <= self.mc2_tokens_capacity and self.parallel_config.world_size >= 16:
+            if num_tokens <= self.mc2_tokens_capacity and self.parallel_config.world_size_across_dp >= 16:
                 moe_comm_method = "mc2"
             else:
                 moe_comm_method = "allgather"
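A minimal sketch of why the changed condition matters, under stated assumptions: `ToyParallelConfig` and `select_moe_comm_method` are hypothetical stand-ins written for illustration, not the project's classes. The sketch assumes `world_size` counts only the TP x PP ranks of one data-parallel replica and `world_size_across_dp` multiplies that by the DP size. It models only the A2 branch shown in the hunk.

```python
from dataclasses import dataclass


@dataclass
class ToyParallelConfig:
    """Hypothetical stand-in for the real parallel config (illustration only)."""
    tensor_parallel_size: int = 1
    pipeline_parallel_size: int = 1
    data_parallel_size: int = 1
    enable_expert_parallel: bool = False

    @property
    def world_size(self) -> int:
        # Assumption: ranks inside one data-parallel replica (TP x PP).
        return self.tensor_parallel_size * self.pipeline_parallel_size

    @property
    def world_size_across_dp(self) -> int:
        # Assumption: ranks across all data-parallel replicas (TP x PP x DP).
        return self.world_size * self.data_parallel_size


def select_moe_comm_method(cfg, num_tokens, mc2_tokens_capacity,
                           use_dp_world_size=True):
    """Mirror the A2 branch of the hunk; the flag switches old vs. new check."""
    if not cfg.enable_expert_parallel:
        return "allgather"
    size = cfg.world_size_across_dp if use_dp_world_size else cfg.world_size
    if num_tokens <= mc2_tokens_capacity and size >= 16:
        return "mc2"
    return "allgather"


# TP=4, DP=4: 16 ranks in total, but only 4 per data-parallel replica.
cfg = ToyParallelConfig(tensor_parallel_size=4, data_parallel_size=4,
                        enable_expert_parallel=True)
old_choice = select_moe_comm_method(cfg, 128, 512, use_dp_world_size=False)
new_choice = select_moe_comm_method(cfg, 128, 512)
print(old_choice, new_choice)
```

With these toy numbers the old check sees a size of 4 and falls back to "allgather", while the new check sees 16 and selects "mc2".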