[Bugfix] Fix unsuitable moe_comm_type under ep=1 scenario (#5388)

### What this PR does / why we need it?
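This PR fixes an unsuitable `moe_comm_type` selection in the `ep=1`
scenario. The related issue #5375 reports that `ep=1` can cause errors in
a local environment, while the same cases pass on CI. The difference comes
down to the machines: depending on the hardware, `moe_comm_type` may not
be chosen correctly. With this change, `MoECommType.ALLGATHER` is selected
when expert parallelism is disabled or when the EP group's world size is 1.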
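
As a rough illustration of the guard this PR changes, here is a minimal, self-contained sketch. It is not the actual `select_moe_comm_method` implementation: `pick_comm_type`, its parameters, and the `OTHER` member are hypothetical stand-ins, and the real function takes `num_tokens` and `vllm_config` and goes on to SoC-specific branches.

```python
from enum import Enum, auto


class MoECommType(Enum):
    # Stand-in for the real enum; only what this sketch needs.
    ALLGATHER = auto()
    OTHER = auto()  # hypothetical placeholder for the device-specific choices


def pick_comm_type(enable_expert_parallel: bool,
                   ep_world_size: int) -> MoECommType:
    """Hypothetical helper mirroring the condition changed in this PR."""
    # Before the fix only the flag was checked. After the fix, an EP group
    # with a single rank is treated the same as expert parallelism disabled.
    if not enable_expert_parallel or ep_world_size == 1:
        return MoECommType.ALLGATHER
    # The real code continues with SoC- and token-count-dependent branches.
    return MoECommType.OTHER
```

Under these assumptions, `pick_comm_type(True, 1)` returns `MoECommType.ALLGATHER`, which is the case the original check did not cover.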

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
By CI.

- vLLM version: release/v0.13.0
- vLLM main: bc0a5a0c08

Signed-off-by: Zetong Li <slippersss@126.com>
Co-authored-by: weijinqian0 <1184188277@qq.com>
Zetong Li
2025-12-26 16:45:45 +08:00
committed by GitHub
parent da0b113cf5
commit 09390eaf32

@@ -226,7 +226,8 @@ def select_moe_comm_method(num_tokens: int,
vllm_config.model_config.hf_config, 'moe_quantize',
getattr(vllm_config.model_config.hf_config, 'quantize', None))
-if not vllm_config.parallel_config.enable_expert_parallel:
+if not vllm_config.parallel_config.enable_expert_parallel or get_ep_group(
+).world_size == 1:
moe_comm_type = MoECommType.ALLGATHER
elif soc_version in {AscendDeviceType.A2}:
if (num_tokens <= mc2_tokens_capacity