[Bugfix] Fix unsuitable moe_comm_type under ep=1 scenario (#5388)
### What this PR does / why we need it?
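This PR fixes the selection of an unsuitable `moe_comm_type` when `ep=1`.
The related issue #5375 reported that `ep=1` can cause errors in local
environments, even though the same cases pass on CI. The difference
appears to lie in the machines themselves: depending on the environment,
`moe_comm_type` may not be chosen correctly. With this change,
`MoECommType.ALLGATHER` is also selected when the expert-parallel group
has a world size of 1.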
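
The guard changed by this PR can be illustrated with a minimal, self-contained sketch. It is not the actual `select_moe_comm_method`: `pick_comm_type`, its parameters, and the `OTHER` enum member are hypothetical stand-ins, and the real function also considers the SoC version and token counts.

```python
from enum import Enum


class MoECommType(Enum):
    # Stand-in for the real enum; only what the sketch needs.
    ALLGATHER = "allgather"
    OTHER = "other"  # hypothetical placeholder for the EP-specific methods


def pick_comm_type(enable_expert_parallel: bool,
                   ep_world_size: int) -> MoECommType:
    """Hypothetical, simplified version of the guard touched by this PR."""
    # An EP group with a single rank has no peers to exchange tokens with,
    # so it is handled like the case where expert parallelism is disabled.
    if not enable_expert_parallel or ep_world_size == 1:
        return MoECommType.ALLGATHER
    return MoECommType.OTHER


# EP enabled but only one rank in the group: falls back to ALLGATHER.
print(pick_comm_type(enable_expert_parallel=True, ep_world_size=1))
```

Before the fix, only the `enable_expert_parallel` flag was checked, so a configuration with the flag set and a single-rank EP group fell through to the later branches.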
### Does this PR introduce _any_ user-facing change?
N/A
### How was this patch tested?
Tested by CI.
- vLLM version: release/v0.13.0
- vLLM main: bc0a5a0c08
Signed-off-by: Zetong Li <slippersss@126.com>
Co-authored-by: weijinqian0 <1184188277@qq.com>
@@ -226,7 +226,8 @@ def select_moe_comm_method(num_tokens: int,
         vllm_config.model_config.hf_config, 'moe_quantize',
         getattr(vllm_config.model_config.hf_config, 'quantize', None))
 
-    if not vllm_config.parallel_config.enable_expert_parallel:
+    if not vllm_config.parallel_config.enable_expert_parallel or get_ep_group(
+    ).world_size == 1:
         moe_comm_type = MoECommType.ALLGATHER
     elif soc_version in {AscendDeviceType.A2}:
         if (num_tokens <= mc2_tokens_capacity