[Bugfix][MoE] Remove All2All in w4a8_dynamic (#4977)

### What this PR does / why we need it?
GatherEP was fixed in
https://github.com/vllm-project/vllm-ascend/pull/3279, so this PR removes
the all2all special case for the w4a8_dynamic scenario.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
e2e tests and unit tests (ut)
- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

Signed-off-by: weichen <calvin_zhu0210@outlook.com>
This commit is contained in:
weichen
2025-12-17 17:39:57 +08:00
committed by GitHub
parent 97537709ae
commit 7f1e93f185


@@ -248,10 +248,6 @@ def select_moe_comm_method(num_tokens: int,
 and vllm_config.parallel_config.world_size_across_dp /
 vllm_config.parallel_config.pipeline_parallel_size >= 16):
 moe_comm_type = MoECommType.MC2
-else:
-# Currently, w4a8_dynamic does not support allgatherep
-if quant_type == "w4a8_dynamic":
-moe_comm_type = MoECommType.ALLTOALL
 else:
 moe_comm_type = MoECommType.ALLGATHER
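
To illustrate the effect of this hunk, here is a minimal sketch of the branch before and after the change. It is not the vllm-ascend implementation: the stand-in `MoECommType` enum and its values, the function names `select_before` and `select_after`, and the `leading_condition` flag (a placeholder for the part of the `if` condition that lies outside the hunk) are all assumptions made for illustration.

```python
from enum import Enum


class MoECommType(Enum):
    # Stand-in enum; the member values are arbitrary placeholders.
    ALLGATHER = "allgather"
    MC2 = "mc2"
    ALLTOALL = "alltoall"


def _mc2_eligible(leading_condition: bool, world_size_across_dp: int,
                  pipeline_parallel_size: int) -> bool:
    # `leading_condition` is hypothetical: it stands in for the start of
    # the condition, which is not visible in the hunk.
    return (leading_condition
            and world_size_across_dp / pipeline_parallel_size >= 16)


def select_before(quant_type: str, leading_condition: bool,
                  world_size_across_dp: int,
                  pipeline_parallel_size: int) -> MoECommType:
    """Branch as it looked before this commit (simplified)."""
    if _mc2_eligible(leading_condition, world_size_across_dp,
                     pipeline_parallel_size):
        return MoECommType.MC2
    # Old special case: w4a8_dynamic was routed to all2all.
    if quant_type == "w4a8_dynamic":
        return MoECommType.ALLTOALL
    return MoECommType.ALLGATHER


def select_after(quant_type: str, leading_condition: bool,
                 world_size_across_dp: int,
                 pipeline_parallel_size: int) -> MoECommType:
    """Branch after this commit: the quant type no longer matters here."""
    if _mc2_eligible(leading_condition, world_size_across_dp,
                     pipeline_parallel_size):
        return MoECommType.MC2
    return MoECommType.ALLGATHER
```

Under these assumptions, `select_before("w4a8_dynamic", True, 8, 1)` yields `MoECommType.ALLTOALL`, while `select_after` with the same arguments yields `MoECommType.ALLGATHER`; the MC2 path is unchanged.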