[Bugfix] Fix acc bug when enabling dispatch_gmm_combine_decode and eplb (#5806)

### What this PR does / why we need it?

Fix an accuracy bug that occurs when dispatch_gmm_combine_decode and EPLB are both enabled. After EPLB, the expert table may change, so the expert IDs must be remapped, but the fused_mc2 path missed this mapping. For more information about this operator, please refer to the RFC issue: https://github.com/vllm-project/vllm-ascend/issues/5476

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Without this PR, qwen3-235b with EPLB and dispatch_gmm_combine_decode gets 3.33% accuracy on aime2024.

With this PR, qwen3-235b with EPLB was tested on a single A3 node (ep16).

Without dispatch_gmm_combine_decode:

| dataset | version | metric | mode | vllm-api-stream-chat |
|----- | ----- | ----- | ----- | -----|
| aime2024 | 604a78 | accuracy | gen | 86.67 |

With dispatch_gmm_combine_decode:

| dataset | version | metric | mode | vllm-api-stream-chat |
|----- | ----- | ----- | ----- | -----|
| aime2024 | 604a78 | accuracy | gen | 86.67 |

- vLLM version: v0.13.0
- vLLM main: 2f4e6548ef

Signed-off-by: wangqiankun <wangqiankun13@huawei.com>
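
The remapping step this patch adds (`topk_ids = log2phy[topk_ids]`) can be illustrated with a minimal pure-Python sketch. It assumes `log2phy` is a lookup table indexed by the expert ID that routing produced, and that `topk_ids` holds one row of selected expert IDs per token. The helper name `apply_log2phy` and the sample values are hypothetical and chosen only for illustration; the actual code performs the same lookup with tensor indexing.

```python
from typing import List, Optional, Sequence


def apply_log2phy(
    topk_ids: Sequence[Sequence[int]],
    log2phy: Optional[Sequence[int]],
) -> List[List[int]]:
    """Remap each selected expert ID through the log2phy table.

    Hypothetical helper: mirrors `log2phy[topk_ids]` from the patch,
    but on plain nested lists instead of tensors.
    """
    if log2phy is None:
        # No table supplied: IDs are used as-is (same guard as the patch).
        return [list(row) for row in topk_ids]
    return [[log2phy[expert_id] for expert_id in row] for row in topk_ids]


# Illustrative values only: 2 tokens, top-2 routing, 4 expert slots.
topk_ids = [[0, 2], [1, 3]]
log2phy = [4, 1, 5, 0]  # assumed table: entry i is the remapped ID for expert i

remapped = apply_log2phy(topk_ids, log2phy)
print(remapped)  # [[4, 5], [1, 0]]
```

If the table is skipped, tokens are dispatched using the original IDs even though the expert placement has changed, which is consistent with the accuracy drop reported for the unpatched fused_mc2 path.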
2026-01-15 09:21:18 +08:00
parent 7078dff691
commit d840f153f4
1 changed files with 6 additions and 1 deletions
--- a/vllm_ascend/ops/fused_moe/moe_comm_method.py
+++ b/vllm_ascend/ops/fused_moe/moe_comm_method.py
@@ -300,6 +300,11 @@ class FusedMC2CommImpl(MoECommMethod):

        assert isinstance(self.token_dispatcher, TokenDispatcherWithMC2), \
            "token_dispatcher must be an instance of TokenDispatcherWithMC2."
+
+        # Apply log2phy if needed
+        if log2phy is not None:
+            topk_ids = log2phy[topk_ids]
+
        group_list_type = None
        expert_tokens = None
        if envs_ascend.VLLM_ASCEND_ENABLE_FUSED_MC2 == 1:
@@ -331,7 +336,7 @@ class FusedMC2CommImpl(MoECommMethod):
                group_ep=self.token_dispatcher.moe_all_to_all_group_name,
                ep_rank_size=self.token_dispatcher.ep_world_size,
                ep_rank_id=self.token_dispatcher.ep_rank_id,
-                moe_expert_num=len(expert_map),
+                moe_expert_num=self.moe_config.num_experts,
                global_bs=self.token_dispatcher.fused_global_bs)
        else:
            raise ValueError(