[EPLB][Nightly][Bugfix] Get expert from moe layer only (#5908)

### What this PR does / why we need it?
1. If the model has dense layers, the current code attempts to obtain
the routing experts of those dense layers as well, which causes an
error. Fix: skip dense layers when collecting routing experts, since
only MoE layers have them.
2. The global_expert_map directly returned by the function degrades the
performance of dsv3.2; return the per-rank expert map from
init_eplb_config instead of indexing the global map at each call site.
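A minimal sketch of the dense-layer skip described in point 1. The class names here are illustrative stand-ins (the real layer types live in vllm-ascend); the point is that expert maps are gathered only from MoE layers, so dense layers no longer trigger an attribute error:

```python
class DenseMLP:
    """Stand-in for a dense FFN layer: it has no routing experts."""
    pass

class AscendFusedMoE:
    """Stand-in for a fused-MoE layer that carries an expert map."""
    def __init__(self, expert_map):
        self.expert_map = expert_map

def collect_expert_maps(layers):
    """Gather routing-expert maps, skipping dense layers entirely
    instead of trying (and failing) to read experts from them."""
    return [layer.expert_map for layer in layers
            if isinstance(layer, AscendFusedMoE)]

# Mixed model: dense layers are simply ignored.
layers = [DenseMLP(), AscendFusedMoE([0, 1]), DenseMLP(), AscendFusedMoE([2, 3])]
maps = collect_expert_maps(layers)
```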
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Verified that DeepSeek V3.1 conversation works normally.

#### aime precision test (dsv3.1)
baseline without eplb
| dataset | version | metric | mode | vllm-api-general-chat |
|----- | ----- | ----- | ----- | -----|
| aime2024 | 604a78 | accuracy | gen | 66.67 |

eplb
| dataset | version | metric | mode | vllm-api-general-chat |
|----- | ----- | ----- | ----- | -----|
| aime2024 | 604a78 | accuracy | gen | 70.00 |

- vLLM version: v0.13.0
- vLLM main:
11b6af5280

Signed-off-by: shenchuxiaofugui <1311027364@qq.com>
This commit is contained in:
LI SHENGYONG
2026-01-19 09:23:28 +08:00
committed by GitHub
parent ad3a1eaf70
commit 9fed2636cb
4 changed files with 12 additions and 13 deletions


```diff
@@ -202,10 +202,8 @@ class AscendFusedMoE(FusedMoE):
         # init moe
         eplb_config = ascend_config.eplb_config
-        self.global_expert_map, self.log2phy, self.global_redundant_expert_num = init_eplb_config(
+        self.global_expert_map, self._expert_map, self.log2phy, self.global_redundant_expert_num = init_eplb_config(
             eplb_config, self.moe_instance_id, self.moe_config)
-        if self.global_expert_map is not None:
-            self._expert_map = self.global_expert_map[self.ep_rank].npu()
         self.global_num_experts = num_experts + self.global_redundant_expert_num
         self.dynamic_eplb = eplb_config.dynamic_eplb and (self.log2phy
                                                           is not None)
```
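The shape of the change above can be sketched as follows. This is not the real init_eplb_config (whose signature and return values are in vllm-ascend); it only illustrates the idea of having the init function hand back this rank's slice of the expert map, so callers no longer index the global map themselves:

```python
def init_eplb_sketch(global_expert_map, ep_rank):
    """Illustrative only: return both the global map and the
    per-rank expert map, mirroring the extra return value the
    diff adds to init_eplb_config."""
    expert_map = None
    if global_expert_map is not None:
        # This rank's row: logical expert id -> local slot (-1 = not hosted).
        expert_map = global_expert_map[ep_rank]
    return global_expert_map, expert_map

# Toy global map: 2 EP ranks, 4 logical experts, 2 hosted per rank.
global_map = [[0, 1, -1, -1],   # rank 0 hosts logical experts 0 and 1
              [-1, -1, 0, 1]]   # rank 1 hosts logical experts 2 and 3
_, rank1_map = init_eplb_sketch(global_map, ep_rank=1)
```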