[smoke][bugfix] moe_init_routing_v2 active_expert_range use int type (#5521)

### What this PR does / why we need it?
The float kernel of `moe_init_routing_v2` used in the dispatch allgather
operation does not accept a tensor for `active_expert_range`; it only
accepts an int.
In PR #5311, to unify the variables `local_num_experts` and
`self.local_num_experts`, `self.local_num_experts` was used consistently.
Because that attribute is computed with `torch.sum(...)`, the parameter
that used to be a plain integer was silently passed on as a 0-dim tensor.
This PR converts the value back to a Python int.
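
A minimal sketch (assuming standard PyTorch semantics, not code from this repo) of how the tensor type leaked in: `torch.sum` over a boolean mask returns a 0-dim tensor, and `.item()` is what turns it back into a Python int.

```python
import torch

# Hypothetical expert map: -1 marks experts not owned by this rank.
expert_map = torch.tensor([0, 1, -1, -1, 2])

# Counting local experts with torch.sum yields a 0-dim tensor, not an int.
count = torch.sum(expert_map != -1)          # tensor(3)

# .item() extracts the Python scalar, which the int-only kernel expects.
count_int = torch.sum(expert_map != -1).item()   # 3, a plain int
```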

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
| Benchmark | Metric | ground_truth | measured | success |
|---|---|---|---|---|
| gsm8k | exact_match,strict-match | 0.89 | 0.8939 | |
| gsm8k | exact_match,flexible-extract | 0.85 | 0.856 | |
| ceval-valid | acc,none | 0.84 | 0.8373 | |

Model parameters:
{'pretrained': 'Qwen/Qwen3-30B-A3B', 'tensor_parallel_size': 2, 'dtype': 'auto', 'trust_remote_code': False, 'max_model_len': 4096, 'gpu_memory_utilization': 0.6, 'enable_expert_parallel': True}

- vLLM version: v0.13.0
- vLLM main: 45c1ca1ca1

Signed-off-by: shenchuxiaofugui <1311027364@qq.com>
Committed by LI SHENGYONG (via GitHub), 2025-12-31 09:19:04 +08:00
parent 2ee17e50a1, commit bdc721d35a
3 changed files with 4 additions and 3 deletions


@@ -180,7 +180,7 @@ class AscendFusedMoE(FusedMoE):
                 or ascend_config.expert_map_record_path) and (
                     self.log2phy is not None)
         self.local_num_experts = (torch.sum(
-            self._expert_map != -1) if self._expert_map is not None else
+            self._expert_map != -1).item() if self._expert_map is not None else
                                   self.global_num_experts)
         if self._expert_map is not None:
             logger.info_once(


@@ -335,7 +335,9 @@ class TokenDispatcherWithAllGather(MoETokenDispatcher):
         super().__init__(**kwargs)
         self.apply_router_weight_on_input = False
         self.max_num_tokens = kwargs.get("max_num_tokens")
-        self.num_experts_local = kwargs.get("num_local_experts", 0)
+        num_experts_local = kwargs.get("num_local_experts", 0)
+        self.num_experts_local = num_experts_local.item() if torch.is_tensor(
+            num_experts_local) else int(num_experts_local)
         self.original_shape = None
         self.with_quant = False
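
The dispatcher-side guard above can be sketched as a standalone helper; `to_int` is a hypothetical name for illustration, not part of the patch. It normalizes either a 0-dim tensor or a plain number to a Python int, so the downstream int-only kernel always sees the right type.

```python
import torch

def to_int(n):
    # Mirror of the pattern in the patch: unwrap a 0-dim tensor with
    # .item(), otherwise coerce the value to a plain Python int.
    return n.item() if torch.is_tensor(n) else int(n)

# Works whether the expert count arrives as a tensor or a number.
from_tensor = to_int(torch.sum(torch.tensor([0, 1, -1]) != -1))  # 2
from_number = to_int(3)                                          # 3
```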