[BugFix]Fix group list type of mc2. (#3864)

### What this PR does / why we need it?
Fix the precision issue caused by the inconsistency between the group
list type used by mc2 and that of eplb.

- vLLM version: v0.11.0rc3
- vLLM main:
83f478bb19

---------

Signed-off-by: offline0806 <3337230449@qq.com>
This commit is contained in:
offline893
2025-10-30 21:39:01 +08:00
committed by GitHub
parent 655a229455
commit 627f20ce26
3 changed files with 44 additions and 40 deletions

View File

@@ -130,7 +130,8 @@ class MoECommMethod(ABC):
dynamic_scale_for_share=dynamic_scale_for_share,
mc2_mask=mc2_mask,
apply_router_weight_on_input=apply_router_weight_on_input,
with_quant=use_int8_w8a8 or use_int4_w4a8)
with_quant=use_int8_w8a8 or use_int4_w4a8,
dynamic_eplb=dynamic_eplb)
permuted_hidden_states, expert_tokens, dynamic_scale, group_list_type, topk_scales, context_metadata = \
results["hidden_states"], results["group_list"], results.get("dynamic_scale"), results["group_list_type"], results.get("topk_scales"), results.get("context_metadata")