[CustomOp] support TensorList for dispatchFFNCombine (#5665)

### What this PR does / why we need it?

Support TensorList inputs for `dispatch_ffn_combine`, to accommodate EPLB.

### Does this PR introduce _any_ user-facing change?

N/A

### How was this patch tested?

Single-operator testing.

- vLLM version: v0.13.0
- vLLM main: 2f4e6548ef

---------

Signed-off-by: lhchg <lhao_cheng@163.com>
Co-authored-by: lihaocheng <lihaosheng1@h-partners.com>
2026-01-09 15:56:29 +08:00
parent 3ce5a34468
commit dc99cfdc15
16 changed files with 293 additions and 105 deletions
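The call-site change in `moe_comm_method.py` can be sketched roughly as follows. This is a minimal illustration, not the project's code: it assumes `w1`, `w2`, `w1_scale` and `w2_scale` are Python lists of per-group tensors, uses plain strings as stand-ins for tensors, and the helper names `args_before` and `args_after` are hypothetical. The real `torch.ops._C_ascend.dispatch_ffn_combine` op is not invoked here, since it needs an Ascend build.

```python
from typing import Any, Dict, Sequence


def args_before(w1: Sequence[Any], w2: Sequence[Any],
                w1_scale: Sequence[Any], w2_scale: Sequence[Any]) -> Dict[str, Any]:
    """Hypothetical helper: old call site, only element 0 of each list is passed."""
    return {
        "weight1": w1[0],
        "weight2": w2[0],
        "scale1": w1_scale[0],
        "scale2": w2_scale[0],
    }


def args_after(w1: Sequence[Any], w2: Sequence[Any],
               w1_scale: Sequence[Any], w2_scale: Sequence[Any]) -> Dict[str, Any]:
    """Hypothetical helper: new call site, the whole list is forwarded."""
    return {
        "weight1": list(w1),
        "weight2": list(w2),
        "scale1": list(w1_scale),
        "scale2": list(w2_scale),
    }


# Placeholder "tensors" (assumption: two entries per list).
w1, w2 = ["w1_a", "w1_b"], ["w2_a", "w2_b"]
s1, s2 = ["s1_a", "s1_b"], ["s2_a", "s2_b"]

before = args_before(w1, w2, s1, s2)
after = args_after(w1, w2, s1, s2)
```

With the old form, any entries beyond index 0 never reach the op; with the new form the op receives every entry and must itself accept a list of tensors.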
--- a/vllm_ascend/ops/fused_moe/moe_comm_method.py
+++ b/vllm_ascend/ops/fused_moe/moe_comm_method.py
@@ -306,11 +306,11 @@ class FusedMC2CommImpl(MoECommMethod):
            out = torch.empty_like(hidden_states)
            torch.ops._C_ascend.dispatch_ffn_combine(  # type: ignore
                x=hidden_states,
-                weight1=w1[0],
-                weight2=w2[0],
+                weight1=w1,
+                weight2=w2,
                expert_idx=topk_ids,
-                scale1=w1_scale[0],
-                scale2=w2_scale[0],
+                scale1=w1_scale,
+                scale2=w2_scale,
                probs=topk_weights.to(torch.float32),
                group=self.token_dispatcher.moe_all_to_all_group_name,
                max_output_size=65536,