[Qwen-moe] Remove the minor operation arange (#2373)

### What this PR does / why we need it?
Integrate the arange operator to reduce the time spent and improve
performance

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

- vLLM version: v0.10.1.1
- vLLM main:
56dcf4e7e9

---------

Signed-off-by: s30076806 <songjiayang2@h-partners.com>
This commit is contained in:
s30076806
2025-08-27 09:13:31 +08:00
committed by GitHub
parent 358ba68994
commit 6a4ec186e7
9 changed files with 80 additions and 79 deletions

View File

@@ -130,7 +130,7 @@ def forward_oot(
logical_to_physical_map: Optional[torch.Tensor] = None,
logical_replica_count: Optional[torch.Tensor] = None) -> torch.Tensor:
topk_weights, topk_ids = select_experts(
topk_weights, topk_ids, _ = select_experts(
hidden_states=x,
router_logits=router_logits,
top_k=top_k,