xc-llm-ascend/tests
sherie f86596a66c allgather use fusedop. (#2689)
### What this PR does / why we need it?
Use 'npu_moe_init_routing_v2' and 'npu_moe_token_unpermute' to replace
'npu_moe_init_routing', 'npu_moe_compute_expert_tokens', and
'npu_moe_finalize_routing' to optimize performance.
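
For readers unfamiliar with MoE routing, the following is a conceptual sketch in plain Python of the kind of data movement that a routing (permute) step and an unpermute step perform: rows are grouped by expert, processed, then scattered back and combined with the routing weights. It is not the torch_npu implementation. The function names `permute_tokens` and `unpermute_tokens` are hypothetical, and the real operators' signatures and exact semantics are not shown in this commit message, so none are assumed here.

```python
# Conceptual sketch only: plain Python, NOT the torch_npu kernels.
# Function names and data layout are hypothetical, chosen for illustration.

def permute_tokens(tokens, expert_ids):
    """Group one copy of each token per selected expert, ordered by expert id.

    tokens:     list of per-token vectors (lists of floats)
    expert_ids: expert_ids[t][k] is the k-th expert selected for token t
    Returns (permuted_rows, src, counts) where src[i] is the flat
    (token * top_k + k) index that row i came from, and counts maps
    each expert id to the number of rows routed to it.
    """
    top_k = len(expert_ids[0])
    flat = [(expert_ids[t][k], t * top_k + k)
            for t in range(len(tokens)) for k in range(top_k)]
    flat.sort()  # order by expert id, ties broken by flat index
    src = [idx for _, idx in flat]
    permuted = [list(tokens[idx // top_k]) for idx in src]
    counts = {}
    for expert, _ in flat:
        counts[expert] = counts.get(expert, 0) + 1
    return permuted, src, counts


def unpermute_tokens(expert_out, src, weights):
    """Scatter expert outputs back to token order and combine them.

    expert_out: rows in the permuted (expert-grouped) order
    src:        mapping returned by permute_tokens
    weights:    weights[t][k] is the routing weight of token t's k-th expert
    """
    top_k = len(weights[0])
    dim = len(expert_out[0])
    out = [[0.0] * dim for _ in range(len(weights))]
    for row, idx in zip(expert_out, src):
        t, k = divmod(idx, top_k)
        for d in range(dim):
            out[t][d] += weights[t][k] * row[d]
    return out


# Tiny usage example: two tokens, top-2 routing over two experts.
tokens = [[1.0, 2.0], [3.0, 4.0]]
expert_ids = [[1, 0], [0, 1]]
weights = [[0.5, 0.5], [0.25, 0.75]]

permuted, src, counts = permute_tokens(tokens, expert_ids)
# With identity "experts", the weighted combine returns the original tokens
# because each token's weights sum to 1.
restored = unpermute_tokens(permuted, src, weights)
```

Fusing these steps into fewer operators reduces the number of separate kernel launches and intermediate tensors, which is the likely source of the performance gain reported below.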
### Does this PR introduce _any_ user-facing change?
| branch | tps | TTFT | TPOT |
| --- | --- | --- | --- |
| main | 733.98 | 280.05 | 34.30 |
| main+fusedop | 740.33 | 273.34 | 33.99 |
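
To put the table in relative terms, the small sketch below computes the percentage change of each metric against `main`. The helper name `relative_change` is illustrative. The table does not state units, so only ratios are computed.

```python
# Relative change of each metric versus the main branch, using the
# figures from the table above. Units are not stated in the source.

def relative_change(baseline, candidate):
    """Return the change from baseline to candidate, in percent."""
    return (candidate - baseline) / baseline * 100.0


results = {
    "tps": (733.98, 740.33),   # throughput: higher is better
    "TTFT": (280.05, 273.34),  # time to first token: lower is better
    "TPOT": (34.30, 33.99),    # time per output token: lower is better
}

for name, (main, fused) in results.items():
    print(f"{name}: {relative_change(main, fused):+.2f}%")
```

This works out to roughly +0.87% tps, -2.40% TTFT, and -0.90% TPOT for the fused-operator branch.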
### How was this patch tested?


- vLLM version: v0.10.1.1
- vLLM main: 6997a25ac6

Signed-off-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
Co-authored-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
2025-09-04 11:56:29 +08:00