xc-llm-ascend/vllm_ascend
linfeng-yuan 79a910ef47 [bugfix][torchair] fix multistream_moe problems in torchair graph mode (#2681)
This PR fixes two problems that occur when `multistream_moe` is enabled in
torchair graph mode:
1. Check for `TorchairAscendW8A8DynamicFusedMoEMethod` instead of the incorrect
`AscendW8A8DynamicFusedMoEMethod`.
2. Chunk `mc2_mask` in the forward function of `TorchairAscendFusedMoE`
regardless of whether `replace_allreduce` is True or False.
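
The two fixes can be illustrated with a minimal, self-contained sketch. It does not reproduce the actual patch. The two classes below are local stand-ins that only share names with the real ones, and treating them as unrelated types is an assumption. The helpers `uses_torchair_w8a8_dynamic`, `split_even`, and `local_mc2_mask`, and the parameters `tp_size` and `tp_rank`, are hypothetical names introduced for illustration. Plain Python lists stand in for tensors.

```python
# Stand-in classes (NOT imported from vllm_ascend). Assumption: the torchair
# variant is a separate type, so checking the non-torchair class never matches.
class AscendW8A8DynamicFusedMoEMethod:
    pass


class TorchairAscendW8A8DynamicFusedMoEMethod:
    pass


def uses_torchair_w8a8_dynamic(quant_method):
    """Fix 1 (sketch): test against the torchair class, not the generic one."""
    return isinstance(quant_method, TorchairAscendW8A8DynamicFusedMoEMethod)


def split_even(items, parts):
    """Hypothetical helper: split a list into `parts` contiguous chunks."""
    size = -(-len(items) // parts)  # ceiling division
    return [items[i * size:(i + 1) * size] for i in range(parts)]


def local_mc2_mask(mc2_mask, tp_size, tp_rank, replace_allreduce):
    """Fix 2 (sketch): the mask is chunked on every path.

    `replace_allreduce` is accepted only to show that it no longer decides
    whether the mask is chunked.
    """
    if mc2_mask is None:
        return None
    return split_even(mc2_mask, tp_size)[tp_rank]


# Usage: each rank gets the same mask slice whatever `replace_allreduce` is.
mask = [True, True, True, False, False, False, False, False]
rank0_a = local_mc2_mask(mask, tp_size=2, tp_rank=0, replace_allreduce=True)
rank0_b = local_mc2_mask(mask, tp_size=2, tp_rank=0, replace_allreduce=False)
```

In this sketch, an `isinstance` check against the stand-in `AscendW8A8DynamicFusedMoEMethod` would return False for a torchair method object, which mirrors the kind of mismatch the first fix describes.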

- vLLM version: v0.10.2
- vLLM main: 0fb2551c23

Signed-off-by: linfeng-yuan <1102311262@qq.com>
2025-09-18 17:35:04 +08:00