xc-llm-ascend

yiz-liu 83eb40a51c [Fix][MoE] Refine MoE communication strategy (#2734)

### What this PR does / why we need it?
Refactors the selection logic for the Mixture-of-Experts (MoE)
communication method. The choice among all-gather, all-to-all, and mc2
now depends on the expert parallel configuration, the SoC version
(A2/A3), and the token count, with the goal of better performance.
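
As a rough illustration of what such a selection function could look like, here is a minimal sketch. It is not the code from this PR: the function name `select_moe_comm_method`, the `SocVersion` enum, the `mc2_token_capacity` parameter and its default, and the specific branching rules are all hypothetical placeholders chosen only to show a decision based on the three inputs named above.

```python
from enum import Enum


class SocVersion(Enum):
    """Hypothetical stand-in for the SoC versions mentioned in the PR."""
    A2 = "A2"
    A3 = "A3"


def select_moe_comm_method(
    enable_expert_parallel: bool,
    soc_version: SocVersion,
    num_tokens: int,
    mc2_token_capacity: int = 512,  # assumed threshold, not taken from the PR
) -> str:
    """Pick a MoE communication method from three inputs.

    The rules below are illustrative assumptions, not the PR's actual policy.
    """
    # Without expert parallelism, fall back to the simplest collective.
    if not enable_expert_parallel:
        return "allgather"
    # Assumed rule: small batches fit within the mc2 capacity.
    if num_tokens <= mc2_token_capacity:
        return "mc2"
    # Assumed rule: larger batches use a SoC-dependent fallback.
    if soc_version is SocVersion.A3:
        return "alltoall"
    return "allgather"
```

Keeping the decision in one pure function of the configuration, the SoC version, and the token count makes it straightforward to unit test each branch without any device.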

### Does this PR introduce _any_ user-facing change?
None.

### How was this patch tested?
Tests were added.


- vLLM version: v0.10.1.1
- vLLM main: eafa8dcde6

---------

Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>

2025-09-05 09:04:04 +08:00

test_input_batch.py

[CI] Fix broken CI (#2530)

2025-08-26 07:42:24 +08:00

test_model_runner_v1.py

[Fix][MoE] Refine MoE communication strategy (#2734)

2025-09-05 09:04:04 +08:00

test_worker_v1.py

[Main][Feat] Set the Profiler parameters through environment variables consistent with vLLM (#2608)

2025-09-03 10:58:08 +08:00