xc-llm-ascend

LI SHENGYONG 4e6dbe0956 [EPLB][Bugfix] Set parallel_config.enable_eplb to true to load redundant experts (#7470)

### What this PR does / why we need it?
PR https://github.com/vllm-project/vllm/pull/37136 broke EPLB because it
filters out redundant experts.
PR https://github.com/vllm-project/vllm/pull/37322 fixed this by using
`parallel_config.enable_eplb` to determine whether to skip the weight
loading filter.
In vllm-ascend, however, `parallel_config.enable_eplb` is always false, so
that fix has no effect there. This PR temporarily sets it to true when
EPLB is in use.
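
The "temporarily set it to true" step can be sketched as a context manager. This is a minimal illustration, not the actual vllm-ascend change: `temporarily_enable_eplb` and `load_expert_ids` are hypothetical names, the config is a plain stand-in object rather than vLLM's `ParallelConfig`, and restoring the previous value afterward is an assumption about what "temporarily" means here.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporarily_enable_eplb(parallel_config):
    """Set parallel_config.enable_eplb to True for the duration of the block.

    Hypothetical helper for illustration only.
    """
    previous = parallel_config.enable_eplb
    parallel_config.enable_eplb = True
    try:
        yield parallel_config
    finally:
        # Restore the original value even if weight loading raises.
        parallel_config.enable_eplb = previous


def load_expert_ids(parallel_config, num_logical, num_redundant):
    """Stand-in for a weight loader (assumed behavior, modeled on the
    description above): redundant experts are filtered out unless
    enable_eplb is set."""
    if parallel_config.enable_eplb:
        return list(range(num_logical + num_redundant))
    return list(range(num_logical))


# Stand-in config object; in vllm-ascend the flag is normally false.
config = SimpleNamespace(enable_eplb=False)

with temporarily_enable_eplb(config):
    loaded = load_expert_ids(config, num_logical=4, num_redundant=2)
```

Inside the `with` block the stand-in loader keeps all six expert IDs; once the block exits, `config.enable_eplb` is false again, so the rest of the code path sees the value it expects.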

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

![Snipaste_2026-03-19_16-13-01](https://github.com/user-attachments/assets/b3a4911e-36b3-4c31-951c-7c091f416d00)

| dataset | version | metric | mode | vllm-api-stream-chat |
| ----- | ----- | ----- | ----- | ----- |
| aime2024 | 604a78 | accuracy | gen | 86.67 |

Signed-off-by: shenchuxiaofugui <1311027364@qq.com>

2026-03-20 15:22:55 +08:00

| Name | Last commit | Date |
| ----- | ----- | ----- |
| 2-cards | [EPLB][Bugfix] Set parallel_config.enable_eplb to true to load redundant experts (#7470) | 2026-03-20 15:22:55 +08:00 |
| 4-cards | [eagle3][pcp] fix bug for eagle3 and cp enable (#7309) | 2026-03-17 16:14:45 +08:00 |