xc-llm-ascend

Files

LI SHENGYONG 4e6dbe0956 [EPLB][Bugfix] Set parallel_config.enable_eplb to true to load redundant experts (#7470)

### What this PR does / why we need it?
PR https://github.com/vllm-project/vllm/pull/37136 broke EPLB because
it filters out redundant experts.
PR https://github.com/vllm-project/vllm/pull/37322 fixed this by using
parallel_config.enable_eplb to decide whether to skip the weight
loading filter.
In vllm-ascend, however, parallel_config.enable_eplb is always false, so
when EPLB is in use we temporarily set it to true.
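
A minimal sketch of the "temporarily set it to true" idea, assuming the flag is restored once weight loading finishes. The helper names `temporarily_enable_eplb` and `should_skip_expert_filter` and the `SimpleNamespace` stand-in for `parallel_config` are hypothetical and are not taken from the vllm or vllm-ascend code; whether the actual patch uses a context manager is not stated here.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporarily_enable_eplb(parallel_config):
    """Force enable_eplb to True inside the block, then restore the old value."""
    previous = parallel_config.enable_eplb
    parallel_config.enable_eplb = True
    try:
        yield parallel_config
    finally:
        # Restore even if weight loading raises.
        parallel_config.enable_eplb = previous


def should_skip_expert_filter(parallel_config):
    # Hypothetical stand-in for the loader-side check: the filter that drops
    # redundant experts is skipped only when enable_eplb is true.
    return parallel_config.enable_eplb


# Stand-in config object (assumption): the flag is false by default.
config = SimpleNamespace(enable_eplb=False)

with temporarily_enable_eplb(config):
    seen_during_load = should_skip_expert_filter(config)

seen_after_load = should_skip_expert_filter(config)
```

Using `try`/`finally` keeps the override scoped to weight loading, so code that reads the flag afterwards sees its original value.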

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

![Snipaste_2026-03-19_16-13-01](https://github.com/user-attachments/assets/b3a4911e-36b3-4c31-951c-7c091f416d00)
| dataset | version | metric | mode | vllm-api-stream-chat |
|----- | ----- | ----- | ----- | -----|
| aime2024 | 604a78 | accuracy | gen | 86.67 |

Signed-off-by: shenchuxiaofugui <1311027364@qq.com>

2026-03-20 15:22:55 +08:00

310p

[Lint]Style: Convert test/ to ruff format(Batch #1) (#6738)

2026-03-10 09:52:50 +08:00

doctests

[Doc] Recover installation doc to use pip install (#4109)

2025-11-11 09:25:44 +08:00

models

[Bugfix] fix bug about model type of qwen3_vl_8b_instruct_w8a8 (#7383)

2026-03-18 20:30:03 +08:00

multicard

[EPLB][Bugfix] Set parallel_config.enable_eplb to true to load redundant experts (#7470)