xc-llm-ascend

Files

LI SHENGYONG 4e6dbe0956 [EPLB][Bugfix] Set parallel_config.enable_eplb to true to load redundant experts (#7470)

### What this PR does / why we need it?
PR https://github.com/vllm-project/vllm/pull/37136 broke EPLB because
it filters out redundant experts during weight loading.
PR https://github.com/vllm-project/vllm/pull/37322 fixed this by using
parallel_config.enable_eplb to decide whether to skip the weight
loading filter.
In vllm-ascend, however, parallel_config.enable_eplb is always false, so
when EPLB is in use we temporarily set it to true.
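
The "temporarily set it to true" pattern can be sketched as a context manager that flips the flag and restores the previous value afterwards. This is only an illustration, not the code from this PR: `ParallelConfig` here is a stand-in dataclass, and `eplb_enabled` and `load_expert_ids` are hypothetical helpers invented for the example.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class ParallelConfig:
    # Stand-in for the real config object; only the flag discussed above.
    enable_eplb: bool = False


@contextmanager
def eplb_enabled(parallel_config):
    """Temporarily force enable_eplb to True, restoring the old value on exit."""
    previous = parallel_config.enable_eplb
    parallel_config.enable_eplb = True
    try:
        yield parallel_config
    finally:
        parallel_config.enable_eplb = previous


def load_expert_ids(parallel_config, num_logical_experts, num_redundant_experts):
    """Hypothetical loader: drops redundant experts unless enable_eplb is set."""
    total = num_logical_experts + num_redundant_experts
    if parallel_config.enable_eplb:
        return list(range(total))
    return list(range(num_logical_experts))


config = ParallelConfig()
without_flag = load_expert_ids(config, 4, 2)   # redundant experts filtered out
with eplb_enabled(config):
    with_flag = load_expert_ids(config, 4, 2)  # redundant experts kept
```

Restoring the flag in a `finally` block keeps the rest of the code path seeing the original value even if loading raises.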

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

![Snipaste_2026-03-19_16-13-01](https://github.com/user-attachments/assets/b3a4911e-36b3-4c31-951c-7c091f416d00)

| dataset | version | metric | mode | vllm-api-stream-chat |
|----- | ----- | ----- | ----- | -----|
| aime2024 | 604a78 | accuracy | gen | 86.67 |

Signed-off-by: shenchuxiaofugui <1311027364@qq.com>

2026-03-20 15:22:55 +08:00

spec_decode

[Test][Feature] Add e2e test for QuaRot model with eagle3 (#7128)

2026-03-16 15:35:55 +08:00

test_aclgraph_capture_replay.py

Revert "[CI] fix skiped e2e test when upgrade vllm version (#6654)" (#7166)

2026-03-11 23:03:15 +08:00

test_data_parallel.py

Main2main upgrade vllm to 0318 commit (#7412)

2026-03-19 17:17:36 +08:00

test_disaggregated_encoder.py

Main2main upgrade to vllm 0317 afternoon (#7409)