eplb redundant expert bugfix (#4291)

### What this PR does / why we need it?
Fixes a bug in redundant expert handling for EPLB.
### Does this PR introduce _any_ user-facing change?
After configuring the path to the experts_map file, users no longer need to
set init_redundancy_expert.
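
The idea behind this change can be sketched as follows: if the expert map already records which logical expert sits in each physical slot, the redundancy count is derivable from the map itself rather than from a separate init_redundancy_expert setting. This is a minimal illustrative sketch, not the actual vLLM-Ascend implementation; the function name and map layout are assumptions.

```python
def infer_redundancy(expert_map: dict[int, int]) -> int:
    """Derive the number of redundant experts from a placement map.

    Hypothetical layout: expert_map maps physical slot ids to logical
    expert ids. Any logical expert occupying more than one slot is a
    redundant replica, so redundancy = slots - distinct logical experts.
    """
    num_slots = len(expert_map)
    num_logical = len(set(expert_map.values()))
    return num_slots - num_logical


# Logical expert 0 is placed in two slots, so one replica is redundant.
placement = {0: 0, 1: 1, 2: 2, 3: 0}
print(infer_redundancy(placement))  # 1
```

With this derivation, a mismatch between a user-supplied redundancy value and the map becomes impossible, which is presumably why the separate setting is no longer needed once the map path is configured.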
### How was this patch tested?
The accuracy of EPLB was tested with and without the use of redundant
experts.


- vLLM version: v0.11.0
- vLLM main:
2918c1b49c

---------

Signed-off-by: shenchuxiaofugui <1311027364@qq.com>
Author: LI SHENGYONG
Date: 2025-11-21 14:24:35 +08:00
Committed by: GitHub
Commit: 019c7ded91 (parent 5a4e8cdeba)
10 changed files with 63 additions and 140 deletions


@@ -12,6 +12,13 @@ Expert balancing for MoE models in LLM serving is essential for optimal performa
- Adaptive Scaling: Automatically adjusts to workload fluctuations while maintaining stable performance.
- Fault Tolerance: Redundant expert placement ensures system resilience during hardware failures.
## Support Scenarios
### Models:
DeepseekV3/V3.1/R1, Qwen3-MOE
### MOE QuantType:
W8A8-dynamic
## How to Use EPLB
### Dynamic EPLB