[BugFix] Modify EPLB feature guide. (#3183)

### What this PR does / why we need it?
Revise the EPLB feature guide content and add the EPLB parameters to the Ascend config.
### Does this PR introduce any user-facing change?
### How was this patch tested?


- vLLM version: v0.10.2
- vLLM main: 52d0cb8458

Co-authored-by: offline0806 <3337230449@qq.com>
Author: offline893
Date: 2025-09-25 17:01:51 +08:00
Committed by: GitHub
Parent: 07f4710216
Commit: 5d13bbe796
2 changed files with 90 additions and 35 deletions

@@ -36,6 +36,12 @@ The following table lists the additional configuration options available in vLLM
| `lmhead_tensor_parallel_size` | int | `None` | The custom tensor parallel size of lmhead. |
| `oproj_tensor_parallel_size` | int | `None` | The custom tensor parallel size of oproj. |
| `multistream_overlap_shared_expert`| bool | `False` | Whether to enable multistream shared expert. This option only takes effects on moe models with shared experts. |
| `dynamic_eplb` | bool | `False` | Whether to enable dynamic EPLB. |
| `num_iterations_eplb_update` | int | `400` | The number of forward iterations after which EPLB begins to update. |
| `gate_eplb` | bool | `False` | Whether to run EPLB only once. |
| `num_wait_worker_iterations` | int | `30` | The number of forward iterations within which the EPLB worker finishes its CPU task. In our tests, the default value of 30 covers most cases. |
| `expert_map_record_path` | str | `None` | When dynamic EPLB completes, save the current expert load heatmap to this path. |
| `init_redundancy_expert` | int | `0` | The number of redundant experts to allocate during initialization. |
The details of each config option are as follows:
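
These options are consumed through vLLM's `additional_config` engine argument. As a minimal sketch of how they might be wired together (the model name, parallel sizes, and record path below are illustrative placeholders, not part of this patch):

```python
from vllm import LLM

# Hypothetical example: enable dynamic EPLB on a MoE model via the
# Ascend additional_config. Model name and parallel settings are
# placeholders chosen for illustration only.
llm = LLM(
    model="deepseek-ai/DeepSeek-V2-Lite",
    tensor_parallel_size=4,
    enable_expert_parallel=True,
    additional_config={
        "dynamic_eplb": True,                # turn on dynamic EPLB
        "num_iterations_eplb_update": 400,   # forward iterations before EPLB updates
        "num_wait_worker_iterations": 30,    # iterations for the EPLB worker's CPU task
        "gate_eplb": False,                  # keep rebalancing, don't run EPLB only once
        "init_redundancy_expert": 2,         # allocate 2 redundant experts at init
        "expert_map_record_path": "/tmp/expert_load.json",  # dump expert load heatmap here
    },
)
```

The same dictionary can be passed on the command line as a JSON string, e.g. `vllm serve ... --additional-config '{"dynamic_eplb": true}'`.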