[Doc] EPLB update the documentation (#8795)
### What this PR does / why we need it?
1. Update documentation: in the current version, we recommend the swift
balancer policy (`eplb_policy_type` 2).
### Does this PR introduce _any_ user-facing change?
--additional-config '{ "eplb_config": {
"dynamic_eplb": true,
"expert_heat_collection_interval": 600,
"algorithm_execution_interval": 50,
"eplb_policy_type": 2,
"num_redundant_experts": {ep_size}
}}'
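
The `{ep_size}` placeholder above has to be replaced with a concrete integer before the JSON is valid. A minimal sketch of one way to render the value, assuming `{ep_size}` stands for the expert-parallel size (for example 32, as in the EP32 test setup below). The helper name `build_eplb_additional_config` is hypothetical and not part of vLLM; the keys and values are copied from the config above.

```python
import json
import shlex


def build_eplb_additional_config(ep_size: int) -> str:
    """Hypothetical helper: render the JSON passed to --additional-config."""
    if ep_size <= 0:
        raise ValueError("ep_size must be a positive integer")
    config = {
        "eplb_config": {
            "dynamic_eplb": True,
            "expert_heat_collection_interval": 600,
            "algorithm_execution_interval": 50,
            "eplb_policy_type": 2,  # swift balancer, per this PR
            "num_redundant_experts": ep_size,  # assumption: {ep_size} is the EP size
        }
    }
    # json.dumps guarantees well-formed JSON (no trailing commas).
    return json.dumps(config)


if __name__ == "__main__":
    rendered = build_eplb_additional_config(ep_size=32)
    # shlex.quote wraps the JSON so it survives shell word splitting.
    print("--additional-config " + shlex.quote(rendered))
```

Generating the string with `json.dumps` rather than editing it by hand avoids quoting and comma mistakes when the value is pasted into a `vllm serve` command line.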
### How was this patch tested?
Tested on DSV3.1 with EP32.
Signed-off-by: shenchuxiaofugui <1311027364@qq.com>
@@ -26,7 +26,7 @@ W8A8-Dynamic
 
 ### Dynamic EPLB
 
-We need to add environment variable `export DYNAMIC_EPLB="true"` to enable vLLM EPLB. Enable dynamic balancing with auto-tuned parameters. Adjust expert_heat_collection_interval and algorithm_execution_interval based on workload patterns.
+We need to add environment variable `export DYNAMIC_EPLB="true"` to enable vLLM EPLB. Enable dynamic balancing with auto-tuned parameters. Adjust expert_heat_collection_interval and algorithm_execution_interval based on workload patterns. In the current version, we recommend using the following: policy of swift balancer(2).
 
 ```shell
 vllm serve Qwen/Qwen3-235B-A22 \
@@ -34,8 +34,10 @@ vllm serve Qwen/Qwen3-235B-A22 \
 --enable-expert-parallel \
 --additional-config '{ "eplb_config": {
 "dynamic_eplb": true,
-"expert_heat_collection_interval": 400,
+"expert_heat_collection_interval": 600,
-"algorithm_execution_interval": 30
+"algorithm_execution_interval": 50,
+"eplb_policy_type": 2,
+"num_redundant_experts": {ep_size},
 }}'
 ```
 