[EPLB]Record expert map without dynamic eplb. (#3409)

What this PR does / why we need it?
1. Record expert map without dynamic eplb.
2. Add export PYTHONOPTIMIZE=1 when using dynamic eplb.
3. change eplb doc

Does this PR introduce any user-facing change?

How was this patch tested?
Qwen3_moe in A3.

- vLLM version: v0.11.0

---------

Signed-off-by: offline0806 <3337230449@qq.com>
Co-authored-by: offline0806 <3337230449@qq.com>
2025-10-15 14:21:15 +08:00
parent 4f937f561d
commit 5a3082cd15
9 changed files with 49 additions and 15 deletions
--- a/docs/source/user_guide/feature_guide/eplb_swift_balancer.md
+++ b/docs/source/user_guide/feature_guide/eplb_swift_balancer.md
@@ -16,7 +16,7 @@ Expert balancing for MoE models in LLM serving is essential for optimal performa

 ### Dynamic EPLB

-Enable dynamic balancing with auto-tuned parameters. Adjust num_iterations_eplb_update and num_wait_worker_iterations based on workload patterns.
+We need to add environment variable `export PYTHONOPTIMIZE=1` to get context of vllm process. Enable dynamic balancing with auto-tuned parameters. Adjust num_iterations_eplb_update and num_wait_worker_iterations based on workload patterns.

 ```shell
 vllm serve Qwen/Qwen3-235B-A22 \
@@ -25,7 +25,6 @@ vllm serve Qwen/Qwen3-235B-A22 \
  --additional-config '{
    "dynamic_eplb": true,
    "num_iterations_eplb_update": 400,
-    "gate_eplb": true,
    "num_wait_worker_iterations": 30
  }'
 ```
@@ -42,9 +41,7 @@ vllm serve Qwen/Qwen3-235B-A22 \
  --additional-config '{
    "expert_map_record_path": "/path/to/eplb.json",
    "init_redundancy_expert": 16,
-    "dynamic_eplb": true,
    "num_iterations_eplb_update": 400,
-    "gate_eplb": true,
    "num_wait_worker_iterations": 30
  }'
 ```
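
To summarize the two configurations the doc describes after this change, here is a minimal sketch that builds the `--additional-config` JSON for each mode. The helper `build_eplb_launch` is hypothetical and not part of vLLM or vllm-ascend; it uses only the keys and values visible in the hunks above, and it leaves out the other `vllm serve` flags that the diff elides. It assumes, per the doc change, that `PYTHONOPTIMIZE=1` is needed only when `dynamic_eplb` is enabled.

```python
import json
import shlex

MODEL = "Qwen/Qwen3-235B-A22"  # model name as written in the doc


def build_eplb_launch(dynamic, record_path=None):
    """Hypothetical helper: return (extra_env, shell_command) for an EPLB run.

    Only the config keys shown in the diff are used; other serve flags
    from the doc are elided in the hunk and therefore omitted here.
    """
    config = {
        "num_iterations_eplb_update": 400,
        "num_wait_worker_iterations": 30,
    }
    extra_env = {}
    if dynamic:
        # Dynamic EPLB: the updated doc asks for PYTHONOPTIMIZE=1 in this mode.
        config["dynamic_eplb"] = True
        extra_env["PYTHONOPTIMIZE"] = "1"
    if record_path is not None:
        # Recording the expert map no longer requires "dynamic_eplb".
        config["expert_map_record_path"] = record_path
        config["init_redundancy_expert"] = 16
    command = " ".join(
        ["vllm", "serve", MODEL, "--additional-config",
         shlex.quote(json.dumps(config))]
    )
    return extra_env, command


# Print both variants: dynamic balancing, then record-only.
for kwargs in ({"dynamic": True},
               {"dynamic": False, "record_path": "/path/to/eplb.json"}):
    env, cmd = build_eplb_launch(**kwargs)
    prefix = " ".join(f"{k}={v}" for k, v in env.items())
    print((prefix + " " + cmd).strip())
```

The sketch only prints the commands; it does not launch a server. Removing `"gate_eplb"` from both examples mirrors the diff, which drops that key from the documented configurations.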