[main][Docs] Fix typos across documentation (#6728)
## Summary
Fix typos and improve grammar consistency across 50 documentation files.
### Changes include:
- Spelling corrections (e.g., "Facotory" → "Factory", "certainty" →
"determinism")
- Grammar improvements (e.g., "multi-thread" → "multi-threaded",
"re-routed" → "re-run")
- Punctuation fixes (semicolon consistency in filter parameters)
- Code style fixes (correct flag name `--num-prompts` instead of
`--num-prompt`)
- Capitalization consistency (e.g., "python" → "Python", "ascend" →
"Ascend")
- vLLM version: v0.15.0
- vLLM main:
9562912cea
---------
Signed-off-by: SlightwindSec <slightwindsec@gmail.com>
This commit is contained in:
@@ -16,17 +16,17 @@ Expert balancing for MoE models in LLM serving is essential for optimal performa
|
||||
|
||||
### Models
|
||||
|
||||
DeepseekV3/V3.1/R1、Qwen3-MOE
|
||||
DeepSeekV3/V3.1/R1, Qwen3-MoE
|
||||
|
||||
### MOE QuantType
|
||||
|
||||
W8A8-dynamic
|
||||
W8A8-Dynamic
|
||||
|
||||
## How to Use EPLB
|
||||
|
||||
### Dynamic EPLB
|
||||
|
||||
We need to add environment variable `export DYNAMIC_EPLB="true"` to enable vllm eplb. Enable dynamic balancing with auto-tuned parameters. Adjust expert_heat_collection_interval and algorithm_execution_interval based on workload patterns.
|
||||
We need to add environment variable `export DYNAMIC_EPLB="true"` to enable vLLM EPLB. Enable dynamic balancing with auto-tuned parameters. Adjust expert_heat_collection_interval and algorithm_execution_interval based on workload patterns.
|
||||
|
||||
```shell
|
||||
vllm serve Qwen/Qwen3-235B-A22 \
|
||||
@@ -87,7 +87,7 @@ vllm serve Qwen/Qwen3-235B-A22 \
|
||||
|
||||
4. Monitoring & Validation:
|
||||
- Track metrics: expert_load_balance_ratio, ttft_p99, tpot_avg, and gpu_utilization.
|
||||
- Use vllm monitor to detect imbalances during runtime.
|
||||
- Use vLLM monitor to detect imbalances during runtime.
|
||||
- Always verify expert map JSON structure before loading (validate with jq or similar tools).
|
||||
|
||||
5. Startup Behavior:
|
||||
|
||||
Reference in New Issue
Block a user