[main][Docs] Fix typos across documentation (#6728)

## Summary

Fix typos and improve grammar consistency across 50 documentation files.
 
### Changes include:
- Spelling corrections (e.g., "Facotory" → "Factory", "certainty" →
"determinism")
- Grammar improvements (e.g., "multi-thread" → "multi-threaded",
"re-routed" → "re-run")
- Punctuation fixes (semicolon consistency in filter parameters)
- Code style fixes (correct flag name `--num-prompts` instead of
`--num-prompt`)
- Capitalization consistency (e.g., "python" → "Python", "ascend" →
"Ascend")
- vLLM version: v0.15.0
- vLLM main:
9562912cea

---------

Signed-off-by: SlightwindSec <slightwindsec@gmail.com>
Author: Cao Yi
Date: 2026-02-13 15:50:05 +08:00
Committed by: GitHub
Parent commit: b6bc3d2f9d
Commit: 6de207de88
50 changed files with 273 additions and 272 deletions


@@ -1,6 +1,6 @@
# Additional Configuration
-Additional configuration is a mechanism provided by vLLM to allow plugins to control inner behavior by themselves. VLLM Ascend uses this mechanism to make the project more flexible.
+Additional configuration is a mechanism provided by vLLM to allow plugins to control internal behavior by themselves. VLLM Ascend uses this mechanism to make the project more flexible.
## How to use
@@ -26,25 +26,25 @@ The following table lists additional configuration options available in vLLM Asc
| Name | Type | Default | Description |
|-------------------------------------|------|---------|-----------------------------------------------------------------------------------------------------------|
-| `xlite_graph_config` | dict | `{}` | Configuration options for xlite graph mode |
+| `xlite_graph_config` | dict | `{}` | Configuration options for Xlite graph mode |
| `weight_prefetch_config` | dict | `{}` | Configuration options for weight prefetch |
| `finegrained_tp_config` | dict | `{}` | Configuration options for module tensor parallelism |
| `ascend_compilation_config` | dict | `{}` | Configuration options for ascend compilation |
| `eplb_config` | dict | `{}` | Configuration options for ascend compilation |
-| `npugraph_ex_config` | dict | `{}` | Configuration options for npugraph_ex backend |
+| `npugraph_ex_config` | dict | `{}` | Configuration options for Npugraph_ex backend |
| `refresh` | bool | `false` | Whether to refresh global Ascend configuration content. This is usually used by rlhf or ut/e2e test case. |
| `dump_config_path` | str | `None` | Configuration file path for msprobe dump(eager mode). |
-| `enable_async_exponential` | bool | `False` | Whether to enable async exponential overlap. To enable async exponential, set this config to True. |
+| `enable_async_exponential` | bool | `False` | Whether to enable asynchronous exponential overlap. To enable asynchronous exponential, set this config to True. |
| `enable_shared_expert_dp` | bool | `False` | When the expert is shared in DP, it delivers better performance but consumes more memory. Currently only DeepSeek series models are supported. |
-| `multistream_overlap_shared_expert` | bool | `False` | Whether to enable multistream shared expert. This option only takes effect on MoE models with shared experts. |
-| `multistream_overlap_gate` | bool | `False` | Whether to enable multistream overlap gate. This option only takes effect on MoE models with shared experts. |
+| `multistream_overlap_shared_expert` | bool | `False` | Whether to enable multi-stream shared expert. This option only takes effect on MoE models with shared experts. |
+| `multistream_overlap_gate` | bool | `False` | Whether to enable multi-stream overlap gate. This option only takes effect on MoE models with shared experts. |
| `recompute_scheduler_enable` | bool | `False` | Whether to enable recompute scheduler. |
-| `enable_cpu_binding` | bool | `False` | Whether to enable CPU binding. |
-| `SLO_limits_for_dynamic_batch` | int | `-1` | SLO limits for dynamic batch. This is new scheduler to support dynamic feature |
-| `enable_npugraph_ex` | bool | `False` | Whether to enable npugraph ex graph mode. |
+| `enable_cpu_binding` | bool | `False` | Whether to enable CPU Binding. |
+| `SLO_limits_for_dynamic_batch` | int | `-1` | SLO limits for dynamic batch. This is new scheduler to support dynamic batch feature |
+| `enable_npugraph_ex` | bool | `False` | Whether to enable npugraph_ex graph mode. |
| `pa_shape_list` | list | `[]` | The custom shape list of page attention ops. |
-| `enable_kv_nz` | bool | `False` | Whether to enable kvcache NZ layout. This option only takes effects on models using MLA (e.g., DeepSeek). |
-| `layer_sharding` | dict | `{}` | Configuration options for layer sharding linear |
+| `enable_kv_nz` | bool | `False` | Whether to enable KV cache NZ layout. This option only takes effects on models using MLA (e.g., DeepSeek). |
+| `layer_sharding` | dict | `{}` | Configuration options for Layer Sharding Linear |
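The options in the table above are passed to vLLM Ascend through vLLM's `additional_config` mechanism. As a hedged sketch (the model name and option values below are illustrative, not recommendations), the dictionary would be assembled like this and handed to the engine, e.g. as `LLM(model=..., additional_config=...)` in Python or as a JSON string on the command line:

```python
import json

# Illustrative additional_config for vLLM Ascend; key names follow the
# table above, values are examples only.
additional_config = {
    "xlite_graph_config": {"enabled": True, "full_mode": False},
    "multistream_overlap_shared_expert": True,
    "enable_kv_nz": False,
}

# On the CLI the same settings would be supplied as a JSON string, e.g.
#   vllm serve <model> --additional-config '<json>'
additional_config_json = json.dumps(additional_config)
print(additional_config_json)
```

The JSON form and the Python dict are interchangeable; nested dicts such as `xlite_graph_config` carry the sub-options documented in the detail tables below.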
The details of each configuration option are as follows:
@@ -52,8 +52,8 @@ The details of each configuration option are as follows:
| Name | Type | Default | Description |
| ---- | ---- | ------- | ----------- |
-| `enabled` | bool | `False` | Whether to enable xlite graph mode. Currently only Llama, Qwen dense series models, and Qwen3-vl are supported. |
-| `full_mode` | bool | `False` | Whether to enable xlite for both the prefill and decode stages. By default, xlite is only enabled for the decode stage. |
+| `enabled` | bool | `False` | Whether to enable Xlite graph mode. Currently only Llama, Qwen dense series models, and Qwen3-VL are supported. |
+| `full_mode` | bool | `False` | Whether to enable Xlite for both the prefill and decode stages. By default, Xlite is only enabled for the decode stage. |
**weight_prefetch_config**
@@ -66,8 +66,8 @@ The details of each configuration option are as follows:
| Name | Type | Default | Description |
| ---- | ---- | ------- | ----------- |
-| `lmhead_tensor_parallel_size` | int | `0` | The custom tensor parallel size of lmhead. |
-| `oproj_tensor_parallel_size` | int | `0` | The custom tensor parallel size of oproj. |
+| `lmhead_tensor_parallel_size` | int | `0` | The custom tensor parallel size of lm_head. |
+| `oproj_tensor_parallel_size` | int | `0` | The custom tensor parallel size of o_proj. |
| `embedding_tensor_parallel_size` | int | `0` | The custom tensor parallel size of embedding. |
| `mlp_tensor_parallel_size` | int | `0` | The custom tensor parallel size of mlp. |
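The four per-module sizes above override the global tensor parallel degree for individual modules. As a minimal sketch of how such a config could be sanity-checked before launch (the function name and the divisibility rule are illustrative assumptions, not vLLM Ascend's actual validation logic):

```python
# Hypothetical pre-flight check for a finegrained_tp_config dict. Assumed
# rule: each non-zero per-module TP size must evenly divide the number of
# devices; 0 (the table's default) means "use the global TP size".
def check_finegrained_tp(config: dict, world_size: int) -> list[str]:
    """Return a list of human-readable problems found in the config."""
    problems = []
    for key in ("lmhead_tensor_parallel_size",
                "oproj_tensor_parallel_size",
                "embedding_tensor_parallel_size",
                "mlp_tensor_parallel_size"):
        size = config.get(key, 0)
        if size and world_size % size != 0:
            problems.append(
                f"{key}={size} does not divide world_size={world_size}")
    return problems

# Example: lm_head TP of 4 divides 8 devices, o_proj TP of 3 does not.
print(check_finegrained_tp(
    {"lmhead_tensor_parallel_size": 4, "oproj_tensor_parallel_size": 3},
    world_size=8,
))
```

A check of this shape catches misconfigured module-level TP sizes at startup rather than at first collective call.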