[main][Docs] Fix spelling errors across documentation (#6649)

Fix various spelling mistakes in the project documentation to improve
clarity and correctness.
- vLLM version: v0.15.0
- vLLM main: d7e17aaacd

---------

Signed-off-by: SlightwindSec <slightwindsec@gmail.com>
This commit is contained in:
Cao Yi
2026-02-10 11:14:57 +08:00
committed by GitHub
parent 5b8e47cb68
commit 1c7d1163f5
30 changed files with 67 additions and 67 deletions

View File

@@ -109,7 +109,7 @@ if self.speculative_config:
got {self.decode_threshold}"
```
-## Limitation
+## Limitations
- Because only a single layer of weights is exposed in DeepSeek's MTP, accuracy and performance are not effectively guaranteed in scenarios where MTP > 1 (especially MTP ≥ 3). Moreover, due to current operator limitations, MTP supports a maximum of 15.
- In the fullgraph mode with MTP > 1, the capture size of each aclgraph must be an integer multiple of (num_speculative_tokens + 1).
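The capture-size rule above can be sketched as a small validation helper. This is a hypothetical illustration, not part of vLLM Ascend; the function name and signature are assumptions.

```python
def validate_capture_sizes(capture_sizes, num_speculative_tokens):
    """Check that every aclgraph capture size is an integer multiple of
    (num_speculative_tokens + 1), as required in fullgraph mode with MTP > 1."""
    step = num_speculative_tokens + 1
    bad = [size for size in capture_sizes if size % step != 0]
    if bad:
        raise ValueError(
            f"capture sizes {bad} are not multiples of {step} "
            f"(num_speculative_tokens + 1)")
    return True
```

For example, with `num_speculative_tokens = 3`, capture sizes would need to be multiples of 4.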

View File

@@ -29,7 +29,7 @@ However, this may not be the optimal configuration for your scenario. For higher
Notices:
1) Weight prefetch of the MLP `down` projection depends on sequence parallelism; if you want to enable it for the MLP `down` projection, please also enable sequence parallel.
-2) Due to the current size of the L2 cache, the maximum prefetch cannot exceed 18MB. If `prefetch_ration * lineaer_layer_weight_size >= 18 * 1024 * 1024` bytes, the backend will only prefetch 18MB.
+2) Due to the current size of the L2 cache, the maximum prefetch cannot exceed 18MB. If `prefetch_ratio * linear_layer_weight_size >= 18 * 1024 * 1024` bytes, the backend will only prefetch 18MB.
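The 18MB cap described above amounts to a simple clamp. A minimal sketch, assuming illustrative names (`effective_prefetch_bytes` and `L2_PREFETCH_CAP` are not the actual vLLM Ascend identifiers):

```python
# Maximum prefetch size imposed by the L2 cache, per the note above.
L2_PREFETCH_CAP = 18 * 1024 * 1024  # 18MB in bytes

def effective_prefetch_bytes(prefetch_ratio, linear_layer_weight_size):
    """Return the number of bytes the backend would actually prefetch:
    the requested amount, clamped to the 18MB L2-cache cap."""
    requested = int(prefetch_ratio * linear_layer_weight_size)
    return min(requested, L2_PREFETCH_CAP)
```

So a request of 20MB would be clamped to 18MB, while a 2MB request passes through unchanged.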
## Example

View File

@@ -341,7 +341,7 @@ This is the first release candidate of v0.12.0 for vLLM Ascend. We landed lots o
- [Experimental] [KV cache pool](https://docs.vllm.ai/projects/ascend/en/latest/developer_guide/feature_guide/KV_Cache_Pool_Guide.html) feature is added
- [Experimental] A new graph mode `xlite` is introduced. It performs well with some models. Follow the [official tutorial](https://docs.vllm.ai/projects/ascend/en/latest/user_guide/feature_guide/graph_mode.html#using-xlitegraph) to start using it.
- LLMdatadist kv connector is removed. Please use mooncake connector instead.
-- Ascend scheduler is removed. `--additional-config {"ascend_scheudler": {"enabled": true}` doesn't work anymore.
+- Ascend scheduler is removed. `--additional-config {"ascend_scheduler": {"enabled": true}}` doesn't work anymore.
- Torchair graph mode is removed. `--additional-config {"torchair_graph_config": {"enabled": true}}` doesn't work anymore. Please use aclgraph instead.
- `VLLM_ASCEND_ENABLE_TOPK_TOPP_OPTIMIZATION` env is removed. This feature is stable enough. We enable it by default now.
- The speculative decoding method `Ngram` is back now.
@@ -637,7 +637,7 @@ This is the 3rd release candidate of v0.9.1 for vLLM Ascend. Please follow the [
- Fix header include issue in rope [#2398](https://github.com/vllm-project/vllm-ascend/pull/2398)
- Fix mtp config bug [#2412](https://github.com/vllm-project/vllm-ascend/pull/2412)
- Fix error info and adapt `attn_metedata` refactor [#2402](https://github.com/vllm-project/vllm-ascend/pull/2402)
-- Fix torchair runtime error caused by configuration mismtaches and `.kv_cache_bytes` file missing [#2312](https://github.com/vllm-project/vllm-ascend/pull/2312)
+- Fix torchair runtime error caused by configuration mismatches and `.kv_cache_bytes` file missing [#2312](https://github.com/vllm-project/vllm-ascend/pull/2312)
- Move `with_prefill` allreduce from cpu to npu [#2230](https://github.com/vllm-project/vllm-ascend/pull/2230)
### Docs