[Doc][Misc] Comprehensive documentation cleanup and grammatical fixes (#8073)
What this PR does / why we need it?

This pull request performs a comprehensive cleanup of the vLLM Ascend documentation. It fixes numerous typos, grammatical errors, and phrasing issues across community guidelines, developer documents, hardware tutorials, and feature guides. Key improvements include correcting hardware names (e.g., Atlas 300I), fixing broken links, cleaning up code examples (removing duplicate flags and trailing commas), and improving the clarity of technical explanations. These changes are necessary to ensure the documentation is professional, accurate, and easy for users to follow.

Does this PR introduce any user-facing change?

No, this PR contains documentation-only updates.

How was this patch tested?

The changes were manually reviewed for accuracy and grammatical correctness. No functional code changes were introduced.

---------

Signed-off-by: herizhen <1270637059@qq.com>
Signed-off-by: herizhen <59841270+herizhen@users.noreply.github.com>
@@ -2,7 +2,7 @@
 
 ```{note}
 1. This Atlas 300I series is currently experimental. In future versions, there may be behavioral changes related to model coverage and performance improvement.
-2. Currently, the 310I series only supports eager mode and the float16 data type.
+2. Currently, the Atlas 300I series only supports eager mode and the float16 data type.
 ```
 
 ## Run vLLM on Atlas 300I Series
@@ -180,7 +180,6 @@ Run the following script (`example.py`) to execute offline inference on NPU:
 
 ```{code-block} python
 :substitutions:
-from vllm import LLM, SamplingParams
 import gc
 import torch
 from vllm import LLM, SamplingParams
@@ -204,7 +203,7 @@ llm = LLM(
     tensor_parallel_size=1,
     max_model_len=4096,
     enforce_eager=True, # For 300I series, only eager mode is supported.
-    dtype="float16", # IMPORTANT cause some ATB ops cannot support bf16 on 300I series
+    dtype="float16", # IMPORTANT: Some ATB ops do not support bf16 on the 300I series.
 )
 # Generate texts from the prompts.
 outputs = llm.generate(prompts, sampling_params)
@@ -247,7 +246,7 @@ llm = LLM(
     tensor_parallel_size=2,
     max_model_len=4096,
     enforce_eager=True, # For 300I series, only eager mode is supported.
-    dtype="float16", # IMPORTANT cause some ATB ops cannot support bf16 on 300I series
+    dtype="float16", # IMPORTANT: Some ATB ops do not support bf16 on the 300I series.
 )
 # Generate texts from the prompts.
 outputs = llm.generate(prompts, sampling_params)
@@ -290,7 +289,7 @@ llm = LLM(
     tensor_parallel_size=1,
     max_model_len=4096,
     enforce_eager=True, # For 300I series, only eager mode is supported.
-    dtype="float16", # IMPORTANT cause some ATB ops cannot support bf16 on 300I series
+    dtype="float16", # IMPORTANT: Some ATB ops do not support bf16 on the 300I series.
 )
 # Generate texts from the prompts.
 outputs = llm.generate(prompts, sampling_params)
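
Reviewer note: the last three hunks repeat the same two Atlas 300I constraints (`enforce_eager=True` and `dtype="float16"`) in each `LLM(...)` call. As a rough illustration only, the shared settings could be gathered in one place. The helper name `atlas_300i_llm_kwargs` is hypothetical; it is not part of vLLM, vLLM Ascend, or this patch, and the defaults simply mirror the values shown in the tutorial snippets.

```python
def atlas_300i_llm_kwargs(tensor_parallel_size: int = 1,
                          max_model_len: int = 4096) -> dict:
    """Hypothetical helper: collect the LLM keyword arguments that the
    tutorial snippets repeat for the Atlas 300I series."""
    if tensor_parallel_size < 1:
        raise ValueError("tensor_parallel_size must be at least 1")
    return {
        "tensor_parallel_size": tensor_parallel_size,
        "max_model_len": max_model_len,
        # Per the doc note: only eager mode is supported on the 300I series.
        "enforce_eager": True,
        # Per the doc comment: some ATB ops do not support bf16 on the 300I series.
        "dtype": "float16",
    }


if __name__ == "__main__":
    # Show the settings used by the tensor_parallel_size=2 snippet.
    print(atlas_300i_llm_kwargs(tensor_parallel_size=2))
```

In a script like the tutorial's `example.py`, the result would be unpacked into the constructor along with the model argument, e.g. `LLM(model=..., **atlas_300i_llm_kwargs())`, where the model path is whatever the tutorial section already uses.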