[Doc][Misc] Comprehensive documentation cleanup and grammatical fixes (#8073)
What this PR does / why we need it?

This pull request performs a comprehensive cleanup of the vLLM Ascend documentation. It fixes numerous typos, grammatical errors, and phrasing issues across community guidelines, developer documents, hardware tutorials, and feature guides. Key improvements include correcting hardware names (e.g., Atlas 300I), fixing broken links, cleaning up code examples (removing duplicate flags and trailing commas), and improving the clarity of technical explanations. These changes are necessary to ensure the documentation is professional, accurate, and easy for users to follow.

Does this PR introduce any user-facing change?

No, this PR contains documentation-only updates.

How was this patch tested?

The changes were manually reviewed for accuracy and grammatical correctness. No functional code changes were introduced.

---------

Signed-off-by: herizhen <1270637059@qq.com>
Signed-off-by: herizhen <59841270+herizhen@users.noreply.github.com>
@@ -668,31 +668,31 @@ Once the preparation is done, you can start the server with the following comman

 1. Prefill node 0

-```shell
-# change ip to your own
-python launch_online_dp.py --dp-size 2 --tp-size 8 --dp-size-local 2 --dp-rank-start 0 --dp-address $node_p0_ip --dp-rpc-port 12880 --vllm-start-port 9300
-```
+```shell
+# change ip to your own
+python launch_online_dp.py --dp-size 2 --tp-size 8 --dp-size-local 2 --dp-rank-start 0 --dp-address $node_p0_ip --dp-rpc-port 12880 --vllm-start-port 9300
+```

 2. Prefill node 1

-```shell
-# change ip to your own
-python launch_online_dp.py --dp-size 2 --tp-size 8 --dp-size-local 2 --dp-rank-start 0 --dp-address $node_p1_ip --dp-rpc-port 12880 --vllm-start-port 9300
-```
+```shell
+# change ip to your own
+python launch_online_dp.py --dp-size 2 --tp-size 8 --dp-size-local 2 --dp-rank-start 0 --dp-address $node_p1_ip --dp-rpc-port 12880 --vllm-start-port 9300
+```

 3. Decode node 0

-```shell
-# change ip to your own
-python launch_online_dp.py --dp-size 8 --tp-size 4 --dp-size-local 4 --dp-rank-start 0 --dp-address $node_d0_ip --dp-rpc-port 12778 --vllm-start-port 9300
-```
+```shell
+# change ip to your own
+python launch_online_dp.py --dp-size 8 --tp-size 4 --dp-size-local 4 --dp-rank-start 0 --dp-address $node_d0_ip --dp-rpc-port 12778 --vllm-start-port 9300
+```

 4. Decode node 1

-```shell
-# change ip to your own
-python launch_online_dp.py --dp-size 8 --tp-size 4 --dp-size-local 4 --dp-rank-start 4 --dp-address $node_d0_ip --dp-rpc-port 12778 --vllm-start-port 9300
-```
+```shell
+# change ip to your own
+python launch_online_dp.py --dp-size 8 --tp-size 4 --dp-size-local 4 --dp-rank-start 4 --dp-address $node_d0_ip --dp-rpc-port 12778 --vllm-start-port 9300
+```

 ### Request Forwarding

@@ -722,7 +722,7 @@ python load_balance_proxy_server_example.py \
 $node_d1_ip \
 --decoder-ports \
 9300 9301 9302 9303 \
-9300 9301 9302 9303 \
+9300 9301 9302 9303
 ```

 ## Functional Verification
@@ -763,7 +763,7 @@ Here are two accuracy evaluation methods.

 ### Using Language Model Evaluation Harness

-Not test yet.
+Not tested yet.

 ## Performance

@@ -775,7 +775,7 @@ Refer to [Using AISBench for performance evaluation](../../developer_guide/evalu

 Run performance evaluation of `GLM-4.x` as an example.

-Refer to [vllm benchmark](https://docs.vllm.ai/en/latest/contributing/benchmarks.html) for more details.
+Refer to [vllm benchmark](https://docs.vllm.ai/en/latest/benchmarking/) for more details.

 There are three `vllm bench` subcommands:

@@ -802,7 +802,7 @@ vllm bench serve \
 --port 8000 \
 --endpoint /v1/completions \
 --max-concurrency 1 \
---request-rate 1 \
+--request-rate 1
 ```

 After about several minutes, you can get the performance evaluation result.
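
For readers following the launch commands in the first hunk, the relationship between `--dp-rank-start`, `--dp-size-local`, and `--vllm-start-port` can be sketched as below. This is an illustrative sketch only: it assumes that `launch_online_dp.py` gives each local DP rank one consecutive port starting at `--vllm-start-port`, which is inferred from the `--decoder-ports 9300 9301 9302 9303` list in the proxy command and is not confirmed by this diff. The helper names `global_ranks` and `local_ports` are hypothetical.

```python
def global_ranks(dp_rank_start: int, dp_size_local: int) -> list[int]:
    """Global DP ranks hosted on one node (hypothetical helper)."""
    return list(range(dp_rank_start, dp_rank_start + dp_size_local))


def local_ports(vllm_start_port: int, dp_size_local: int) -> list[int]:
    """Serving ports on one node, ASSUMING one consecutive port per local DP rank."""
    return [vllm_start_port + i for i in range(dp_size_local)]


# Values taken from the decode-node commands in the diff:
# --dp-size 8 --dp-size-local 4 --vllm-start-port 9300
decode_nodes = {"decode node 0": 0, "decode node 1": 4}  # name -> --dp-rank-start

for name, rank_start in decode_nodes.items():
    ranks = global_ranks(rank_start, 4)
    ports = local_ports(9300, 4)
    print(f"{name}: ranks {ranks}, ports {ports}")
```

Under that assumption, the two decode nodes together cover all eight DP ranks, and each node exposes the same four ports, which would explain why the proxy command lists the same port range once per decode node.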