[Doc][v0.18.0] Fix documentation formatting and improve code examples (#8701)

### What this PR does / why we need it?
This PR fixes various documentation issues and improves code examples
throughout the project.

Signed-off-by: MrZ20 <2609716663@qq.com>
SILONG ZENG
2026-04-28 09:01:25 +08:00
committed by GitHub
parent 9a0b786f2b
commit 2e2aaa2fae
38 changed files with 205 additions and 188 deletions

@@ -54,25 +54,25 @@ To enable Netloader, pass `--load-format=netloader` and provide configuration vi
 ### Server
 ```shell
-VLLM_SLEEP_WHEN_IDLE=1 vllm serve `<model_file>` \
+VLLM_SLEEP_WHEN_IDLE=1 vllm serve <model_file> \
     --tensor-parallel-size 1 \
-    --served-model-name `<model_name>` \
+    --served-model-name <model_name> \
     --enforce-eager \
-    --port `<port>` \
+    --port <port> \
     --load-format netloader
 ```
 ### Client
 ```shell
-export NETLOADER_CONFIG='{"SOURCE":[{"device_id":0, "sources": ["`<server_IP>`:`<server_Port>`"]}]}'
+export NETLOADER_CONFIG='{"SOURCE":[{"device_id":0, "sources": ["<server_IP>:<server_Port>"]}]}'
-VLLM_SLEEP_WHEN_IDLE=1 ASCEND_RT_VISIBLE_DEVICES=`<device_id_diff_from_server>` \
-vllm serve `<model_file>` \
+VLLM_SLEEP_WHEN_IDLE=1 ASCEND_RT_VISIBLE_DEVICES=<device_id_diff_from_server> \
+vllm serve <model_file> \
     --tensor-parallel-size 1 \
-    --served-model-name `<model_name>` \
+    --served-model-name <model_name> \
     --enforce-eager \
-    --port `<client_port>` \
+    --port <client_port> \
     --load-format netloader \
     --model-loader-extra-config="${NETLOADER_CONFIG}"
 ```
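
The backticks removed in this hunk matter for anyone copying the commands. Outside quotes, a shell treats backticks as command substitution. Inside the single-quoted `NETLOADER_CONFIG` value they would be kept literally and end up in the source address. As a minimal sketch of filling in the client-side placeholders, the snippet below builds the JSON from shell variables and prints it for inspection. `SERVER_IP`, `SERVER_PORT`, and `DEVICE_ID` are hypothetical names and values chosen for illustration; they are not taken from the project.

```shell
#!/usr/bin/env bash
# Hypothetical values for illustration only; substitute your own deployment's.
SERVER_IP="192.0.2.10"   # assumed server address (documentation-range IP)
SERVER_PORT="8000"       # assumed value for the <server_Port> placeholder
DEVICE_ID=0              # matches the "device_id":0 used in the example config

# Build the JSON with printf so the placeholders are substituted
# without any stray backticks or quoting characters in the result.
NETLOADER_CONFIG=$(printf '{"SOURCE":[{"device_id":%d, "sources": ["%s:%s"]}]}' \
    "$DEVICE_ID" "$SERVER_IP" "$SERVER_PORT")
export NETLOADER_CONFIG

# Print the value for inspection before handing it to the client command.
echo "$NETLOADER_CONFIG"
```

The exported variable can then be passed as `--model-loader-extra-config="${NETLOADER_CONFIG}"`, as in the Client command above.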