[Doc][v0.18.0] Fix documentation formatting and improve code examples (#8701)

### What this PR does / why we need it?
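This PR fixes documentation formatting issues and improves code examples
throughout the project, for example by using the `--max-model-len` spelling in
`vllm serve` commands and by building the curl request body with `jq`.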

Signed-off-by: MrZ20 <2609716663@qq.com>
SILONG ZENG
2026-04-28 09:01:25 +08:00
committed by GitHub
parent 9a0b786f2b
commit 2e2aaa2fae
38 changed files with 205 additions and 188 deletions

@@ -38,11 +38,11 @@ Run the vLLM server in the docker.
```{code-block} bash
:substitutions:
vllm serve Qwen/Qwen2.5-0.5B-Instruct --max_model_len 35000 &
vllm serve Qwen/Qwen2.5-0.5B-Instruct --max-model-len 35000 &
```
:::{note}
`--max_model_len` should be greater than `35000`, which suits most datasets. Otherwise, the accuracy evaluation may be affected.
`--max-model-len` should be greater than `35000`, which suits most datasets. Otherwise, the accuracy evaluation may be affected.
:::
If the vLLM server starts successfully, you will see logs like the following:
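
Besides watching the logs, readiness can also be checked from a script. The following is a minimal sketch and not part of the original docs: `wait_for_server` is a hypothetical helper, and it assumes the server listens on `localhost:8000` (the address used by the curl example in this PR) and answers `GET /v1/models` once it is up.

```shell
# Hypothetical helper: poll a URL until it answers with an HTTP success code.
# Usage: wait_for_server URL [RETRIES] [DELAY_SECONDS]
wait_for_server() {
    url=$1
    retries=${2:-30}
    delay=${3:-2}
    attempt=0
    while [ "$attempt" -lt "$retries" ]; do
        # -s: silent, -f: treat HTTP errors as failure, -o: discard the body
        if curl -sf -o /dev/null "$url"; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}

# Assumed endpoint and port; adjust to the actual deployment.
# wait_for_server "http://localhost:8000/v1/models" 30 2 && echo "server is ready"
```

The function returns 0 as soon as one request succeeds and 1 after the retries are exhausted, so it can gate the evaluation step in a larger script.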

@@ -29,7 +29,7 @@ docker run --rm \
-e VLLM_USE_MODELSCOPE=True \
-e PYTORCH_NPU_ALLOC_CONF=max_split_size_mb:256 \
-it $IMAGE \
vllm serve Qwen/Qwen2.5-7B-Instruct --max_model_len 26240
vllm serve Qwen/Qwen2.5-7B-Instruct --max-model-len 26240
```
If the vLLM server is started successfully, you can see information shown below:

@@ -32,7 +32,7 @@ docker run --rm \
-e PYTORCH_NPU_ALLOC_CONF=max_split_size_mb:256 \
-it $IMAGE \
/bin/bash
vllm serve Qwen/Qwen2.5-0.5B-Instruct --max_model_len 4096 &
vllm serve Qwen/Qwen2.5-0.5B-Instruct --max-model-len 4096 &
```
If the vLLM server starts successfully, you will see logs like the following:
@@ -48,28 +48,36 @@ INFO: Application startup complete.
You can query the result with input prompts:
```shell
PROMPT='<|im_start|>system
You are a professional accountant. Answer questions using accounting knowledge, output only the option letter (A/B/C/D).<|im_end|>
<|im_start|>user
Question: A company'"'"'s balance sheet as of December 31, 2023 shows:
Current assets: Cash and equivalents 5 million yuan, Accounts receivable 8 million yuan, Inventory 6 million yuan
Non-current assets: Net fixed assets 12 million yuan
Current liabilities: Short-term loans 4 million yuan, Accounts payable 3 million yuan
Non-current liabilities: Long-term loans 9 million yuan
Owner'"'"'s equity: Paid-in capital 10 million yuan, Retained earnings ?
Requirement: Calculate the company'"'"'s Asset-Liability Ratio and Current Ratio (round to two decimal places).
Options:
A. Asset-Liability Ratio=58.33%, Current Ratio=1.90
B. Asset-Liability Ratio=62.50%, Current Ratio=2.17
C. Asset-Liability Ratio=65.22%, Current Ratio=1.75
D. Asset-Liability Ratio=68.00%, Current Ratio=2.50<|im_end|>
<|im_start|>assistant
'
curl http://localhost:8000/v1/completions \
-H "Content-Type: application/json" \
-d '{
"model": "Qwen/Qwen2.5-0.5B-Instruct",
"prompt": "'"<|im_start|>system\nYou are a professional accountant. Answer questions using accounting knowledge, output only the option letter (A/B/C/D).<|im_end|>\n"\
"<|im_start|>user\nQuestion: A company's balance sheet as of December 31, 2023 shows:\n"\
" Current assets: Cash and equivalents 5 million yuan, Accounts receivable 8 million yuan, Inventory 6 million yuan\n"\
" Non-current assets: Net fixed assets 12 million yuan\n"\
" Current liabilities: Short-term loans 4 million yuan, Accounts payable 3 million yuan\n"\
" Non-current liabilities: Long-term loans 9 million yuan\n"\
" Owner's equity: Paid-in capital 10 million yuan, Retained earnings ?\n"\
"Requirement: Calculate the company's Asset-Liability Ratio and Current Ratio (round to two decimal places).\n"\
"Options:\n"\
"A. Asset-Liability Ratio=58.33%, Current Ratio=1.90\n"\
"B. Asset-Liability Ratio=62.50%, Current Ratio=2.17\n"\
"C. Asset-Liability Ratio=65.22%, Current Ratio=1.75\n"\
"D. Asset-Liability Ratio=68.00%, Current Ratio=2.50<|im_end|>\n"\
"<|im_start|>assistant\n"'",
"max_completion_tokens": 1,
"temperature": 0,
"stop": ["<|im_end|>"]
}' | python3 -m json.tool
-d "$(jq -n \
--arg model "Qwen/Qwen2.5-0.5B-Instruct" \
--arg prompt "$PROMPT" \
'{
model: $model,
prompt: $prompt,
max_completion_tokens: 1,
temperature: 0,
stop: ["<|im_end|>"]
}')" | python3 -m json.tool
```
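
The `'"'"'` sequences in `PROMPT` are how a single-quoted shell string carries an apostrophe: close the single quote, add a double-quoted `'`, then reopen the single quote. A minimal sketch with a shortened, hypothetical prompt (not the one used in the docs) shows the resulting value. `jq --arg` then takes care of JSON-escaping that value when the request body is built.

```shell
# Hypothetical shortened prompt, for illustration only.
# '"'"' ends the single-quoted string, appends a literal ', and reopens it.
PROMPT='Question: A company'"'"'s balance sheet
Options: A/B/C/D'

# The variable now holds a literal apostrophe and a real newline.
printf '%s\n' "$PROMPT"
```

Keeping the prompt in a variable this way avoids hand-writing `\n` escapes and nested quotes inside the JSON payload.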
The output format matches the following:

@@ -29,7 +29,7 @@ docker run --rm \
-e VLLM_USE_MODELSCOPE=True \
-e PYTORCH_NPU_ALLOC_CONF=max_split_size_mb:256 \
-it $IMAGE \
vllm serve Qwen/Qwen2.5-7B-Instruct --max_model_len 26240
vllm serve Qwen/Qwen2.5-7B-Instruct --max-model-len 26240
```
If the vLLM server starts successfully, you will see information like the following: