[Doc][v0.18.0] Fix documentation formatting and improve code examples (#8701)

### What this PR does / why we need it?

This PR fixes various documentation issues and improves code examples throughout the project.

Signed-off-by: MrZ20 <2609716663@qq.com>
2026-04-28 09:01:25 +08:00
parent 9a0b786f2b
commit 2e2aaa2fae
38 changed files with 205 additions and 188 deletions
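
Reviewer note: the `using_lm_eval.md` hunk in this diff replaces a hand-escaped JSON string with a payload assembled by `jq --arg`, so quotes and newlines in the prompt are escaped by the tool rather than by hand. The same idea can be sketched in Python with the standard `json` module. This is only an illustration, not part of the PR: the helper name `build_completion_payload` and the shortened prompt are hypothetical, and the request fields simply mirror those shown in the diff.

```python
import json


def build_completion_payload(model: str, prompt: str) -> str:
    """Serialize a completions request body (hypothetical helper).

    json.dumps escapes quotes and newlines in the prompt, which is the
    job `jq --arg` does in the shell version shown in the diff.
    """
    body = {
        "model": model,
        "prompt": prompt,
        "max_completion_tokens": 1,  # field names copied from the diff
        "temperature": 0,
        "stop": ["<|im_end|>"],
    }
    return json.dumps(body)


# Shortened illustrative prompt; it contains an apostrophe and newlines,
# the characters that made the hand-escaped shell string fragile.
prompt = (
    "<|im_start|>user\n"
    "Owner's equity: Paid-in capital 10 million yuan<|im_end|>\n"
    "<|im_start|>assistant\n"
)

payload = build_completion_payload("Qwen/Qwen2.5-0.5B-Instruct", prompt)

# Round trip: decoding the payload recovers the prompt unchanged.
decoded = json.loads(payload)
print(decoded["prompt"] == prompt)
```

The resulting string could be passed to `curl -d` or to any HTTP client; the point is that the prompt stays readable as plain text while the serializer handles the escaping.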
--- a/docs/source/developer_guide/evaluation/using_ais_bench.md
+++ b/docs/source/developer_guide/evaluation/using_ais_bench.md
@@ -38,11 +38,11 @@ Run the vLLM server in the docker.

 ```{code-block} bash
   :substitutions:
-vllm serve Qwen/Qwen2.5-0.5B-Instruct --max_model_len 35000 &
+vllm serve Qwen/Qwen2.5-0.5B-Instruct --max-model-len 35000 &
 ```

 :::{note}
-`--max_model_len` should be greater than `35000`, this will be suitable for most datasets. Otherwise the accuracy evaluation may be affected.
+`--max-model-len` should be greater than `35000`, this will be suitable for most datasets. Otherwise the accuracy evaluation may be affected.
 :::

 The vLLM server is started successfully, if you see logs as below:
--- a/docs/source/developer_guide/evaluation/using_evalscope.md
+++ b/docs/source/developer_guide/evaluation/using_evalscope.md
@@ -29,7 +29,7 @@ docker run --rm \
 -e VLLM_USE_MODELSCOPE=True \
 -e PYTORCH_NPU_ALLOC_CONF=max_split_size_mb:256 \
 -it $IMAGE \
-vllm serve Qwen/Qwen2.5-7B-Instruct --max_model_len 26240
+vllm serve Qwen/Qwen2.5-7B-Instruct --max-model-len 26240
 ```

 If the vLLM server is started successfully, you can see information shown below:
--- a/docs/source/developer_guide/evaluation/using_lm_eval.md
+++ b/docs/source/developer_guide/evaluation/using_lm_eval.md
@@ -32,7 +32,7 @@ docker run --rm \
 -e PYTORCH_NPU_ALLOC_CONF=max_split_size_mb:256 \
 -it $IMAGE \
 /bin/bash
-vllm serve Qwen/Qwen2.5-0.5B-Instruct --max_model_len 4096 &
+vllm serve Qwen/Qwen2.5-0.5B-Instruct --max-model-len 4096 &
 ```

 The vLLM server is started successfully, if you see logs as below:
@@ -48,28 +48,36 @@ INFO:     Application startup complete.
 You can query the result with input prompts:

 ```shell
+PROMPT='<|im_start|>system
+You are a professional accountant. Answer questions using accounting knowledge, output only the option letter (A/B/C/D).<|im_end|>
+<|im_start|>user
+Question: A company'"'"'s balance sheet as of December 31, 2023 shows:
+  Current assets: Cash and equivalents 5 million yuan, Accounts receivable 8 million yuan, Inventory 6 million yuan
+  Non-current assets: Net fixed assets 12 million yuan
+  Current liabilities: Short-term loans 4 million yuan, Accounts payable 3 million yuan
+  Non-current liabilities: Long-term loans 9 million yuan
+  Owner'"'"'s equity: Paid-in capital 10 million yuan, Retained earnings ?
+Requirement: Calculate the company'"'"'s Asset-Liability Ratio and Current Ratio (round to two decimal places).
+Options:
+A. Asset-Liability Ratio=58.33%, Current Ratio=1.90
+B. Asset-Liability Ratio=62.50%, Current Ratio=2.17
+C. Asset-Liability Ratio=65.22%, Current Ratio=1.75
+D. Asset-Liability Ratio=68.00%, Current Ratio=2.50<|im_end|>
+<|im_start|>assistant
+'
+
 curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
-    -d '{
-        "model": "Qwen/Qwen2.5-0.5B-Instruct",
-        "prompt": "'"<|im_start|>system\nYou are a professional accountant. Answer questions using accounting knowledge, output only the option letter (A/B/C/D).<|im_end|>\n"\
-"<|im_start|>user\nQuestion: A company's balance sheet as of December 31, 2023 shows:\n"\
-"  Current assets: Cash and equivalents 5 million yuan, Accounts receivable 8 million yuan, Inventory 6 million yuan\n"\
-"  Non-current assets: Net fixed assets 12 million yuan\n"\
-"  Current liabilities: Short-term loans 4 million yuan, Accounts payable 3 million yuan\n"\
-"  Non-current liabilities: Long-term loans 9 million yuan\n"\
-"  Owner's equity: Paid-in capital 10 million yuan, Retained earnings ?\n"\
-"Requirement: Calculate the company's Asset-Liability Ratio and Current Ratio (round to two decimal places).\n"\
-"Options:\n"\
-"A. Asset-Liability Ratio=58.33%, Current Ratio=1.90\n"\
-"B. Asset-Liability Ratio=62.50%, Current Ratio=2.17\n"\
-"C. Asset-Liability Ratio=65.22%, Current Ratio=1.75\n"\
-"D. Asset-Liability Ratio=68.00%, Current Ratio=2.50<|im_end|>\n"\
-"<|im_start|>assistant\n"'",
-        "max_completion_tokens": 1,
-        "temperature": 0,
-        "stop": ["<|im_end|>"]
-    }' | python3 -m json.tool
+    -d "$(jq -n \
+        --arg model "Qwen/Qwen2.5-0.5B-Instruct" \
+        --arg prompt "$PROMPT" \
+        '{
+            model: $model,
+            prompt: $prompt,
+            max_completion_tokens: 1,
+            temperature: 0,
+            stop: ["<|im_end|>"]
+        }')" | python3 -m json.tool
 ```

 The output format matches the following:
--- a/docs/source/developer_guide/evaluation/using_opencompass.md
+++ b/docs/source/developer_guide/evaluation/using_opencompass.md
@@ -29,7 +29,7 @@ docker run --rm \
 -e VLLM_USE_MODELSCOPE=True \
 -e PYTORCH_NPU_ALLOC_CONF=max_split_size_mb:256 \
 -it $IMAGE \
-vllm serve Qwen/Qwen2.5-7B-Instruct --max_model_len 26240
+vllm serve Qwen/Qwen2.5-7B-Instruct --max-model-len 26240
 ```

 The vLLM server is started successfully, if you see information as below: