[Doc][v0.18.0] Fix documentation formatting and improve code examples (#8701)

### What this PR does / why we need it?
This PR fixes various documentation issues and improves code examples
throughout the project.

Signed-off-by: MrZ20 <2609716663@qq.com>
Author: SILONG ZENG
Date: 2026-04-28 09:01:25 +08:00
Committed by: GitHub
Parent: 9a0b786f2b
Commit: 2e2aaa2fae
38 changed files with 205 additions and 188 deletions


@@ -72,7 +72,7 @@ Run the following script to start the vLLM server on single 910B4:
```shell
#!/bin/sh
-export VLLM_USE_MODELSCOPE=true
+export VLLM_USE_MODELSCOPE=True
export MODEL_PATH="PaddlePaddle/PaddleOCR-VL"
export TASK_QUEUE_ENABLE=1
export CPU_AFFINITY_CONF=1
@@ -97,11 +97,11 @@ Run the following script to start the vLLM server on single Atlas 300 inference
```shell
#!/bin/sh
-export VLLM_USE_MODELSCOPE=true
+export VLLM_USE_MODELSCOPE=True
export MODEL_PATH="PaddlePaddle/PaddleOCR-VL"
vllm serve ${MODEL_PATH} \
-    --max_model_len 16384 \
+    --max-model-len 16384 \
--served-model-name PaddleOCR-VL-0.9B \
--trust-remote-code \
--no-enable-prefix-caching \
@@ -112,7 +112,7 @@ vllm serve ${MODEL_PATH} \
```
:::{note}
-The `--max_model_len` option is added to prevent errors when generating the attention operator mask on the Atlas 300 inference products.
+The `--max-model-len` option is added to prevent errors when generating the attention operator mask on the Atlas 300 inference products.
:::
::::
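
Once the server is running, it exposes the standard OpenAI-compatible API. A minimal sketch of a chat-completions request payload for the model served above — the model name matches `--served-model-name` from the script, while the endpoint port, image URL, and prompt text are hypothetical placeholders:

```python
import json

# OpenAI-compatible chat request for the served PaddleOCR-VL model.
# "PaddleOCR-VL-0.9B" is the --served-model-name from the script above;
# the image URL and prompt are placeholders for illustration only.
payload = {
    "model": "PaddleOCR-VL-0.9B",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/sample-page.png"},
                },
                {"type": "text", "text": "Recognize the text in this image."},
            ],
        }
    ],
    "max_tokens": 1024,
}

# POST this body to the server, e.g. (assuming the default port 8000):
#   curl -s http://localhost:8000/v1/chat/completions \
#        -H "Content-Type: application/json" \
#        -d "$(python build_payload.py)"
print(json.dumps(payload, indent=2))
```

Note that because the Atlas 300 script caps the context at `--max-model-len 16384`, the prompt tokens (including the encoded image) plus `max_tokens` must stay within that limit.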