[Doc][v0.18.0] Fix documentation formatting and improve code examples (#8701)

### What this PR does / why we need it?
This PR fixes various documentation issues and improves code examples
throughout the project.

Signed-off-by: MrZ20 <2609716663@qq.com>
This commit is contained in:
SILONG ZENG
2026-04-28 09:01:25 +08:00
committed by GitHub
parent 9a0b786f2b
commit 2e2aaa2fae
38 changed files with 205 additions and 188 deletions

View File

@@ -158,6 +158,12 @@ Scheduling optimization:
:substitutions:
# Optimize operator delivery queue. This will affect the memory peak value, and may degrade if the memory is tight.
export TASK_QUEUE_ENABLE=2
```
or
```{code-block} bash
:substitutions:
# This will greatly improve the CPU bottleneck model and ensure the same performance for the NPU bottleneck model.
export CPU_AFFINITY_CONF=1

View File

@@ -223,7 +223,7 @@ vllm serve Qwen/Qwen3-Embedding-8B --trust-remote-code
```shell
# download dataset
# wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json
export VLLM_USE_MODELSCOPE=true
export VLLM_USE_MODELSCOPE=True
vllm bench serve \
--model Qwen/Qwen3-Embedding-8B \
--backend openai-embeddings \