[Doc] Upgrade some outdated doc (#5062)

### What this PR does / why we need it?

Update outdated docs so that the documented commands run as written.

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-12-16 11:48:19 +08:00
parent bb3a826e08
commit a63ef031af
3 changed files with 13 additions and 7 deletions
--- a/docs/source/tutorials/Qwen3-8B-W4A8.md
+++ b/docs/source/tutorials/Qwen3-8B-W4A8.md
@@ -90,7 +90,9 @@ The converted model files look like:
 Run the following script to start the vLLM server with the quantized model:

 ```bash
-vllm serve /home/models/Qwen3-8B-w4a8 --served-model-name "qwen3-8b-w4a8" --max-model-len 4096 --quantization ascend
+export VLLM_USE_MODELSCOPE=true
+export MODEL_PATH=vllm-ascend/Qwen3-8B-W4A8
+vllm serve ${MODEL_PATH} --served-model-name "qwen3-8b-w4a8" --max-model-len 4096 --quantization ascend
 ```

 Once your server is started, you can query the model with input prompts.
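
The query step mentioned in the last context line could look roughly like the sketch below. It assumes the server started by the `vllm serve` command in this hunk is reachable on vLLM's default `localhost:8000` and serves the OpenAI-compatible `/v1/completions` route. `BASE_URL`, `build_payload`, and `query_model` are illustrative names and are not part of the tutorial or this PR.

```bash
# Assumption: the server runs locally on vLLM's default port 8000.
# Override BASE_URL if it is deployed elsewhere.
BASE_URL="${BASE_URL:-http://localhost:8000}"

# Must match the value passed to --served-model-name.
MODEL_NAME="qwen3-8b-w4a8"

# Build the JSON body for a completion request.
# $1: prompt text (must not contain double quotes or backslashes,
#     since this sketch does no JSON escaping)
# $2: optional max_tokens, defaults to 64
build_payload() {
  printf '{"model": "%s", "prompt": "%s", "max_tokens": %d, "temperature": 0}' \
    "$MODEL_NAME" "$1" "${2:-64}"
}

# Send the prompt to the running server and print the raw JSON response.
query_model() {
  curl -sS "${BASE_URL}/v1/completions" \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$1" "${2:-64}")"
}
```

With the server up, calling `query_model "What is quantization?" 32` should print a JSON completion response. Keeping the payload construction in its own function makes it easy to inspect the request body without a running server.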