[doc]update --max-num-seqs in Qwen3-235b tutorial (#6197)

### What this PR does / why we need it?

This PR updates --max-num-seqs in the Qwen3-235B single-node deployment tutorial so that the example command runs in graph mode correctly.

- vLLM version: v0.14.0
- vLLM main: d68209402d

Signed-off-by: Angazenn <supperccell@163.com>
2026-01-23 17:11:10 +08:00
parent af4dbb6b26
commit 1e116829ac
1 changed file with 1 addition and 1 deletion
--- a/docs/source/tutorials/Qwen3-235B-A22B.md
+++ b/docs/source/tutorials/Qwen3-235B-A22B.md
@@ -112,7 +112,7 @@ vllm serve vllm-ascend/Qwen3-235B-A22B-w8a8 \
 --seed 1024 \
 --quantization ascend \
 --served-model-name qwen3 \
---max-num-seqs 4 \
+--max-num-seqs 32 \
 --max-model-len 133000 \
 --max-num-batched-tokens 8096 \
 --enable-expert-parallel \
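
A local launch script copied from the tutorial before this change may still carry the old value. The following is a minimal sketch for checking which value a given argument list passes for --max-num-seqs. It assumes the arguments are held in a bash array; the array contains only the flags visible in this hunk (the full tutorial command has more options), and `get_flag` is a hypothetical helper written for this illustration, not part of vLLM.

```shell
#!/usr/bin/env bash
# Flags visible in this diff hunk only; the full tutorial command has more options.
args=(--seed 1024 --quantization ascend --served-model-name qwen3
      --max-num-seqs 32 --max-model-len 133000
      --max-num-batched-tokens 8096 --enable-expert-parallel)

# get_flag (hypothetical helper): print the value that follows the named flag.
# Returns non-zero if the flag is not present in the argument list.
get_flag() {
  local name=$1
  shift
  while [ "$#" -gt 0 ]; do
    if [ "$1" = "$name" ]; then
      printf '%s\n' "$2"
      return 0
    fi
    shift
  done
  return 1
}

max_num_seqs=$(get_flag --max-num-seqs "${args[@]}")
echo "max-num-seqs=${max_num_seqs}"
```

With the updated tutorial flags this prints `max-num-seqs=32`; a script still based on the previous revision would report 4 instead.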