From 1e116829ac127ec3856c82fb4c9b9b7af5c2b7cc Mon Sep 17 00:00:00 2001
From: Angazenn <92204292+Angazenn@users.noreply.github.com>
Date: Fri, 23 Jan 2026 17:11:10 +0800
Subject: [PATCH] [doc]update --max-num-seqs in Qwen3-235b tutorial (#6197)

### What this PR does / why we need it?

This PR updates `--max-num-seqs` in the Qwen3-235B single-node deployment tutorial so that the example runs in graph mode correctly.

- vLLM version: v0.14.0
- vLLM main: https://github.com/vllm-project/vllm/commit/d68209402ddab3f54a09bc1f4de9a9495a283b60

Signed-off-by: Angazenn
---
 docs/source/tutorials/Qwen3-235B-A22B.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/tutorials/Qwen3-235B-A22B.md b/docs/source/tutorials/Qwen3-235B-A22B.md
index 64ff19b8..85181437 100644
--- a/docs/source/tutorials/Qwen3-235B-A22B.md
+++ b/docs/source/tutorials/Qwen3-235B-A22B.md
@@ -112,7 +112,7 @@ vllm serve vllm-ascend/Qwen3-235B-A22B-w8a8 \
 --seed 1024 \
 --quantization ascend \
 --served-model-name qwen3 \
---max-num-seqs 4 \
+--max-num-seqs 32 \
 --max-model-len 133000 \
 --max-num-batched-tokens 8096 \
 --enable-expert-parallel \
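For reference, after this patch the affected portion of the tutorial's serve command reads as below. This is only an excerpt reconstructed from the hunk's context lines; the tutorial's command contains additional flags before and after this region (indicated by the trailing continuation), which are omitted here since they are not part of the diff:

```shell
# Excerpt of the single-node deployment command from the tutorial,
# showing the updated --max-num-seqs value. Flags outside the hunk's
# context are elided; see the tutorial for the full command.
vllm serve vllm-ascend/Qwen3-235B-A22B-w8a8 \
  --seed 1024 \
  --quantization ascend \
  --served-model-name qwen3 \
  --max-num-seqs 32 \
  --max-model-len 133000 \
  --max-num-batched-tokens 8096 \
  --enable-expert-parallel \
  # ...remaining flags as given in the tutorial
```

Per the PR description, raising `--max-num-seqs` from 4 to 32 is what allows the served model to enter graph mode correctly; the other flags are unchanged.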