[BugFix] Fix Smoke Testing Bug for DSR1 longseq (#5613)

### What this PR does / why we need it?
Fix Smoke Testing Bug for DSR1 longseq
We need this change because the daily smoke test case fails with the error:
"max_tokens or max_completion_tokens is too large: 32768. This model's
maximum context length is 32768 tokens and your request has 128 input
tokens". The error occurs because the requested output length (max-out-len)
equals max-model-len, so the 128 prompt tokens push the request past the
context limit. We fix it by increasing the max-model-len argument in the script.
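
As a rough illustration of the constraint involved (the function and variable names below are invented for explanation and are not taken from vLLM's code):

```python
# Illustration of the context-length check the smoke test was failing.
# prompt_tokens, max_out_len, and max_model_len are descriptive names,
# not identifiers from vLLM.

def fits_in_context(prompt_tokens: int, max_out_len: int, max_model_len: int) -> bool:
    """A request is only served if prompt plus requested output fit the context window."""
    return prompt_tokens + max_out_len <= max_model_len

# Before the fix: 128 prompt tokens + 32768 output tokens > 32768 -> request rejected.
print(fits_in_context(128, 32768, 32768))  # False
# After the fix: 128 + 32768 = 32896 <= 36864 -> request accepted.
print(fits_in_context(128, 32768, 36864))  # True
```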
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?


- vLLM version: v0.13.0
- vLLM main: 7157596103

Signed-off-by: daishixun <dsxsteven@sina.com>
dsxsteven committed 2026-01-05 22:40:28 +08:00 (via GitHub)
parent 8eae949d11
commit 129ba9fe1b


@@ -34,7 +34,7 @@ deployment:
--seed 1024
--quantization ascend
--max-num-seqs 4
- --max-model-len 32768
+ --max-model-len 36864
--max-num-batched-tokens 16384
--trust-remote-code
--gpu-memory-utilization 0.9
@@ -72,7 +72,7 @@ deployment:
--seed 1024
--quantization ascend
--max-num-seqs 4
- --max-model-len 32768
+ --max-model-len 36864
--max-num-batched-tokens 256
--trust-remote-code
--gpu-memory-utilization 0.9
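
For reference, a minimal client-side sketch of the kind of long-output request that now fits within the enlarged context window; the base URL, model name, and prompt are placeholders, not the actual smoke-test values:

```python
from openai import OpenAI

# Placeholder endpoint and model name; the smoke test uses its own deployment values.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# With --max-model-len 36864, a ~128-token prompt plus max_tokens=32768
# (32896 tokens total) now fits inside the model's context length.
response = client.completions.create(
    model="DeepSeek-R1",
    prompt="Write a long essay about large language model inference.",
    max_tokens=32768,
)
print(response.choices[0].text)
```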