[BugFix] Fix Smoke Testing Bug for DSR1 longseq (#5613)
### What this PR does / why we need it?
Fix Smoke Testing Bug for DSR1 longseq
We need this change because the daily smoke test case fails with the error:
"max_tokens or max_completion_tokens is too large: 32768. This model's
maximum context length is 32768 tokens and your request has 128 input
tokens". The error occurs because max-out-len equals max-model-len, leaving
no room in the context window for the input tokens. We fix it by increasing
the max-model-len argument in the script.
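The arithmetic behind the failure can be sketched as follows. This is an illustrative check only, not vLLM's actual internal code; the function name and error string are modeled on the message quoted above.

```python
# Illustrative sketch of the context-length check behind the smoke-test
# error. The function name is hypothetical; the error text mirrors the
# message reported by the failing test.

def validate_request(max_model_len: int, prompt_tokens: int,
                     max_out_tokens: int):
    """Return an error message if the request cannot fit in the context
    window, or None if it fits."""
    if prompt_tokens + max_out_tokens > max_model_len:
        return (
            f"max_tokens or max_completion_tokens is too large: "
            f"{max_out_tokens}. This model's maximum context length is "
            f"{max_model_len} tokens and your request has "
            f"{prompt_tokens} input tokens"
        )
    return None

# Before the fix: max-out-len (32768) equals max-model-len (32768),
# so any non-empty prompt overflows the context window.
assert validate_request(32768, 128, 32768) is not None

# After the fix: max-model-len 36864 leaves 4096 tokens of headroom,
# enough for the 128-token prompt plus 32768 output tokens.
assert validate_request(36864, 128, 32768) is None
```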
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.13.0
- vLLM main:
7157596103
Signed-off-by: daishixun <dsxsteven@sina.com>
@@ -34,7 +34,7 @@ deployment:
     --seed 1024
     --quantization ascend
     --max-num-seqs 4
-    --max-model-len 32768
+    --max-model-len 36864
     --max-num-batched-tokens 16384
     --trust-remote-code
     --gpu-memory-utilization 0.9
@@ -72,7 +72,7 @@ deployment:
     --seed 1024
     --quantization ascend
     --max-num-seqs 4
-    --max-model-len 32768
+    --max-model-len 36864
     --max-num-batched-tokens 256
     --trust-remote-code
     --gpu-memory-utilization 0.9