From 129ba9fe1bc44536917318b007c1778635ee1a34 Mon Sep 17 00:00:00 2001
From: dsxsteven <36877507+dsxsteven@users.noreply.github.com>
Date: Mon, 5 Jan 2026 22:40:28 +0800
Subject: [PATCH] [BugFix] Fix Smoke Testing Bug for DSR1 longseq (#5613)

### What this PR does / why we need it?
Fix the smoke-testing bug for the DSR1 longseq case.

The daily smoke test fails with the error: "max_tokens or max_completion_tokens is too large: 32768. This model's maximum context length is 32768 tokens and your request has 128 input tokens". The error occurs because max-out-len is equal to max-model-len, so the 128 input tokens push the total request length past the context limit (128 + 32768 > 32768). We fix this by increasing the max-model-len argument in the config from 32768 to 36864.

### Does this PR introduce _any_ user-facing change?
No. Only a nightly test config is changed.

### How was this patch tested?
- vLLM version: v0.13.0
- vLLM main: https://github.com/vllm-project/vllm/commit/7157596103666ee7ccb7008acee8bff8a8ff1731

Signed-off-by: daishixun
---
 .../nightly/multi_node/config/DeepSeek-R1-W8A8-longseq.yaml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/e2e/nightly/multi_node/config/DeepSeek-R1-W8A8-longseq.yaml b/tests/e2e/nightly/multi_node/config/DeepSeek-R1-W8A8-longseq.yaml
index bc88aaaa..e6bbd7ae 100644
--- a/tests/e2e/nightly/multi_node/config/DeepSeek-R1-W8A8-longseq.yaml
+++ b/tests/e2e/nightly/multi_node/config/DeepSeek-R1-W8A8-longseq.yaml
@@ -34,7 +34,7 @@ deployment:
       --seed 1024
       --quantization ascend
       --max-num-seqs 4
-      --max-model-len 32768
+      --max-model-len 36864
       --max-num-batched-tokens 16384
       --trust-remote-code
       --gpu-memory-utilization 0.9
@@ -72,7 +72,7 @@ deployment:
       --seed 1024
       --quantization ascend
       --max-num-seqs 4
-      --max-model-len 32768
+      --max-model-len 36864
       --max-num-batched-tokens 256
       --trust-remote-code
       --gpu-memory-utilization 0.9
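
For context, the failing check boils down to simple arithmetic over the request budget: prompt tokens plus requested completion tokens must fit inside the model's context window. Below is a minimal Python sketch of that check; the function name `validate_request_length` is illustrative only, not vLLM's actual API, though the error message mirrors the one quoted above.

```python
# Minimal sketch of the context-length validation that triggered the
# smoke-test failure. Hypothetical helper, not vLLM's real implementation.

def validate_request_length(num_input_tokens: int,
                            max_tokens: int,
                            max_model_len: int) -> None:
    """Reject requests whose prompt plus completion exceeds the context window."""
    if num_input_tokens + max_tokens > max_model_len:
        raise ValueError(
            f"max_tokens or max_completion_tokens is too large: {max_tokens}. "
            f"This model's maximum context length is {max_model_len} tokens "
            f"and your request has {num_input_tokens} input tokens"
        )

# Before the fix: 128 + 32768 > 32768, so the request is rejected.
# validate_request_length(128, 32768, 32768)  # raises ValueError

# After the fix: 128 + 32768 = 32896 <= 36864, so the request passes.
validate_request_length(128, 32768, 36864)
```

This also shows why 36864 (32768 + 4096) is a comfortable choice: it leaves headroom for prompts well beyond the 128 input tokens of the current smoke test while keeping the full 32768-token completion budget.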