[BugFix] Fix Smoke Testing Bug for DSR1 longseq (#5613)
### What this PR does / why we need it?
Fix Smoke Testing Bug for DSR1 longseq
We need this change because the daily smoke test case fails with the error:
"max_tokens or max_completion_tokens is too large: 32768. This model's
maximum context length is 32768 tokens and your request has 128 input
tokens". The error occurs because max-out-len equals max-model-len, leaving
no room in the context window for the input tokens. We fix it by increasing
the max-model-len argument in the script.
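The arithmetic behind the failure can be sketched as follows. This is an illustrative check only, not vLLM's actual internal code; the function name and error string are modeled on the message quoted above.

```python
# Illustrative sketch of the context-length check behind the smoke-test
# error. The function name is hypothetical; the error text mirrors the
# message reported by the failing test.

def validate_request(max_model_len: int, prompt_tokens: int,
                     max_out_tokens: int):
    """Return an error message if the request cannot fit in the context
    window, or None if it fits."""
    if prompt_tokens + max_out_tokens > max_model_len:
        return (
            f"max_tokens or max_completion_tokens is too large: "
            f"{max_out_tokens}. This model's maximum context length is "
            f"{max_model_len} tokens and your request has "
            f"{prompt_tokens} input tokens"
        )
    return None

# Before the fix: max-out-len (32768) equals max-model-len (32768),
# so any non-empty prompt overflows the context window.
assert validate_request(32768, 128, 32768) is not None

# After the fix: max-model-len 36864 leaves 4096 tokens of headroom,
# enough for the 128-token prompt plus 32768 output tokens.
assert validate_request(36864, 128, 32768) is None
```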
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.13.0
- vLLM main:
7157596103
Signed-off-by: daishixun <dsxsteven@sina.com>
@@ -34,7 +34,7 @@ deployment:
     --seed 1024
     --quantization ascend
     --max-num-seqs 4
-    --max-model-len 32768
+    --max-model-len 36864
     --max-num-batched-tokens 16384
     --trust-remote-code
     --gpu-memory-utilization 0.9
@@ -72,7 +72,7 @@ deployment:
     --seed 1024
     --quantization ascend
     --max-num-seqs 4
-    --max-model-len 32768
+    --max-model-len 36864
     --max-num-batched-tokens 256
     --trust-remote-code
     --gpu-memory-utilization 0.9