[CI] Use offline mode for nightly test (#5187)

### What this PR does / why we need it?
For the single-node test, the lack of a retry mechanism when accessing
ModelScope sometimes resulted in HTTP 400 errors. I recommend using a
local offline cache instead.
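The idea behind offline mode is that model loaders consult an environment flag and, when it is set, resolve weights from the local cache instead of the network. A minimal sketch of that gating logic (`use_local_cache_only` is a hypothetical helper for illustration, not vLLM's or Transformers' actual implementation; `TRANSFORMERS_OFFLINE` is the real variable this PR sets):

```python
import os

# Hypothetical helper: how an offline switch like TRANSFORMERS_OFFLINE
# typically gates network access. Loaders would call this before any
# download attempt and fall back to the local cache when it returns True.
def use_local_cache_only() -> bool:
    """True when the environment requests fully offline model loading."""
    return os.environ.get("TRANSFORMERS_OFFLINE", "0").lower() in ("1", "true", "yes", "on")

# With the variable set (as this PR does in the CI container env),
# any network fetch path would be skipped.
os.environ["TRANSFORMERS_OFFLINE"] = "1"
assert use_local_cache_only()
```

Because no request is ever issued, the flaky HTTP 400 responses from ModelScope simply cannot occur, at the cost of requiring the cache to be pre-populated.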

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c
---------
Signed-off-by: wangli <wangli858794774@gmail.com>
Li Wang
2025-12-19 21:21:42 +08:00
committed by GitHub
parent 14931d2a86
commit 243ab7d720
3 changed files with 5 additions and 2 deletions

@@ -57,6 +57,9 @@ jobs:
     timeout-minutes: 600
     container:
       image: ${{ inputs.image }}
+      env:
+        TRANSFORMERS_OFFLINE: 1
+        VLLM_USE_MODELSCOPE: True
     steps:
       - name: Check npu and CANN info
         run: |
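Setting `TRANSFORMERS_OFFLINE: 1` in the container env means the job can no longer download models, so the runner's local cache must already hold them. A sketch of the kind of pre-flight check a CI step could perform (the cache layout and `cache_has_model` helper are assumptions for illustration, not ModelScope's actual API; the model name is a placeholder):

```python
import tempfile
from pathlib import Path

# Assumed layout: one directory per model id under a cache root.
def cache_has_model(model_id: str, cache_root: str) -> bool:
    """True if the cache directory for `model_id` exists and is non-empty."""
    model_dir = Path(cache_root).expanduser() / model_id
    return model_dir.is_dir() and any(model_dir.iterdir())

# Usage against a throwaway cache layout:
with tempfile.TemporaryDirectory() as root:
    populated = Path(root) / "some-org/some-model"   # placeholder model id
    populated.mkdir(parents=True)
    (populated / "config.json").write_text("{}")
    assert cache_has_model("some-org/some-model", root)
    assert not cache_has_model("some-org/missing-model", root)
```

Failing fast on a missing cache entry gives a clear error instead of an opaque offline-mode download failure mid-test.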

@@ -26,7 +26,7 @@ on:
       - 'cmake/**'
       - 'CMakeLists.txt'
       - 'csrc/**'
-    types: [ labeled, synchronize ]
+    types: [ labeled ]
   push:
     # Publish image when tagging, the Dockerfile in tag will be build as tag image
     branches:

@@ -162,7 +162,7 @@ class RemoteOpenAIServer:
         self.proxy_port = proxy_port
         self._start_server(model, vllm_serve_args, env_dict)
-        max_wait_seconds = max_wait_seconds or 1800
+        max_wait_seconds = max_wait_seconds or 2800
        if self.disaggregated_prefill:
             assert proxy_port is not None, "for disaggregated_prefill, proxy port must be provided"
             self._wait_for_server_pd(timeout=max_wait_seconds)
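The hunk above raises the default server-startup wait from 1800 to 2800 seconds. The underlying pattern is a bounded polling loop against a deadline; a minimal sketch (the real `RemoteOpenAIServer` polls an HTTP health endpoint rather than an arbitrary callable):

```python
import time

# Generic bounded wait: poll `predicate` until it returns True or the
# deadline passes. Returns False on timeout instead of raising, so the
# caller decides how to fail.
def wait_for(predicate, timeout: float, poll_interval: float = 0.01) -> bool:
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(poll_interval)
    return False

# Usage: a predicate that becomes true after a few polls.
state = {"polls": 0}
def server_ready() -> bool:
    state["polls"] += 1
    return state["polls"] >= 3

assert wait_for(server_ready, timeout=1.0)
```

Using `time.monotonic()` rather than `time.time()` keeps the deadline immune to wall-clock adjustments, which matters for waits as long as the 2800-second budget here.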