[CI] Fix server start failure when weight loading takes too long (#7098)

### What this PR does / why we need it?
When loading large models (e.g., 163 shards), weight loading can exceed
the default 600s timeout. Engine startup then times out with the error:
```shell
TimeoutError: Timed out waiting for engines to send initial message on input socket.
```
We should increase `VLLM_ENGINE_READY_TIMEOUT_S` to avoid this; the CI workflow now sets it to 1800.
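
For reproducing the fix outside CI, a minimal sketch of the same setting in a local shell might look like the following. It assumes the server is launched from the same shell so that it inherits the variable; the `vllm serve <model>` line is a placeholder and not part of this PR.

```shell
# Raise the engine-ready timeout from the 600s default described above
# to 1800s, the same value the workflow sets.
export VLLM_ENGINE_READY_TIMEOUT_S=1800

# Confirm the value that child processes will inherit.
echo "VLLM_ENGINE_READY_TIMEOUT_S=${VLLM_ENGINE_READY_TIMEOUT_S}"

# Then start the server from this shell, for example:
#   vllm serve <model>   # <model> is a placeholder
```

The right value depends on model size and storage speed; 1800 is simply what this change uses for the nightly jobs.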
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.16.0
- vLLM main: 4034c3d32e

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
This commit is contained in:
Li Wang
2026-03-13 08:52:56 +08:00
committed by GitHub
parent 7fe0469e27
commit 1f71da80eb

@@ -76,6 +76,7 @@ jobs:
UV_INDEX_STRATEGY: unsafe-best-match
UV_NO_CACHE: 1
UV_SYSTEM_PYTHON: 1
VLLM_ENGINE_READY_TIMEOUT_S: 1800
steps:
- name: Check npu and CANN info
run: |
@@ -204,6 +205,7 @@ jobs:
VLLM_CI_RUNNER: ${{ inputs.runner }}
working-directory: /vllm-workspace/vllm-ascend
run: |
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
echo "Running pytest with tests path: ${{ inputs.tests }}"
pytest -sv "${{ inputs.tests }}" \
--ignore=tests/e2e/nightly/single_node/ops/singlecard_ops/test_fused_moe.py
@@ -217,6 +219,7 @@ jobs:
CONFIG_YAML_PATH: ${{ inputs.config_file_path }}
working-directory: /vllm-workspace/vllm-ascend
run: |
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
echo "export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib" >> ~/.bashrc
echo "Running YAML-driven test with config: ${{ inputs.config_file_path }}"
pytest -sv tests/e2e/nightly/single_node/models/scripts/test_single_node.py