Upgrade vLLM to v0.10.0 (#1927)

### What this PR does / why we need it?
- Upgrade to v0.10.0
- Drop compatibility with v0.9.2
- Add the patch `vllm_ascend/patch/worker/patch_common/patch_sampler_gather_logprobs.py` as a workaround for f3a683b7c9 on v0.10.0, and add the e2e test `test_models_prompt_logprobs`
- Pin transformers<4.54.0 as a workaround for https://github.com/vllm-project/vllm-ascend/issues/2034
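
To illustrate what the `transformers<4.54.0` pin admits, here is a minimal sketch of an upper-bound check. It is not part of this PR: `PIN_UPPER_BOUND`, `parse_release`, and `satisfies_pin` are hypothetical names, and the parsing is deliberately simplified (it only reads the leading numeric release segment and ignores epochs and local versions).

```python
import re

# Assumption: upper bound copied from the pin in this PR (transformers<4.54.0).
PIN_UPPER_BOUND = (4, 54, 0)


def parse_release(version: str) -> tuple[int, ...]:
    """Return the leading numeric release segment, e.g. '4.53.2.dev0' -> (4, 53, 2)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def satisfies_pin(version: str) -> bool:
    """Return True if the release segment is strictly below the pinned upper bound."""
    release = parse_release(version)
    # Pad short versions so that '4.54' is compared as (4, 54, 0).
    padded = release + (0,) * (len(PIN_UPPER_BOUND) - len(release))
    return padded < PIN_UPPER_BOUND
```

Under these assumptions, `satisfies_pin("4.53.3")` is true while `satisfies_pin("4.54.0")` and `satisfies_pin("4.54.0rc1")` are false. A real environment check would more likely rely on the installer's own resolver than on a hand-written comparison.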

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Test locally: `VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/singlecard/test_offline_inference.py::test_models_prompt_logprobs`
- CI passed

- vLLM version: v0.9.2
- vLLM main: 7728dd77bb

---------

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Yikun Jiang
2025-07-26 15:43:29 +08:00
committed by GitHub
parent 2f50304c19
commit 17a430f7b8
29 changed files with 198 additions and 251 deletions


@@ -37,7 +37,7 @@ on:
# Current supported vLLM versions
options:
- main
- - v0.9.2
+ - v0.10.0
- v0.9.1
- v0.7.3
vllm-ascend-version:
@@ -163,7 +163,7 @@ jobs:
repository: vllm-project/vllm
path: ./vllm-empty
# Please also update this when bump matched version
- ref: ${{ github.event.inputs.vllm-version || 'v0.9.2' }}
+ ref: ${{ github.event.inputs.vllm-version || 'v0.10.0' }}
- name: Install vllm-project/vllm from source
working-directory: ./vllm-empty


@@ -51,7 +51,7 @@ jobs:
strategy:
matrix:
include:
- - vllm_branch: v0.9.2
+ - vllm_branch: v0.10.0
vllm_ascend_branch: main
vllm_use_v1: 1
max-parallel: 1


@@ -81,7 +81,7 @@ jobs:
VLLM_USE_MODELSCOPE: True
strategy:
matrix:
- vllm_version: [main, v0.9.2]
+ vllm_version: [main, v0.10.0]
steps:
- name: Install packages
run: |
@@ -137,7 +137,7 @@ jobs:
max-parallel: 2
matrix:
os: [linux-arm64-npu-1]
- vllm_version: [main, v0.9.2]
+ vllm_version: [main, v0.10.0]
name: singlecard e2e test
runs-on: ${{ matrix.os }}
container:
@@ -216,7 +216,7 @@ jobs:
max-parallel: 1
matrix:
os: [linux-arm64-npu-4]
- vllm_version: [main, v0.9.2]
+ vllm_version: [main, v0.10.0]
name: multicard e2e test
runs-on: ${{ matrix.os }}
container:


@@ -43,7 +43,7 @@ jobs:
max-parallel: 2
matrix:
os: [linux-arm64-npu-1, linux-arm64-npu-4]
- vllm_version: [main, v0.9.2]
+ vllm_version: [main, v0.10.0]
name: vLLM Ascend long term test
runs-on: ${{ matrix.os }}
container: