update vllm pin to 12.27 (#5412)

### What this PR does / why we need it?
update vllm pin to 12.27
1、Fix Qwen2-MoE shared_expert_gate
:https://github.com/vllm-project/vllm/pull/31339
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
vLLM version: release/v0.13.0
vLLM main:
5326c89803
Co-authored-by: leo-pony [nengjunma@outlook.com](nengjunma@outlook.com)

---------

Signed-off-by: ZT-AIA <1028681969@qq.com>
Signed-off-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
This commit is contained in:
ZT-AIA
2025-12-28 00:19:36 +08:00
committed by GitHub
parent 1b5d5abf86
commit 24328aaf00
5 changed files with 7 additions and 7 deletions

View File

@@ -34,7 +34,7 @@ jobs:
steps: steps:
- name: Get vLLM version - name: Get vLLM version
run: | run: |
VLLM_COMMIT=81786c87748b0177111dfdc07af5351d8389baa1 VLLM_COMMIT=5326c89803566a131c928f7fdd2100b75c981a42
echo "VLLM_COMMIT=https://github.com/vllm-project/vllm/commit/$VLLM_COMMIT" >> $GITHUB_ENV echo "VLLM_COMMIT=https://github.com/vllm-project/vllm/commit/$VLLM_COMMIT" >> $GITHUB_ENV
- name: Checkout repository - name: Checkout repository

View File

@@ -74,7 +74,7 @@ jobs:
name: e2e-full name: e2e-full
strategy: strategy:
matrix: matrix:
vllm_version: [81786c87748b0177111dfdc07af5351d8389baa1, v0.13.0] vllm_version: [5326c89803566a131c928f7fdd2100b75c981a42, v0.13.0]
needs: [changes] needs: [changes]
if: ${{ needs.changes.outputs.e2e_tracker == 'true' }} if: ${{ needs.changes.outputs.e2e_tracker == 'true' }}
uses: ./.github/workflows/_e2e_test.yaml uses: ./.github/workflows/_e2e_test.yaml

View File

@@ -42,7 +42,7 @@ jobs:
lint: lint:
uses: ./.github/workflows/_pre_commit.yml uses: ./.github/workflows/_pre_commit.yml
with: with:
vllm: 81786c87748b0177111dfdc07af5351d8389baa1 vllm: 5326c89803566a131c928f7fdd2100b75c981a42
changes: changes:
runs-on: linux-aarch64-a2-0 runs-on: linux-aarch64-a2-0
outputs: outputs:
@@ -90,7 +90,7 @@ jobs:
SOC_VERSION: ascend910b1 SOC_VERSION: ascend910b1
strategy: strategy:
matrix: matrix:
vllm_version: [81786c87748b0177111dfdc07af5351d8389baa1, v0.13.0] vllm_version: [5326c89803566a131c928f7fdd2100b75c981a42, v0.13.0]
steps: steps:
- name: Free up disk space - name: Free up disk space
@@ -160,7 +160,7 @@ jobs:
name: e2e-light name: e2e-light
strategy: strategy:
matrix: matrix:
vllm_version: [81786c87748b0177111dfdc07af5351d8389baa1, v0.13.0] vllm_version: [5326c89803566a131c928f7fdd2100b75c981a42, v0.13.0]
# Note (yikun): If CI resource are limited we can split job into two chain jobs # Note (yikun): If CI resource are limited we can split job into two chain jobs
needs: [lint, changes] needs: [lint, changes]
# only trigger e2e test after lint passed and the change is e2e related with pull request. # only trigger e2e test after lint passed and the change is e2e related with pull request.

View File

@@ -51,7 +51,7 @@ If you're using v0.7.3, don't forget to install [mindie-turbo](https://pypi.org/
For main branch of vLLM Ascend, we usually make it compatible with the latest vLLM release and a newer commit hash of vLLM. Please note that this table is usually updated. Please check it regularly. For main branch of vLLM Ascend, we usually make it compatible with the latest vLLM release and a newer commit hash of vLLM. Please note that this table is usually updated. Please check it regularly.
| vLLM Ascend | vLLM | Python | Stable CANN | PyTorch/torch_npu | | vLLM Ascend | vLLM | Python | Stable CANN | PyTorch/torch_npu |
|-------------|--------------|------------------|-------------|--------------------| |-------------|--------------|------------------|-------------|--------------------|
| main | 81786c87748b0177111dfdc07af5351d8389baa1, v0.13.0 tag | >= 3.10, < 3.12 | 8.3.RC2 | 2.8.0 / 2.8.0 | | main | 5326c89803566a131c928f7fdd2100b75c981a42, v0.13.0 tag | >= 3.10, < 3.12 | 8.3.RC2 | 2.8.0 / 2.8.0 |
## Release cadence ## Release cadence

View File

@@ -70,7 +70,7 @@ def test_qwen3_next_distributed_mp_eager_mtp_similarity_tp4():
"The future of AI is", "The future of AI is",
] ]
max_tokens = 20 max_tokens = 15
with VllmRunner( with VllmRunner(
"Qwen/Qwen3-Next-80B-A3B-Instruct", "Qwen/Qwen3-Next-80B-A3B-Instruct",