[TEST] Add Qwen3-32b-w8a8 acc/perf A2/A3 test (#3541)

### What this PR does / why we need it?
This PR Qwen3-32b-w8a8 acc/perf 8 cases on A2 and A3, we need test them
daily.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
by running the test


- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Signed-off-by: root <root@hostname-2pbfv.foreman.pxe>
Co-authored-by: wangli <wangli858794774@gmail.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
This commit is contained in:
jiangyunfan1
2025-10-21 17:34:48 +08:00
committed by GitHub
parent ec1d2b5c04
commit 80b8df881f
6 changed files with 307 additions and 3 deletions

View File

@@ -109,6 +109,7 @@ jobs:
env:
VLLM_WORKER_MULTIPROC_METHOD: spawn
VLLM_USE_MODELSCOPE: True
VLLM_CI_RUNNER: ${{ inputs.runner }}
run: |
# TODO: enable more tests
pytest -sv ${{ inputs.tests }}

View File

@@ -41,7 +41,7 @@ defaults:
# and ignore the lint / 1 card / 4 cards test type
concurrency:
group: ascend-nightly-${{ github.ref }}
cancel-in-progress: true
#cancel-in-progress: true
jobs:
qwen3-32b:
@@ -56,3 +56,22 @@ jobs:
vllm: v0.11.0
runner: ${{ matrix.os }}
tests: tests/e2e/nightly/models/test_qwen3_32b.py
qwen3-32b-in8-a3:
strategy:
matrix:
os: [linux-aarch64-a3-4]
uses: ./.github/workflows/_e2e_nightly.yaml
with:
vllm: v0.11.0
runner: ${{ matrix.os }}
image: swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann:8.2.rc1-a3-ubuntu22.04-py3.11
tests: tests/e2e/nightly/models/test_qwen3_32b_int8.py
qwen3-32b-in8-a2:
strategy:
matrix:
os: [linux-aarch64-a2-4]
uses: ./.github/workflows/_e2e_nightly.yaml
with:
vllm: v0.11.0
runner: ${{ matrix.os }}
tests: tests/e2e/nightly/models/test_qwen3_32b_int8.py