[v0.18.0][CI] Fix and simplify the CI for Qwen3 32B (#8093)

### What this PR does / why we need it?

This PR fixes and simplifies the CI configuration for Qwen3 32B. The main changes are:

- Remove the redundant `Qwen3-32B-Int8-A3-Feature-Stack3.yaml` config and consolidate the CI setup into `Qwen3-32B-Int8.yaml`.
- Improve runtime stability by adding `PYTORCH_NPU_ALLOC_CONF=expandable_segments:True` and setting `--max-num-seqs 80`.
- Update the accuracy benchmark from `aime2024` to `gsm8k-lite`, and adjust the related dataset config, output length, baseline, and threshold accordingly.

These changes make the Qwen3 32B CI easier to maintain and more stable in nightly validation.

---------

Signed-off-by: ZYang6263 <zy626375@gmail.com>
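
For reviewers who want to reproduce the two stability settings outside the nightly harness, the following is a minimal sketch, not the CI's actual launcher. It only assembles the environment and the `vllm serve` argument list named in the description; the helper name `build_launch` and the model path are hypothetical placeholders, and the real values live in `Qwen3-32B-Int8.yaml`.

```python
import os
import shlex


def build_launch(model, max_num_seqs=80, base_env=None):
    """Return (env, cmd) for a server launch with the settings from this PR.

    `model` is whatever model path or ID the test uses (placeholder here).
    """
    # Copy the caller's environment so the current process is not mutated.
    env = dict(os.environ if base_env is None else base_env)
    # Allocator setting added by this PR.
    env["PYTORCH_NPU_ALLOC_CONF"] = "expandable_segments:True"
    # Concurrency cap set by this PR.
    cmd = ["vllm", "serve", model, "--max-num-seqs", str(max_num_seqs)]
    return env, cmd


if __name__ == "__main__":
    # Hypothetical model path, for illustration only.
    env, cmd = build_launch("path/to/Qwen3-32B-Int8")
    print("PYTORCH_NPU_ALLOC_CONF=" + env["PYTORCH_NPU_ALLOC_CONF"], shlex.join(cmd))
```

The pair can be passed to `subprocess.Popen(cmd, env=env)` on a machine where vLLM is installed; any other flags the CI config sets are intentionally left out here.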
2026-04-10 14:22:24 +08:00
parent 531d0e6fff
commit 34386c8896
3 changed files with 17 additions and 88 deletions
--- a/.github/workflows/schedule_nightly_test_a3.yaml
+++ b/.github/workflows/schedule_nightly_test_a3.yaml
@@ -214,9 +214,6 @@ jobs:
          - name: qwen2-5-vl-32b
            os: linux-aarch64-a3-4
            config_file_path: Qwen2.5-VL-32B-Instruct.yaml
-          - name: qwen3-32b-int8-a3-feature-stack3
-            os: linux-aarch64-a3-4
-            config_file_path: Qwen3-32B-Int8-A3-Feature-Stack3.yaml
          - name: qwen3-32b-int8-prefix-cache
            os: linux-aarch64-a3-4
            config_file_path: Prefix-Cache-Qwen3-32B-Int8.yaml