xc-llm-ascend/tests/e2e/models/configs/Qwen3-30B-A3B-W8A8.yaml

model_name: "vllm-ascend/Qwen3-30B-A3B-W8A8"
hardware: "Atlas A2 Series"
tasks:
- name: "gsm8k"
  metrics:
  - name: "exact_match,strict-match"
    value: 0.9
  - name: "exact_match,flexible-extract"
    value: 0.8
num_fewshot: 5
gpu_memory_utilization: 0.7
enable_expert_parallel: True
tensor_parallel_size: 2
apply_chat_template: False
fewshot_as_multiturn: False
quantization: ascend

[Test] Add accuracy test for qwen3-30b-a3b-w8a8 (#3807)

### What this PR does / why we need it?
Add accuracy test for qwen3-30b-a3b-w8a8.
This PR depends on https://github.com/vllm-project/vllm-ascend/pull/3799

### How was this patch tested?
qwen3-30b-a3b-w8a8 accuracy test ok: https://github.com/vllm-project/vllm-ascend/actions/runs/19062045267/job/54443732877?pr=3807

- vLLM version: v0.11.0
- vLLM main: https://github.com/vllm-project/vllm/commit/83f478bb19489b41e9d208b47b4bb5a95ac171ac

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-11-04 18:56:31 +08:00
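
The config only records expected metric values; how a test harness compares a measured score against them is not shown here. A minimal sketch of one possible check follows, assuming a two-sided relative tolerance. The names `EXPECTED`, `check_accuracy`, and `rtol`, and the 3% default, are hypothetical and are not taken from the vllm-ascend test code; only the task name, metric names, and expected values come from the config above.

```python
# Expected values copied from the config above.
# Everything else in this sketch is an illustrative assumption.
EXPECTED = {
    "gsm8k": {
        "exact_match,strict-match": 0.9,
        "exact_match,flexible-extract": 0.8,
    }
}


def check_accuracy(measured, expected=EXPECTED, rtol=0.03):
    """Return (task, metric, expected, measured) for each metric outside rtol.

    `measured` maps task name -> {metric name -> score}. A missing score is
    reported as a failure with `None` in the measured slot. The tolerance
    rule (relative, two-sided) is an assumption, not the project's rule.
    """
    failures = []
    for task, metrics in expected.items():
        for metric, want in metrics.items():
            got = measured.get(task, {}).get(metric)
            if got is None or abs(got - want) > rtol * want:
                failures.append((task, metric, want, got))
    return failures


if __name__ == "__main__":
    # Made-up scores, used only to exercise the function.
    sample = {
        "gsm8k": {
            "exact_match,strict-match": 0.89,
            "exact_match,flexible-extract": 0.81,
        }
    }
    print(check_accuracy(sample))
```

In practice the `EXPECTED` mapping would be built from the YAML file itself (for example with PyYAML's `yaml.safe_load`) rather than hard-coded; it is inlined here so the sketch depends only on the standard library.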