xc-llm-ascend/tests/e2e/models/configs/Qwen3-VL-30B-A3B-Instruct.yaml

model_name: "Qwen/Qwen3-VL-30B-A3B-Instruct"
hardware: "Atlas A2 Series"
model: "vllm-vlm"
tasks:
- name: "mmmu_val"
  metrics:
  - name: "acc,none"
    value: 0.58
max_model_len: 8192
tensor_parallel_size: 2
gpu_memory_utilization: 0.7
enable_expert_parallel: True
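
A minimal sketch of how a config like this might be consumed by an accuracy check, assuming the engine settings are passed as a comma-separated `key=value` string and each task metric is compared against its baseline `value` within a relative tolerance. The helper names `build_model_args` and `check_accuracy`, the `RTOL` value, and the shape of the `measured` results are hypothetical and not taken from the repository. The config is inlined as a dict so the example needs no YAML parser.

```python
import math

# Values copied from the config above; a real harness would parse the YAML file.
config = {
    "model_name": "Qwen/Qwen3-VL-30B-A3B-Instruct",
    "model": "vllm-vlm",
    "tasks": [
        {"name": "mmmu_val", "metrics": [{"name": "acc,none", "value": 0.58}]},
    ],
    "max_model_len": 8192,
    "tensor_parallel_size": 2,
    "gpu_memory_utilization": 0.7,
    "enable_expert_parallel": True,
}

RTOL = 0.03  # hypothetical relative tolerance, not taken from the source

# Config keys treated here as engine settings (an assumption for this sketch).
ENGINE_KEYS = (
    "max_model_len",
    "tensor_parallel_size",
    "gpu_memory_utilization",
    "enable_expert_parallel",
)


def build_model_args(cfg):
    """Join the model name and engine settings into a key=value string."""
    pairs = [("pretrained", cfg["model_name"])]
    pairs += [(key, cfg[key]) for key in ENGINE_KEYS if key in cfg]
    return ",".join(f"{key}={value}" for key, value in pairs)


def check_accuracy(cfg, measured):
    """Return (task, metric, expected, got) for every metric outside RTOL."""
    failures = []
    for task in cfg["tasks"]:
        for metric in task["metrics"]:
            got = measured[task["name"]][metric["name"]]
            if not math.isclose(got, metric["value"], rel_tol=RTOL):
                failures.append((task["name"], metric["name"], metric["value"], got))
    return failures
```

With these assumptions, a measured `mmmu_val` score of 0.57 passes against the 0.58 baseline, while 0.50 is reported as a failure.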

Add models test and add several new models yaml (#3394)

### What this PR does / why we need it?
This PR adds accuracy CI for several new models:
- `ascend test / accuracy` is the PR-triggered accuracy check for popular models
- `ascend test / models` is for the accuracy report, full models test, and nightly model test
- Add Qwen2-Audio-7B-Instruct, Qwen2-VL-7B-Instruct, Qwen3-8B, Qwen3-VL-30B-A3B-Instruct

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed

Closes: https://github.com/vllm-project/vllm-ascend/pull/2330
Closes: https://github.com/vllm-project/vllm-ascend/pull/3362

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>

2025-10-12 17:27:50 +08:00