xc-llm-ascend/tests/e2e/models/configs/Qwen3-VL-8B-Instruct-W8A8.yaml

model_name: "vllm-ascend/Qwen3-VL-8B-Instruct-W8A8"
hardware: "Atlas A2 Series"
model: "vllm-vlm"
tasks:
- name: "mmmu_val"
  metrics:
  - name: "acc,none"
    value: 0.52
max_model_len: 8192
batch_size: 32
gpu_memory_utilization: 0.8
quantization: ascend
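
A config like this pairs a task with an expected metric value, so a test run presumably compares the measured score against `value`. The following is a minimal sketch of such a comparison, not the repository's actual harness: the `check_metrics` helper, the relative tolerance `RTOL`, and the measured scores are all hypothetical, and the config is inlined as a dict to keep the example free of third-party dependencies. The `acc,none` key appears to follow the `metric,filter` naming used by lm-evaluation-harness, which is also an assumption here.

```python
# Hypothetical relative tolerance; the real harness may use a different rule.
RTOL = 0.03

# Subset of the YAML above, inlined as a plain dict for illustration.
CONFIG = {
    "model_name": "vllm-ascend/Qwen3-VL-8B-Instruct-W8A8",
    "tasks": [
        {"name": "mmmu_val", "metrics": [{"name": "acc,none", "value": 0.52}]},
    ],
}


def check_metrics(config, measured, rtol=RTOL):
    """Compare measured scores with expected ones.

    Returns a list of (task, metric, expected, got, ok) tuples, where ok is
    True when the measured score is within rtol (relative) of the expected one.
    """
    report = []
    for task in config["tasks"]:
        for metric in task["metrics"]:
            expected = metric["value"]
            got = measured[task["name"]][metric["name"]]
            ok = abs(got - expected) <= rtol * abs(expected)
            report.append((task["name"], metric["name"], expected, got, ok))
    return report


# Made-up measured score, used only to exercise the helper.
report = check_metrics(CONFIG, {"mmmu_val": {"acc,none": 0.53}})
for task, metric, expected, got, ok in report:
    print(f"{task} {metric}: expected={expected} got={got} ok={ok}")
```

A relative tolerance is used here so the same threshold scales across metrics with different magnitudes; an absolute tolerance would be an equally plausible choice.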

[Bugfix] fix bug about model type of qwen3_vl_8b_instruct_w8a8 (#7383)

### What this PR does / why we need it?
Adapt to the model type of Qwen3-VL-8B-Instruct-W8A8

- vLLM version: v0.17.0
- vLLM main: https://github.com/vllm-project/vllm/commit/4034c3d32e30d01639459edd3ab486f56993876d

---------

Signed-off-by: betta18 <jiangmengyu1@huawei.com>
Co-authored-by: betta18 <jiangmengyu1@huawei.com>

2026-03-18 20:30:03 +08:00