The accuracy results on NPU Atlas A2 have changed; updating the
reports for all models (Qwen3-30B-A3B, Qwen2.5-VL-7B-Instruct,
Qwen3-8B-Base, DeepSeek-V2-Lite)
- [Workflow run][1]
[1]: https://github.com/vllm-project/vllm-ascend/actions/runs/17459225764
- vLLM version: v0.10.1.1
- vLLM main: 2b30afa442
Signed-off-by: vllm-ascend-ci <vllm-ascend-ci@users.noreply.github.com>
Co-authored-by: vllm-ascend-ci <vllm-ascend-ci@users.noreply.github.com>
Qwen/Qwen2.5-VL-7B-Instruct
- vLLM Version: vLLM: 0.10.1.1 (1da94e6), vLLM Ascend Version: v0.10.1rc1 (7e16b4a)
- Software Environment: CANN: 8.2.RC1, PyTorch: 2.7.1, torch-npu: 2.7.1.dev20250724
- Hardware Environment: Atlas A2 Series
- Parallel mode: TP1
- Execution mode: ACLGraph
Command:
```bash
export MODEL_ARGS='pretrained=Qwen/Qwen2.5-VL-7B-Instruct,tensor_parallel_size=1,dtype=auto,trust_remote_code=False,max_model_len=8192'
lm_eval --model vllm-vlm --model_args $MODEL_ARGS --tasks mmmu_val \
  --apply_chat_template True --fewshot_as_multiturn True --batch_size auto
```
| Task | Metric | Value | Stderr |
|---|---|---|---|
| mmmu_val | acc,none | ✅0.52 | ± 0.0162 |
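A score like the one above is typically gated against a reference value within its reported standard error. The helper below is a hypothetical sketch of such a tolerance check (not part of the vllm-ascend CI tooling), using the `mmmu_val` numbers from the table:

```python
def within_tolerance(measured: float, reference: float, stderr: float, k: float = 2.0) -> bool:
    """Return True if the measured accuracy lies within k standard
    errors of the reference accuracy (a simple regression gate)."""
    return abs(measured - reference) <= k * stderr


# Values from the mmmu_val row above: acc 0.52, stderr 0.0162.
print(within_tolerance(0.52, 0.52, 0.0162))  # matches the reference
print(within_tolerance(0.40, 0.52, 0.0162))  # clearly outside 2 * stderr
```

The multiplier `k` is an assumed convention here; a stricter or looser band can be chosen per model.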