vllm-ascend-ci 3a2a7d88db [Doc] Update accuracy reports for v0.10.1rc1 (#2755)
The accuracy results on Atlas A2 NPUs have changed. This commit updates
the reports for all models (Qwen3-30B-A3B, Qwen2.5-VL-7B-Instruct,
Qwen3-8B-Base, DeepSeek-V2-Lite).

- [Workflow run][1]

[1]: https://github.com/vllm-project/vllm-ascend/actions/runs/17459225764

- vLLM version: v0.10.1.1
- vLLM main: 2b30afa442

Signed-off-by: vllm-ascend-ci <vllm-ascend-ci@users.noreply.github.com>
Co-authored-by: vllm-ascend-ci <vllm-ascend-ci@users.noreply.github.com>
2025-09-04 22:17:17 +08:00

Qwen/Qwen3-30B-A3B

  • vLLM Version: vLLM: 0.10.1.1 (1da94e6), vLLM Ascend Version: v0.10.1rc1 (7e16b4a)
  • Software Environment: CANN: 8.2.RC1, PyTorch: 2.7.1, torch-npu: 2.7.1.dev20250724
  • Hardware Environment: Atlas A2 Series
  • Parallel mode: TP2 + EP
  • Execution mode: ACLGraph

Command:

export MODEL_ARGS='pretrained=Qwen/Qwen3-30B-A3B,tensor_parallel_size=2,dtype=auto,trust_remote_code=False,max_model_len=4096,gpu_memory_utilization=0.6,enable_expert_parallel=True'
lm_eval --model vllm --model_args "$MODEL_ARGS" --tasks gsm8k,ceval-valid \
  --num_fewshot 5 --batch_size auto
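
The long `MODEL_ARGS` string above is a comma-separated list of `key=value` pairs, which is easy to mistype. As a minimal sketch, the same value can be assembled one pair per line. The `join_args` helper is hypothetical and is not part of lm_eval or vLLM; the keys and values are copied from the command above.

```shell
# join_args (hypothetical helper): join all arguments with commas.
# The function body runs in a subshell, so the IFS change does not
# leak into the calling shell.
join_args() (
  IFS=,
  printf '%s\n' "$*"
)

# Same key=value pairs as in the report's command, one per line.
MODEL_ARGS=$(join_args \
  pretrained=Qwen/Qwen3-30B-A3B \
  tensor_parallel_size=2 \
  dtype=auto \
  trust_remote_code=False \
  max_model_len=4096 \
  gpu_memory_utilization=0.6 \
  enable_expert_parallel=True)
export MODEL_ARGS
```

The resulting `MODEL_ARGS` is identical to the single-line export, so the `lm_eval` invocation itself is unchanged.
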

| Task | Metric | Value | Stderr |
|------|--------|-------|--------|
| gsm8k | exact_match,strict-match | 0.8923 | ± 0.0085 |
| gsm8k | exact_match,flexible-extract | 0.8506 | ± 0.0098 |
| ceval-valid | acc,none | 0.8358 | ± 0.0099 |
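
When comparing these scores against an earlier report, it can help to turn each value and standard error into a rough interval. The sketch below assumes a normal approximation (value ± 1.96 × stderr) for a 95% interval; the `ci95` function is a hypothetical helper, not part of lm_eval, and the inputs are the figures reported above.

```shell
# ci95 VALUE STDERR (hypothetical helper): print the lower and upper
# bounds of an approximate 95% confidence interval, assuming a normal
# approximation: value +/- 1.96 * stderr.
ci95() {
  LC_ALL=C awk -v v="$1" -v se="$2" \
    'BEGIN { printf "%.4f %.4f\n", v - 1.96 * se, v + 1.96 * se }'
}

ci95 0.8923 0.0085   # gsm8k strict-match:     0.8756 0.9090
ci95 0.8506 0.0098   # gsm8k flexible-extract: 0.8314 0.8698
ci95 0.8358 0.0099   # ceval-valid acc:        0.8164 0.8552
```

A change between two reports that stays well inside these intervals is more likely run-to-run noise than a real regression, though that reading depends on the normal approximation holding.
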