xc-llm-ascend/tests/e2e/models/report_template.md

# {{ model_name }}

**vLLM Version**: vLLM: {{ vllm_version }} ([{{ vllm_commit[:7] }}](https://github.com/vllm-project/vllm/commit/{{ vllm_commit }}))  
**vLLM Ascend Version**: {{ vllm_ascend_version }} ([{{ vllm_ascend_commit[:7] }}](https://github.com/vllm-project/vllm-ascend/commit/{{ vllm_ascend_commit }}))  
**Software Environment**: CANN: {{ cann_version }}, PyTorch: {{ torch_version }}, torch-npu: {{ torch_npu_version }}  
**Hardware Environment**: Atlas A2 Series  
**Datasets**: {{ datasets }}  
**Parallel Mode**: TP  
**Execution Mode**: ACLGraph  

**Command**:  

```bash
export MODEL_ARGS={{ model_args }}
lm_eval --model {{ model_type }} --model_args $MODEL_ARGS --tasks {{ datasets }} \
--apply_chat_template {{ apply_chat_template }} --fewshot_as_multiturn {{ fewshot_as_multiturn }} {% if num_fewshot is defined and num_fewshot != "N/A" %} --num_fewshot {{ num_fewshot }} {% endif %} \
--limit {{ limit }} --batch_size {{ batch_size }}
```
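
The conditional in the command above can be mirrored in plain Python. This is a minimal sketch, not the project's actual code: the helper name `build_lm_eval_command` and the example values are hypothetical, and treating `None` as "not defined" is an assumption made for illustration. Only the flag names are taken from the command block.

```python
import shlex


def build_lm_eval_command(model_type, model_args, datasets, apply_chat_template,
                          fewshot_as_multiturn, limit, batch_size, num_fewshot=None):
    """Assemble the lm_eval argument list in the same order as the command block."""
    cmd = [
        "lm_eval",
        "--model", model_type,
        "--model_args", model_args,
        "--tasks", datasets,
        "--apply_chat_template", str(apply_chat_template),
        "--fewshot_as_multiturn", str(fewshot_as_multiturn),
    ]
    # Mirror the template's guard: omit the flag when the value is missing or "N/A".
    # Assumption: None stands in for an undefined template variable.
    if num_fewshot is not None and num_fewshot != "N/A":
        cmd += ["--num_fewshot", str(num_fewshot)]
    cmd += ["--limit", str(limit), "--batch_size", str(batch_size)]
    return cmd


# Hypothetical values, for illustration only.
example = build_lm_eval_command(
    "vllm", "pretrained=example/model", "gsm8k", True, True, 100, "auto", num_fewshot="N/A"
)
print(shlex.join(example))  # the printed command contains no --num_fewshot flag
```

Building the command as a list and joining it with `shlex.join` keeps quoting correct when an argument contains spaces or commas.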

| Task                  | Metric      | Value     | Stderr |
|-----------------------|-------------|----------:|-------:|
{% for row in rows -%}
| {{ row.task.rjust(23) }} | {{ row.metric.rjust(15) }} | {{ row.value }} | ± {{ "%.4f" | format(row.stderr | float) }} |
{% endfor %}
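
As a rough plain-Python approximation of what the row loop above emits, the sketch below formats a single result row. The helper name `format_row` and the sample values are hypothetical placeholders, not real results, and the row is assumed to be a dict with `task`, `metric`, `value`, and `stderr` keys. Whitespace inside a cell does not change how Markdown renders the table.

```python
def format_row(row):
    """Format one results row: right-justified task and metric, stderr to 4 decimals."""
    task = row["task"].rjust(23)
    metric = row["metric"].rjust(15)
    # Equivalent of applying the "%.4f" format to the stderr converted to float.
    # Note: Python's float() raises on bad input, unlike a lenient template filter.
    stderr = "%.4f" % float(row["stderr"])
    return f"| {task} | {metric} | {row['value']} | ± {stderr} |"


# Placeholder values for illustration only.
sample = {"task": "example_task", "metric": "acc", "value": 0.5, "stderr": "0.01234"}
print(format_row(sample))
```

Because `stderr` is passed through `float` before formatting, it can arrive as either a number or a numeric string.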
Enable pytest and yaml style accuracy test (#2073)

### What this PR does / why we need it?

This PR enables pytest- and YAML-style accuracy tests. Users can now run an accuracy test with:

```bash
cd ~/vllm-ascend
pytest -sv ./tests/e2e/singlecard/models/test_lm_eval_correctness.py \
    --config ./tests/e2e/singlecard/models/configs/Qwen3-8B-Base.yaml \
    --report_output ./benchmarks/accuracy/Qwen3-8B-Base.md

pytest -sv ./tests/e2e/singlecard/models/test_lm_eval_correctness.py \
    --config-list-file ./tests/e2e/singlecard/models/configs/accuracy.txt
```

Closes: https://github.com/vllm-project/vllm-ascend/issues/1970

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

- vLLM version: v0.10.0
- vLLM main: https://github.com/vllm-project/vllm/commit/2836dd73f13015ee386c544760ca0d16888203f3

---------

Signed-off-by: Icey <1790571317@qq.com>
2025-07-31 21:39:13 +08:00
