[Doc] Update Qwen model accuracy report

hongweijie
2025-12-10 17:55:27 +08:00
parent ec935627cb
commit bd66cfa6c2
3 changed files with 55 additions and 0 deletions


@@ -0,0 +1,19 @@
# Qwen2.5-32B
* vLLM Version: vLLM: 0.10.1.1, vLLM-KunLun Version: v0.10.1.1
* Software Environment: OS: Ubuntu 22.04, PyTorch ≥ 2.5.1
* Hardware Environment: KunLun P800
* Parallel mode: TP4
```text
+-----------+--------------------------+------------------+------+--------+---------+
| Dataset   | Metric                   | Subset           | Num  | Score  | Cat.0   |
+-----------+--------------------------+------------------+------+--------+---------+
| gsm8k     | mean_acc                 | main             | 1319 | 0.9158 | default |
| humaneval | pass@1                   | openai_humaneval | 164  | 0.878  | default |
| ifeval    | mean_prompt_level_strict | default          | 541  | 0.8059 | default |
| ifeval    | mean_inst_level_strict   | default          | 541  | 0.8765 | default |
| ifeval    | mean_prompt_level_loose  | default          | 541  | 0.8262 | default |
| ifeval    | mean_inst_level_loose    | default          | 541  | 0.8916 | default |
+-----------+--------------------------+------------------+------+--------+---------+
```


@@ -0,0 +1,16 @@
# Qwen3-30B-A3B-coder
* vLLM Version: vLLM: 0.10.1.1, vLLM-KunLun Version: v0.10.1.1
* Software Environment: OS: Ubuntu 22.04, PyTorch ≥ 2.5.1
* Hardware Environment: KunLun P800
* Parallel mode: TP4
```text
+-----------------+-------------+--------------------+------+--------+---------+
| Dataset         | Metric      | Subset             | Num  | Score  | Cat.0   |
+-----------------+-------------+--------------------+------+--------+---------+
| gsm8k           | mean_acc    | main               | 1319 | 0.9272 | default |
| humaneval       | pass@1      | openai_humaneval   | 164  | 0.9146 | default |
| live_code_bench | pass@1      | release_latest     | 714  | 0.5644 | default |
+-----------------+-------------+--------------------+------+--------+---------+
```


@@ -0,0 +1,20 @@
# Qwen3-8B
* vLLM Version: vLLM: 0.10.1.1, vLLM-KunLun Version: v0.10.1.1
* Software Environment: OS: Ubuntu 22.04, PyTorch ≥ 2.5.1
* Hardware Environment: KunLun P800
* Parallel mode: TP1
```text
+-----------+--------------------------+--------------------+------+--------+---------+
| Dataset   | Metric                   | Subset             | Num  | Score  | Cat.0   |
+-----------+--------------------------+--------------------+------+--------+---------+
| gsm8k     | mean_acc                 | main               | 1319 | 0.9143 | default |
| humaneval | pass@1                   | openai_humaneval   | 164  | 0.8049 | default |
| ifeval    | mean_prompt_level_strict | default            | 541  | 0.8503 | default |
| ifeval    | mean_inst_level_strict   | default            | 541  | 0.8971 | default |
| ifeval    | mean_prompt_level_loose  | default            | 541  | 0.8762 | default |
| ifeval    | mean_inst_level_loose    | default            | 541  | 0.9165 | default |
| math_500  | mean_acc                 | Level 1            | 43   | 0.907  | default |
+-----------+--------------------------+--------------------+------+--------+---------+
```
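
All three reports use the same bordered ASCII table layout, so the scores can be loaded into records for side-by-side comparison. The following is a minimal sketch, not part of the commit: the helper name `parse_report_table` and the `SAMPLE` string (two rows copied from the Qwen2.5-32B table) are hypothetical and chosen only for illustration.

```python
def parse_report_table(text: str) -> list[dict[str, str]]:
    """Parse a '+---+' bordered ASCII table into a list of row dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip border lines and anything else that is not a table row.
        if not line.startswith("|"):
            continue
        # Drop the outer pipes, then split the remaining cells.
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        rows.append(cells)
    if not rows:
        return []
    # The first row is the header; the rest are data rows.
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical sample: two rows taken from the Qwen2.5-32B report.
SAMPLE = """\
+-----------+----------+------------------+------+--------+---------+
| Dataset   | Metric   | Subset           | Num  | Score  | Cat.0   |
+-----------+----------+------------------+------+--------+---------+
| gsm8k     | mean_acc | main             | 1319 | 0.9158 | default |
| humaneval | pass@1   | openai_humaneval | 164  | 0.878  | default |
+-----------+----------+------------------+------+--------+---------+
"""

records = parse_report_table(SAMPLE)
# Index scores by (dataset, metric) so reports can be compared key by key.
scores = {(r["Dataset"], r["Metric"]): float(r["Score"]) for r in records}
print(scores)
```

Keying on `(Dataset, Metric)` keeps the four `ifeval` rows distinct, since they share a dataset name and differ only in metric.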