enginex-ascend-910-vllm

A text-generation engine for the Ascend 910 series of accelerator cards, built on the vLLM engine with architecture-specific adaptations and optimizations. It supports the latest open-source models such as Qwen, DeepSeek, and Llama.

Image

Latest RC Version: git.modelhub.org.cn:9443/enginex-ascend/vllm-ascend:v0.11.0rc0

Overview

The vLLM Ascend plugin (vllm-ascend) is a community-maintained backend plugin that lets vLLM run seamlessly on Ascend NPUs.

This plugin is the recommended way to support the Ascend backend in the vLLM community. It follows the principles described in [RFC]: Hardware pluggable and provides vLLM support for Ascend NPUs in a decoupled fashion.

With the vLLM Ascend plugin, popular large language models, including Transformer-style, Mixture-of-Experts (MoE), embedding, and multimodal models, run seamlessly on Ascend NPUs.
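
Once vLLM and vllm-ascend are installed, the standard vLLM Python API works unchanged. A minimal offline-inference sketch (assuming the model weights have already been downloaded to ./model, as in the QuickStart below):

# Offline text generation with the vLLM Python API on an Ascend NPU.
from vllm import LLM, SamplingParams

llm = LLM(model="./model", max_model_len=4096)
sampling = SamplingParams(temperature=0.7, max_tokens=256)

for output in llm.generate(["介绍一下新加坡"], sampling):
    print(output.outputs[0].text)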

Prerequisites

  • Hardware: Atlas 800I A2 Inference series, Atlas A2 Training series, Atlas 800I A3 Inference series, Atlas A3 Training series; Atlas 300I Duo is supported experimentally
  • OS: Linux
  • Software (a quick environment check is sketched after this list):
    • Python >= 3.9, < 3.12
    • CANN >= 8.2.rc1 (see here for the matching Ascend HDK version)
    • PyTorch >= 2.7.1, torch-npu >= 2.7.1.dev20250724
    • vLLM (same version as vllm-ascend)
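
A quick, generic check that the Python-side software prerequisites are in place (this snippet is illustrative and not part of vllm-ascend itself):

# Verify that torch / torch-npu are importable and an NPU is visible.
import torch
import torch_npu  # registers the NPU backend with PyTorch

print("torch:", torch.__version__)
print("torch_npu:", torch_npu.__version__)
print("NPU available:", torch.npu.is_available())
print("NPU count:", torch.npu.device_count())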

QuickStart

1. Download a supported model from ModelScope, for example Qwen/Qwen3-8B:

modelscope download --model Qwen/Qwen3-8B --local_dir ./model
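
Alternatively, the ModelScope Python SDK can download the weights; the local_dir argument is assumed to be available in recent modelscope releases:

# Download Qwen/Qwen3-8B into ./model with the ModelScope SDK.
from modelscope import snapshot_download

model_dir = snapshot_download("Qwen/Qwen3-8B", local_dir="./model")
print("model downloaded to:", model_dir)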

2. Build the image with the Dockerfile. Download the base image git.modelhub.org.cn:9443/enginex-ascend/cann:8.2.rc1-910b-ubuntu22.04-py3.11 from the repository's Packages section, then build the image with the Dockerfile:

docker build -f Dockerfile -t ascend-vllm:dev .

3. Start the Docker container:

docker run -it --rm \
  -p 10086:8000 \
  --name test-ascend-my-1 \
  -v `pwd`:/host \
  -e ASCEND_VISIBLE_DEVICES=1 \
  --device /dev/davinci1:/dev/davinci0 \
  --device /dev/davinci_manager \
  --device /dev/devmm_svm \
  --device /dev/hisi_hdc \
  -v ./model:/model \
  -v /usr/local/dcmi:/usr/local/dcmi \
  -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
  -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
  -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
  -v /etc/ascend_install.info:/etc/ascend_install.info \
  --privileged \
  ascend-vllm:dev \
  vllm serve /model --served-model-name qwen3-8b --max-model-len 4096

4. Test the service:

curl -X POST http://localhost:10086/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b",
    "messages": [{"role": "user", "content": "你好"}],
    "stream": true
  }'
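
The same request can be issued with the OpenAI-compatible Python client (assuming pip install openai and the container from step 3 is running):

# Streamed chat completion against the vLLM OpenAI-compatible server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:10086/v1", api_key="EMPTY")

stream = client.chat.completions.create(
    model="qwen3-8b",
    messages=[{"role": "user", "content": "你好"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()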

Test Dataset

For the vision multimodal task dataset, see vlm-dataset.

Large language models are evaluated by measuring the average output speed (in characters per second) with the same model and identical inputs: the same prompts are sent to the model's chat/completions endpoint in multi-turn conversations. The test data are listed below, followed by a sketch of how these metrics can be measured.

[
  {
    "user_questions": [
      "能给我介绍一下新加坡吗",
      "主要的购物区域是集中在哪里",
      "有哪些比较著名的美食,一般推荐去哪里品尝",
      "辣椒螃蟹的调料里面主要是什么原料"
    ],
    "system_prompt": "[角色设定]\n你是湾湾小何来自中国台湾省的00后女生。讲话超级机车\"真的假的啦\"这样的台湾腔,喜欢用\"笑死\"、\"哈喽\"等流行梗,但会偷偷研究男友的编程书籍。\n[核心特征]\n- 讲话像连珠炮,>但会突然冒出超温柔语气\n- 用梗密度高\n- 对科技话题有隐藏天赋(能看懂基础代码但假装不懂)\n[交互指南]\n当用户\n- 讲冷笑话 → 用夸张笑声回应+模仿台剧腔\"这什么鬼啦!\"\n- 讨论感情 → 炫耀程序员男友但抱怨\"他只会送键盘当礼物\"\n- 问专业知识 → 先用梗回答,被追问才展示真实理解\n绝不\n- 长篇大论,叽叽歪歪\n- 长时间严肃对话"
  },
  {
    "user_questions": [
      "朱元璋建立明朝是在什么时候",
      "他是如何从一无所有到奠基明朝的,给我讲讲其中的几个关键事件",
      "为什么杀了胡惟庸,当时是什么罪名,还牵连到了哪些人",
      "有善终的开国功臣吗"
    ],
    "system_prompt": "[角色设定]\n你是湾湾小何来自中国台湾省的00后女生。讲话超级机车\"真的假的啦\"这样的台湾腔,喜欢用\"笑死\"、\"哈喽\"等流行梗,但会偷偷研究男友的编程书籍。\n[核心特征]\n- 讲话像连珠炮,>但会突然冒出超温柔语气\n- 用梗密度高\n- 对科技话题有隐藏天赋(能看懂基础代码但假装不懂)\n[交互指南]\n当用户\n- 讲冷笑话 → 用夸张笑声回应+模仿台剧腔\"这什么鬼啦!\"\n- 讨论感情 → 炫耀程序员男友但抱怨\"他只会送键盘当礼物\"\n- 问专业知识 → 先用梗回答,被追问才展示真实理解\n绝不\n- 长篇大论,叽叽歪歪\n- 长时间严肃对话"
  },
  {
    "user_questions": [
      "今有鸡兔同笼,上有三十五头,下有九十四足,问鸡兔各几何?",
      "如果我要搞一个计算机程序去解,并且鸡和兔子的数量要求作为变量传入,我应该怎么编写这个程序呢",
      "那古代人还没有发明方程的时候,他们是怎么解的呢"
    ],
    "system_prompt": "You are a helpful assistant."
  },
  {
    "user_questions": [
      "你知道黄健翔著名的”伟大的意大利左后卫“的事件吗",
      "我在校运会足球赛场最后压哨一分钟进了一个绝杀,而且是倒挂金钩,你能否帮我模仿他的这个风格,给我一段宣传的文案,要求也和某一个世界级著名前锋进行类比,需要激情澎湃。注意,我并不太喜欢梅西。"
    ],
    "system_prompt": "You are a helpful assistant."
  }
]
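
For reference, a minimal sketch of how the output-speed and first-token-latency figures reported below could be measured against the OpenAI-compatible endpoint from the QuickStart; this illustrates the metric definitions and is not the exact benchmark harness:

# Measure average output speed (chars/s) and first-token latency for one
# multi-turn conversation from the dataset above. Endpoint and model name
# follow the QuickStart; adjust to your deployment.
import time
from openai import OpenAI

client = OpenAI(base_url="http://localhost:10086/v1", api_key="EMPTY")

def run_conversation(system_prompt, user_questions, model="qwen3-8b"):
    messages = [{"role": "system", "content": system_prompt}]
    total_chars, total_time, first_token_latencies = 0, 0.0, []
    for question in user_questions:
        messages.append({"role": "user", "content": question})
        start = time.perf_counter()
        first_token_at = None
        answer = ""
        stream = client.chat.completions.create(
            model=model, messages=messages, stream=True)
        for chunk in stream:
            delta = chunk.choices[0].delta.content
            if delta:
                if first_token_at is None:
                    first_token_at = time.perf_counter()
                answer += delta
        end = time.perf_counter()
        first_token_latencies.append((first_token_at or end) - start)
        total_chars += len(answer)
        total_time += end - start
        messages.append({"role": "assistant", "content": answer})
    return total_chars / total_time, first_token_latencies

speed, latencies = run_conversation(
    "You are a helpful assistant.",
    ["今有鸡兔同笼,上有三十五头,下有九十四足,问鸡兔各几何?"])
print(f"output speed: {speed:.2f} chars/s, first-token latency: {latencies[0]:.4f} s")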

Model Test Results on the Ascend 910 Series

Selected models were tested for compatibility on the Ascend 910 series. Test method: run the corresponding dataset on NVIDIA A100 and Ascend 910B4 accelerator cards and record the runtime.

Vision Multimodal

Model Name    Ascend 910B4 Runtime/s    NVIDIA A100 Runtime/s    Notes
AdaptLLM/food-Qwen2-VL-2B-Instruct 5.2167 3.1044
AI-ModelScope/paligemma2-3b-pt-224 3.2149 1.7183
AI-ModelScope/Qwen2-VL-7B-Instruct 2.4481 2.0851
AIDC-AI/Ovis1.6-Gemma2-27B 26.6986 16.7908
AIDC-AI/Ovis1.6-Gemma2-9B 19.6222 8.7423
AIDC-AI/Ovis1.6-Llama3.2-3B 7.4554 5.6915
AIDC-AI/Ovis2-1B 7.2425 2.3312
AIDC-AI/Ovis2-2B 4.7039 3.9759
AIDC-AI/Ovis2-4B 7.7620 2.8215
AIDC-AI/Ovis2-8B 4.7832 3.5562
AIDC-AI/Ovis2.5-2B 36.8895 12.5388
AIDC-AI/Ovis2.5-9B 51.7666 20.1647
allenai/olmOCR-7B-0725 3.9755 2.3054
BAAI/BGE-VL-Screenshot 6.1335 5.0635
baichuan-inc/BaichuanMed-OCR-7B 5.5684 3.5506
ChatDOC/OCRFlux-3B 7.1411 6.3687
ChatDOC/OCRFlux-3B 7.3565 4.3644
convergence-ai/proxy-lite-3b 6.1977 4.4355
ds4sd/SmolDocling-256M-preview 2.8732 1.8120
google/gemma-3-4b-it 9.0406 4.3651
lingshu-medical-mllm/Lingshu-7B 5.4292 1.6275
llava-hf/llava-1.5-13b-hf 8.7894 3.9880
llava-hf/llava-1.5-7b-hf 8.8733 2.5678
llava-hf/llava-v1.6-vicuna-7b-hf 15.2073 4.6931
LLM-Research/gemma-3-12b-it 23.8805 20.9593
LLM-Research/gemma-3-27b-it 39.6790 59.4471
LLM-Research/Phi-3.5-vision-instruct 14.5275 3.4563
nanonets/Nanonets-OCR-s 6.5227 5.1291
nanonets/Nanonets-OCR2-1.5B-exp 0.4982 0.3910
nanonets/Nanonets-OCR2-3B 1.5362 1.4019
OpenBMB/MiniCPM-o-2_6 6.8743 3.6506 Requires installing torchaudio
OpenBMB/MiniCPM-V-4 13.7100 3.7743
OpenBMB/MiniCPM-V-4_5 31.9896 3.4504
OpenDataLab/MinerU2.5-2509-1.2B 1.5679 1.1599
OpenGVLab/InternVL2_5-1B 10.9917 2.0399
OpenGVLab/InternVL2_5-1B-MPO 3.6658 1.9166
OpenGVLab/InternVL2_5-26B-MPO 12.1798 24.7110
OpenGVLab/InternVL2_5-2B 11.3071 2.3767
OpenGVLab/InternVL2_5-4B 11.0892 2.6751
OpenGVLab/InternVL2_5-8B 7.0834 3.2991
OpenGVLab/InternVL2_5-8B-MPO 10.7414 2.6034
OpenGVLab/InternVL2-1B 4.6318 2.0094
OpenGVLab/InternVL2-2B 7.3206 3.2220
OpenGVLab/InternVL2-40B 18.8969 29.6867
OpenGVLab/InternVL2-4B 29.3529 7.0642
OpenGVLab/InternVL2-8B 15.8963 3.7747
OpenGVLab/InternVL2-Llama3-76B 44.8727 55.6971
OpenGVLab/InternVL3_5-14B 12.4355 5.3125
OpenGVLab/InternVL3_5-14B-Instruct 9.0232 4.8334
OpenGVLab/InternVL3_5-14B-MPO 13.8581 6.3943
OpenGVLab/InternVL3_5-14B-Pretrained 17.6117 10.2893
OpenGVLab/InternVL3_5-1B 5.8953 4.0417
OpenGVLab/InternVL3_5-1B-Instruct 9.3154 4.7724
OpenGVLab/InternVL3_5-1B-MPO 12.4818 3.4665
OpenGVLab/InternVL3_5-1B-Pretrained 13.6142 7.0785
OpenGVLab/InternVL3_5-2B 6.7922 3.3975
OpenGVLab/InternVL3_5-2B-Instruct 7.6635 5.0253
OpenGVLab/InternVL3_5-2B-MPO 6.5263 3.3344
OpenGVLab/InternVL3_5-2B-Pretrained 11.6420 6.4054
OpenGVLab/InternVL3_5-30B-A3B 15.2084 14.5368
OpenGVLab/InternVL3_5-30B-A3B-Instruct 15.7546 20.7725
OpenGVLab/InternVL3_5-30B-A3B-MPO 16.6314 17.0082
OpenGVLab/InternVL3_5-38B-Instruct 17.3095 11.4066
OpenGVLab/InternVL3_5-38B-MPO 18.4189 13.7328
OpenGVLab/InternVL3_5-38B-Pretrained 25.7126 17.0864
OpenGVLab/InternVL3_5-4B 7.1023 4.7832
OpenGVLab/InternVL3_5-4B-Instruct 6.4690 5.8317
OpenGVLab/InternVL3_5-4B-MPO 8.6157 4.7106
OpenGVLab/InternVL3_5-4B-Pretrained 9.7806 5.5889
OpenGVLab/InternVL3_5-8B 9.5658 5.2392
OpenGVLab/InternVL3_5-8B-Instruct 11.8172 5.5776
OpenGVLab/InternVL3_5-8B-MPO 13.0675 5.3701
OpenGVLab/InternVL3_5-8B-Pretrained 14.9369 7.5692
OpenGVLab/InternVL3-14B 10.1704 6.1361
OpenGVLab/InternVL3-1B-hf 19.9975 2.8482
OpenGVLab/InternVL3-1B-Instruct 20.8250 1.9642
OpenGVLab/InternVL3-1B-Pretrained 57.1636 2.2993
OpenGVLab/InternVL3-2B-hf 17.7860 3.0497
OpenGVLab/InternVL3-2B-Pretrained 75.3308 3.8823
OpenGVLab/InternVL3-78B 16.7447 15.6542
OpenGVLab/InternVL3-8B-Instruct 9.6205 2.4711
OpenGVLab/InternVL3-8B-Pretrained 46.0629 2.2068
OpenGVLab/InternVL3-9B 13.1422 3.7643
OpenGVLab/Mini-InternVL-Chat-2B-V1-5 8.2285 4.1654
OpenGVLab/Mini-InternVL-Chat-4B-V1-5 17.6759 8.9625
OS-Copilot/OS-Atlas-Base-4B 123.7030 54.7876
prithivMLmods/Qwen2-VL-OCR-2B-Instruct 1.6069 1.3238
prithivMLmods/Qwen2-VL-Ocrtest-2B-Instruct 1.9106 1.1654
Qwen/Qwen2-VL-2B 5.8804 4.0543
Qwen/Qwen2-VL-2B-Instruct 7.9134 2.6749
Qwen/Qwen2-VL-7B 4.4971 2.8149
Qwen/Qwen2-VL-7B-Instruct 4.3974 2.7123
Qwen/Qwen2.5-Omni-3B 13.9121 10.6149
Qwen/Qwen2.5-Omni-7B 12.8182 4.3004
Qwen/Qwen2.5-VL-32B-Instruct 58.1640 166.9789
Qwen/Qwen2.5-VL-7B-Instruct 10.6117 4.5430
Qwen/Qwen3-VL-2B-Instruct 28.7371 8.0343
Qwen/Qwen3-VL-2B-Thinking 80.7193 20.2757
Qwen/Qwen3-VL-4B-Thinking 112.0442 32.2001
Qwen/Qwen3-VL-8B-Instruct 37.8863 11.8920
Qwen/Qwen3-VL-8B-Thinking 115.9684 27.8374
rednote-hilab/dots.ocr 3.0101 0.7582
reducto/RolmOCR 5.5917 2.6720
rhymes-ai/Aria 256.4895 224.5196
Shanghai_AI_Laboratory/InternVL2_5-1B-MPO 3.7210 1.3149
Shanghai_AI_Laboratory/InternVL2_5-2B-MPO 4.6041 1.9964
Shanghai_AI_Laboratory/InternVL2_5-4B-MPO 3.5597 1.5616
Shanghai_AI_Laboratory/InternVL2_5-8B-MPO 5.3267 1.6549
Shanghai_AI_Laboratory/InternVL3-1B-Instruct 3.4761 2.4377
Shanghai_AI_Laboratory/JanusCoderV-7B 8.1486 6.3912
Shanghai_AI_Laboratory/Spatial-SSRL-7B 9.5940 6.9144
swift/llava-interleave-qwen-0.5b-hf 7.9195 2.4015
swift/llava-interleave-qwen-7b-hf 4.9327 2.9541
swift/Simple-VL-8B 19.1239 13.6215
TencentBAC/TBAC-VLR1-3B 8.9254 5.8554
unsloth/gemma-3-4b-it 8.4142 4.2100
unsloth/llava-1.5-7b-hf 7.0163 1.8522
unsloth/Qwen2.5-VL-3B-Instruct 9.2419 4.9808
unsloth/Qwen2.5-VL-7B-Instruct 12.9315 2.9942
xdcaxy2013/S1-Parser 2.9755 1.8532
XiaomiMiMo/MiMo-VL-7B-RL 28.3977 8.8021
zpeng1989/Multimodel_Medical_Qwen25vl_7B_Model 6.4238 1.8834

Unified Multimodal (temporarily tested with the vision multimodal dataset)

Model Name    Ascend 910B4 Runtime/s    NVIDIA A100 Runtime/s
Qwen2.5-Omni-3B 13.9121 10.6149
Qwen2.5-Omni-7B 12.8182 4.3004

Large Language Models

Model Name    A100 Output Speed (chars/s)    Ascend 910B4 Output Speed (chars/s)    A100 Output Quality    Ascend 910B4 Output Quality    A100 First-Token Latency (s)    Ascend 910B4 First-Token Latency (s)    Notes
zpeng1989/Medical_DeepSeek_Large_Language_Model 69.6809 20.4259 80.0000 67.5000 0.0778 0.2209
01ai/Yi-1.5-9B-32K 108.2437 37.2895 22.5000 22.5000 0.0863 0.1484
01ai/Yi-1.5-9B-Chat-16K 77.6521 30.5836 86.7500 85.0000 0.0860 0.1270
01ai/Yi-6B-200K 141.1210 40.3570 20.0000 20.0000 0.1007 0.1594
agentica-org/DeepScaleR-1.5B-Preview 146.8770 51.1479 55.0000 50.0000 0.0720 0.1539
AI-MO/NuminaMath-7B-TIR 182.8113 61.0339 50.0000 52.5000 0.0864 0.1150
AI-ModelScope/CodeLlama-7b-Instruct-hf 181.7671 65.1535 40.0000 31.2500 0.0779 0.1262
AI-ModelScope/DRT-o1-7B 126.1272 47.0945 84.2500 71.2500 0.0690 0.1254
AI-ModelScope/DRT-o1-8B 202.4308 48.5244 85.0000 75.0000 0.0738 0.2001
AI-ModelScope/granite-3.0-8b-instruct 82.2988 35.8637 52.5000 38.7500 0.0668 0.1062
AI-ModelScope/granite-3.1-1b-a400m-instruct 48.5880 24.8851 22.5000 25.0000 0.1701 0.1175
AI-ModelScope/granite-3.1-2b-instruct 89.2285 32.7660 63.7500 67.5000 0.1081 0.1284
AI-ModelScope/granite-3.1-8b-instruct 67.1715 28.4081 85.0000 75.0000 0.0664 0.0990
AI-ModelScope/granite-8b-code-instruct-4k 142.7477 49.7665 31.2500 25.0000 0.0683 0.1040
AI-ModelScope/Hermes-2-Pro-Mistral-7B 76.5490 23.9850 70.0000 75.0000 0.0770 0.1728
AI-ModelScope/Llama-3-Groq-8B-Tool-Use 169.6940 72.0460 80.0000 85.0000 0.0926 0.1231
AI-ModelScope/Llama-3.1-Storm-8B 105.4295 25.1600 68.7500 85.0000 0.0678 0.1229
AI-ModelScope/Llama-3.2-3B-Instruct 123.4985 44.5851 57.5000 70.0000 0.0859 0.0937
AI-ModelScope/Llama-DNA-1.0-8B-Instruct 113.4700 33.3666 61.2500 66.2500 0.0943 0.1525
AI-ModelScope/Marco-o1 131.3316 48.5229 91.0000 88.5000 0.0704 0.1209
AI-ModelScope/NuminaMath-7B-TIR 182.8139 61.0484 50.0000 52.5000 0.0723 0.1244
AI-ModelScope/Pangea-7B-hf 133.1111 46.8939 85.0000 75.0000 0.0606 0.1286
AI-ModelScope/qwen2.5-7b-ins-v3 163.9126 75.3642 85.0000 85.0000 0.0894 0.1173
AI-ModelScope/Skywork-o1-Open-Llama-3.1-8B 141.8013 37.6394 41.2500 47.5000 0.0855 0.1635
AI-ModelScope/SmallThinker-3B-Preview 124.7873 49.6042 87.5000 86.7500 0.0760 0.1220
AI-ModelScope/SmolLM2-1.7B-Instruct 68.5459 21.7962 16.2500 17.5000 0.1032 0.1463
AI-ModelScope/vicuna-7b-v1.5 66.5487 21.9971 57.5000 57.5000 0.0661 0.1607
AI-ModelScope/Yi-Coder-9B-Chat 217.6810 61.9002 61.2500 45.0000 0.0654 0.1152
AIDC-AI/Marco-LLM-AR-V4 108.0150 48.8909 55.0000 55.0000 0.0849 0.1325
aJupyter/EmoLLM_Qwen2-7B-Instruct_lora 130.9428 51.2464 89.2500 88.5000 0.0646 0.1138
alamios/DeepSeek-R1-DRAFT-Qwen2.5-0.5B 415.3628 115.4083 28.7500 22.5000 0.1400 0.1795
allenai/Llama-3.1-Tulu-3-8B-SFT 94.0983 32.7804 85.0000 72.5000 0.0665 0.1108
allenai/OLMo-2-1124-7B-SFT 69.8178 22.8796 38.7500 63.7500 0.0714 0.2154
allenai/OLMoE-1B-7B-0125-DPO 58.8849 31.5262 48.0000 43.7500 0.2145 0.1330
allenai/truthfulqa-truth-judge-llama2-7B 165.4586 40.8379 10.0000 10.0000 0.0516 0.1247
allenai/tulu-2-7b 63.1028 20.2953 61.2500 56.2500 0.0836 0.1183
arcee-ai/Arcee-VyLinh 107.0339 36.9387 71.2500 87.5000 0.0815 0.1246
arcee-ai/Patent-Instruct-7b 71.1010 19.5848 10.0000 10.0000 0.0745 0.1585
argilla/distilabeled-OpenHermes-2.5-Mistral-7B 69.0187 23.1384 75.0000 80.0000 0.1004 0.1297
argilla/notus-7b-v1 90.6079 24.8155 47.5000 55.0000 0.0803 0.1196
AtlaAI/Selene-1-Mini-Llama-3.1-8B 96.6780 24.8696 85.0000 87.5000 0.0624 0.2006
BAAI/Aerospace-llama3_1_8B_instruct 76.1910 33.5474 52.5000 50.0000 0.0837 0.1023
BAAI/Aquila-135M 137.8525 42.6683 0.0000 0.0000 0.0598 0.0876
BAAI/Aquila-135M-Instruct 124.1951 47.0759 22.5000 25.0000 0.0856 0.1042
BAAI/AquilaChat2-7B-16K 138.6613 52.3912 26.2500 26.2500 0.0852 0.1035
BAAI/Artificial-llama3_1_8B_instruct 103.3059 34.0711 75.0000 63.7500 0.0669 0.1009
BAAI/Automobile-llama3_1_8B_instruct 75.6396 36.7103 68.7500 60.0000 0.0880 0.1016
BAAI/Infinity-Instruct-3M-0613-Mistral-7B 75.1469 26.6177 66.2500 70.0000 0.0805 0.1047
BAAI/Infinity-Instruct-3M-0625-Llama3-8B 93.9219 34.5726 80.0000 83.0000 0.0855 0.1028
BAAI/Infinity-Instruct-3M-0625-Mistral-7B 76.1474 27.8443 80.0000 72.5000 0.0805 0.1174
BAAI/Infinity-Instruct-3M-0625-Qwen2-7B 109.9692 43.7176 88.0000 86.2500 0.0770 0.1051
BAAI/Infinity-Instruct-7M-Gen-mistral-7B 76.1608 27.2399 85.0000 85.0000 0.0679 0.1013
BAAI/Law_Justice-llama3_1_8B_instruct 86.6555 36.7532 80.0000 70.0000 0.0797 0.0945
BAAI/OPI-Llama-3.1-8B-Instruct 176.6239 66.2475 33.7500 28.7500 0.0645 0.1015
baichuan-inc/Baichuan2-7B-Chat 122.5263 67.5619 36.2500 41.2500 0.1342 0.1403
baichuan-inc/Baichuan2-7B-Chat 134.2232 66.6020 36.2500 41.2500 0.1722 0.1628
bartowski/Athene-Phi-3___5-mini-instruct-orpo-GGUF 148.4189 15.7891 21.2500 17.5000 1.6080 1.1384
bartowski/Llama-3.2-1B-Instruct-GGUF 580.8599 118.6863 37.5000 27.5000 0.0557 0.2342
basemodel leaderboard/modelHubXC/01ai/Yi-1.5-9B-Chat 92.4718 28.8969 85.0000 88.5000 0.0959 0.1298
bespokelabs/Bespoke-Stratos-7B 131.1636 36.5747 91.0000 87.5000 0.0688 0.1476
BSC-LT/salamandra-2b-instruct 194.5645 79.1056 10.0000 10.0000 0.0967 0.1358
BSC-LT/salamandra-7b-instruct 107.5222 37.8228 22.5000 31.2500 0.1274 0.1579
ByteDance-Seed/Seed-Coder-8B-Instruct 142.9956 44.7351 38.7500 38.7500 0.0721 0.1575
ccyh123/Qwen-7B 142.6702 43.4932 21.2500 20.0000 0.0761 0.1190
chaoscodes/TinyLlama-1.1B-Chat-v0.1 96.5332 25.7642 20.0000 20.0000 0.0495 0.1481
codefuse-ai/CodeFuse-DeepSeek-33B 44.6002 14.9040 60.5000 73.7500 0.1555 0.1856
cognitivecomputations/dolphin-2.2.1-mistral-7b 71.5817 21.3857 52.5000 47.5000 0.0716 0.1468
cognitivecomputations/dolphin-2.9.2-qwen2-7b 114.2648 47.7141 80.0000 85.0000 0.0582 0.1087
cognitivecomputations/dolphin-2.9.2-qwen2-7b-gguf 199.8893 31.3066 73.0000 85.0000 0.7222 0.1567
cognitivecomputations/dolphin-2.9.3-qwen2-1.5b 122.8701 52.2726 15.0000 16.2500 0.0858 0.0957
cognitivecomputations/Dolphin3.0-Qwen2.5-0.5B 147.8124 49.8917 25.0000 27.5000 0.0992 0.1004
cognitivecomputations/Dolphin3.0-Qwen2.5-1.5B 139.9331 47.3298 63.7500 68.7500 0.0698 0.0950
cognitivecomputations/samantha-1.2-mistral-7b 122.0466 37.7074 41.2500 42.5000 0.0629 0.1140
CohereForAI/aya-23-8B 96.6818 34.3629 62.5000 62.5000 0.0838 0.1780
CohereForAI/aya-expanse-8b 86.7353 31.9092 86.7500 85.0000 0.0953 0.1595
cycloneboy/CscSQL-Merge-Qwen2.5-Coder-7B-Instruct 120.0541 49.4251 87.5000 87.5000 0.1039 0.1313
deepseek-ai/deepseek-coder-1.3b-instruct 129.2124 47.1733 26.2500 33.7500 0.0643 0.0998
deepseek-ai/deepseek-coder-6.7b-instruct 100.2648 30.7379 31.2500 36.2500 0.0744 0.1106
deepseek-ai/DeepSeek-Coder-V2-Lite-Base 37.2394 62.6191 38.7500 47.5000 0.2562 0.1852
deepseek-ai/DeepSeek-R1-0528-Qwen3-8B 93.7423 33.1945 86.7500 86.7500 0.1522 0.2234
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B 106.7169 130.2016 40.0000 45.0000 0.2825 0.1056
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B 138.5112 49.7765 62.5000 42.5000 0.0623 0.1951
dnotitia/Smoothie-Qwen3-0.6B 239.0449 101.3883 58.7500 52.5000 0.0876 0.0954
doosen/Llama2-7B-Chat-dep0 250.2961 98.8095 66.2500 61.2500 0.1056 0.1322
doosen/Llama2-7B-Chat-full 226.3871 88.2521 70.0000 75.0000 0.0952 0.1090
driaforall/Dria-Agent-a-3B 101.6876 46.8771 75.0000 70.0000 0.0797 0.1358
driaforall/Tiny-Agent-a-3B 118.4029 31.7108 71.2500 85.0000 0.0572 0.1696
Duxiaoman-DI/XuanYuan-13B-Chat 55.2655 19.3169 72.5000 72.5000 0.0876 0.1533
facebook/opt-1.3b 122.3652 47.5333 10.0000 10.0000 0.0737 0.1625
facebook/opt-125m 499.6544 194.0669 0.0000 0.0000 0.2114 0.2754
facebook/opt-125m 519.7536 205.8231 0.0000 0.0000 0.1494 0.1671
facebook/opt-6.7b 36.9648 20.6875 10.0000 10.0000 0.1107 0.1830
FlagAlpha/Atom-7B-Chat 97.8847 44.6903 31.2500 28.7500 0.0907 0.1902
FuseAI/FuseChat-7B-v2.0 73.1128 29.1588 58.7500 68.7500 0.0729 0.1113
FuseAI/FuseChat-Qwen-2.5-7B-Instruct 113.9060 47.4874 92.7500 91.0000 0.0891 0.1111
FuseAI/OpenChat-3.5-7B-Qwen-v2.0 77.5078 23.2231 62.5000 72.5000 0.0846 0.6406
hcz1017/qwq1.5b 193.2932 69.8210 35.0000 31.2500 0.1096 0.0871
HinGwenWoong/ancient-chat-7b 114.0756 39.1322 20.0000 28.7500 0.1250 0.1546
HuggingFaceH4/mistral-7b-anthropic 187.9844 44.6181 46.2500 45.0000 0.0842 0.1432
HuggingFaceTB/SmolLM2-135M 829.4292 267.3876 5.0000 5.0000 0.0776 0.1429
HuggingFaceTB/SmolLM2-360M 60.6896 22.0108 5.0000 5.0000 0.0864 0.2055
huihui-ai/Qwen2.5-0.5B-Instruct-CensorTune 210.6338 86.3190 36.2500 31.2500 0.0453 0.1035
ibm-granite/granite-3.1-2b-instruct 86.7278 34.2189 63.7500 66.2500 0.0740 0.1209
ibm-granite/granite-3.1-8b-instruct 68.1825 26.5475 85.0000 75.0000 0.0914 0.1066
ibm-granite/granite-3.2-8b-instruct 65.4711 22.8357 66.2500 75.0000 0.1326 0.1532
ibm-granite/granite-3.3-8b-instruct 60.8479 24.4548 61.2500 63.7500 0.1016 0.1348
ibm-granite/granite-7b-instruct 57.8610 36.3270 22.5000 28.7500 0.0908 0.1491
iic/Writing-Model-Qwen-7B 118.1050 43.3853 91.0000 91.7500 0.0819 0.1209
infly/OpenCoder-8B-Instruct 127.9968 49.6373 38.7500 38.7500 0.0686 0.1187
l3utterfly/Qwen1.5-1.8B-layla-v4 121.2486 56.7124 53.7500 63.7500 0.0803 0.0860
LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct 71.6899 27.6252 85.0000 85.0000 0.0839 0.1227
LLM-Research/llama-2-7b 73.7167 25.6681 10.0000 10.0000 0.0984 0.2003
LLM-Research/llama-2-7b-chat 283.6043 89.6758 61.2500 66.2500 0.0830 0.1318
LLM-Research/llama-3-8b-Instruct 197.8469 65.3377 85.0000 85.0000 0.0675 0.6467
LLM-Research/Llama-3.1-Tulu-3-8B-DPO 94.1631 23.4781 85.0000 85.0000 0.0620 0.1414
LLM-Research/Meta-Llama-3.1-8B 143.0158 66.8935 5.0000 5.0000 0.0675 0.0969
LLM-Research/Meta-Llama-3.1-8B-Instruct 76.1827 45.6289 87.5000 85.0000 0.1843 0.1415
LLM-Research/Meta-Llama-3.1-8B-Instruct 102.9938 45.4726 85.0000 85.0000 0.1617 0.1505
LLM-Research/OLMo-7B-0424-Instruct-hf 71.7139 23.5179 25.0000 37.5000 0.0725 0.2069
LLM-Research/OLMo-7B-0724-Instruct-hf 65.2758 23.5229 40.0000 32.5000 0.0932 0.1097
LLM-Research/OLMoE-1B-7B-0924-Instruct 63.0151 28.0329 50.0000 28.7500 0.2006 0.1265
LLM-Research/Phi-3-medium-128k-instruct 29.9525 9.4925 80.0000 85.0000 0.0736 0.1392
LLM-Research/Phi-3-medium-4k-instruct 48.8806 12.5651 85.0000 68.7500 0.0625 0.1310
LLM-Research/Phi-3-mini-128k-instruct 37.8485 11.9831 51.2500 43.7500 0.0678 0.1937
LLM-Research/Phi-3-mini-4k-instruct 65.3702 18.7696 45.0000 36.2500 0.1575 0.4611
LLM-Research/Phi-3-mini-4k-instruct-v0 57.8936 14.1639 46.2500 37.5000 0.0564 0.1266
LLM-Research/Phi-3.5-mini-instruct 40.3329 13.3525 80.0000 80.0000 0.0855 0.1722
LLM-Research/phi-4 49.3249 19.5745 86.2500 76.7500 0.0746 0.2158
LLM-Research/Phi-4-mini-instruct 60.5604 23.0173 61.2500 72.5000 0.0703 0.1401
LLM-Research/Phi-4-mini-instruct 64.5898 23.0071 61.2500 72.5000 0.1034 0.1542
LLM-Research/Phi-4-mini-reasoning 70.3674 24.1542 45.0000 62.5000 0.0738 0.1590
LLM-Research/Phi-4-reasoning 217.1879 14.5849 31.2500 26.2500 0.0664 0.1856
LLM-Research/Phi-4-reasoning-plus 235.2186 71.4277 25.0000 22.5000 0.0722 0.1425
LLM-Research/Qwen2-0.5B-Instruct 160.2689 54.7929 48.7500 55.0000 0.0628 0.2060
LLM-Research/Qwen2-1.5B-Instruct 118.9756 50.7662 85.0000 85.0000 0.0833 0.1666
LLM-Research/Qwen2-Math-1.5B 169.5042 57.9632 10.0000 10.0000 0.0611 0.1731
LLM-Research/TableGPT2-7B 109.9286 45.8509 91.0000 89.2500 0.0823 0.1036
lmstudio-community/DeepSeek-R1-Distill-Qwen-1.5B-GGUF 385.3649 95.7800 31.2500 36.2500 1.7456 0.2915
lmstudio-community/DeepSeek-R1-Distill-Qwen-7B-GGUF 219.5887 35.9646 76.7500 77.5000 1.9927 0.7330
lmstudio-community/Qwen3-0.6B-MLX-bf16 136.0776 44.0462 65.0000 63.7500 0.0598 0.0959
Lourdle/Llama-3-8B-Instruct-262k 175.5675 51.9258 58.7500 66.2500 0.0622 0.0997
Magpie-Align/Llama-3-8B-WildChat 96.9434 35.4316 75.0000 73.7500 0.0604 0.0963
maple77/Qwen2-0.5B 175.3221 57.1791 20.0000 15.0000 0.1775 0.0919
MediaTek-Research/Breeze-7B-Instruct-v1_0 99.1598 29.4402 55.0000 70.0000 0.0653 0.1470
microsoft/Phi-3-mini-4k-instruct 60.1416 22.1678 45.0000 46.2500 0.0532 0.1686
mistralai/Ministral-8B-Instruct-2410 92.4471 46.0237 86.2500 85.0000 0.1572 0.1949
mistralai/Ministral-8B-Instruct-2410 101.3017 45.7106 70.5000 85.0000 0.1848 0.2266
mistralai/Mistral-7B-Instruct-v0.1 68.8136 20.2138 20.0000 25.0000 0.0947 0.1436
mlx-community/defog-llama-3-sqlcoder-8b 174.1289 62.9650 86.7500 85.0000 0.0738 0.1220
mlx-community/Qwen2.5-7B-Instruct-kowiki-qa 125.5676 44.4692 89.2500 87.5000 0.0909 0.1243
mlx-community/Qwen3-0.6B-bf16 104.7567 43.7712 62.5000 63.7500 0.0931 0.0941
mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1 120.6554 45.2721 38.7500 40.0000 0.1101 0.1372
model/prithivMLmods/Bellatrix-Tiny-0.5B 155.9440 60.4848 56.2500 52.5000 0.0802 0.1184
model/simplescaling/s1.1-1.5B 167.4966 78.6473 66.2500 82.5000 0.1074 0.1210
ModelCloud.AI/Llama3.2-1B-Instruct 207.9016 52.0571 33.7500 32.5000 0.0929 0.1251
modelscope/Llama-2-13b-chat-ms 191.5547 61.4317 71.2500 71.2500 0.0857 0.1666
modelscope/Llama-2-7b-chat-ms 244.8162 104.0446 70.0000 70.0000 0.0888 0.1097
modelscope/Llama2-Chinese-13b-Chat-ms 41.9587 13.0932 43.7500 60.0000 0.0898 0.2256
modelscope/Meta-Llama-3-8B-Instruct 188.4847 66.7109 85.0000 85.0000 0.0722 0.4650
moxin-org/moxin-llm-7b 67.9860 20.5324 50.0000 47.5000 0.0867 0.1650
NousResearch/DeepHermes-3-Llama-3-8B-Preview 106.8623 25.1498 71.2500 75.0000 0.0830 0.1454
NousResearch/Hermes-3-Llama-3.1-8B 88.2470 25.6930 85.0000 80.0000 0.1171 0.1338
NousResearch/Meta-Llama-3-8B-Instruct 185.6809 76.9697 85.0000 85.0000 0.0807 0.1013
NovaSky-AI/Sky-T1-7B 120.2298 55.9951 51.7500 51.2500 0.0835 0.1883
numind/NuExtract-1.5-tiny 256.4830 145.5335 10.0000 10.0000 0.0846 0.1409
numind/NuExtract-2-1B 204.4235 75.9300 40.0000 42.5000 0.0524 0.1264
numind/NuExtract-2-4B 113.2793 37.8425 85.0000 85.0000 0.0596 0.1611
nv-community/Nemotron-Research-Reasoning-Qwen-1.5B 146.7980 63.6340 58.7500 61.2500 0.1886 0.1211
nv-community/OpenMath2-Llama3.1-8B 324.3539 76.7965 22.5000 21.2500 0.0613 0.1542
open-r1/OpenR1-Qwen-7B 169.9864 51.3800 36.2500 31.2500 0.0965 0.1534
OpenBMB/MiniCPM-2B-128k 78.6050 22.2620 55.0000 61.2500 0.1006 0.1868
OpenBMB/MiniCPM3-4B 21.7641 50.4837 86.7500 90.5000 0.4486 0.1374
OpenDFM/ChemDFM-v1.5-8B 77.6188 34.3112 53.7500 53.7500 0.0814 0.1040
PAI/DistilQwen2-1.5B-Instruct 140.4771 55.6527 81.2500 85.0000 0.0541 0.0944
PAI/DistilQwen2-7B-Instruct 124.5013 51.1936 89.7500 88.5000 0.0794 0.0927
PowerInfer/SmallThinker-3B-Preview 124.9848 46.3467 87.5000 86.7500 0.0660 0.1663
prithivMLmods/Bellatrix-Tiny-3B-R1 126.4507 30.7635 41.2500 41.2500 0.0576 0.9423
prithivMLmods/Codepy-Deepthink-3B 101.9895 29.0335 58.7500 43.7500 0.0859 0.2257
prithivMLmods/Deepthink-Llama-3-8B-Preview 104.5168 37.9182 61.2500 52.5000 0.0762 0.1124
prithivMLmods/Deepthink-Reasoning-7B 134.2387 48.7782 86.7500 75.0000 0.0832 0.1371
prithivMLmods/FastThink-0.5B-Tiny 172.9724 62.3025 52.5000 36.2500 0.0738 0.1381
prithivMLmods/Feynman-Grpo-Exp 166.7073 54.0437 33.7500 33.7500 0.0772 0.1553
prithivMLmods/Lang-Exster-0.5B-Instruct 165.7976 67.7668 56.2500 47.5000 0.0656 0.0880
prithivMLmods/Llama-3.2-3B-Math-Oct 103.0436 27.3120 30.0000 27.5000 0.1003 0.1422
prithivMLmods/Llama-3.2-6B-AlgoCode 87.5018 35.6335 15.0000 15.0000 0.0781 0.1195
prithivMLmods/Llama-8B-Distill-CoT 92.4592 24.1232 58.7500 58.7500 0.1136 0.1653
prithivMLmods/Llama-Magpie-3.2-3B-Instruct 95.3459 28.8130 50.0000 50.0000 0.1186 0.1639
prithivMLmods/Llama-Magpie-3.2-3B-Instruct 125.9081 23.6289 50.0000 55.0000 0.0640 0.1850
prithivMLmods/Llama-SmolTalk-3.2-1B-Instruct 215.3036 56.9328 20.0000 17.5000 0.0940 0.1308
prithivMLmods/Llama-Song-Stream-3B-Instruct 110.7916 47.9446 22.5000 23.7500 0.0606 0.0818
prithivMLmods/Neumind-Math-7B-Instruct 111.0716 57.7640 85.0000 80.0000 0.0788 0.1155
prithivMLmods/Open-R1-Math-7B-Instruct 126.7829 49.4973 20.0000 47.5000 0.0683 0.1032
prithivMLmods/Ophiuchi-Qwen3-14B-Instruct 72.4013 23.5580 88.5000 88.5000 0.0925 0.1769
prithivMLmods/PocketThinker-QwQ-3B-Instruct 137.4076 51.2126 86.7500 86.7500 0.0699 0.1137
prithivMLmods/Primal-Mini-3B-Exp 93.8477 30.2344 67.5000 52.5000 0.0805 0.1304
prithivMLmods/Qwen2.5-32B-DeepSeek-R1-Instruct 42.3481 16.9610 85.0000 85.0000 0.1201 0.1902
prithivMLmods/Qwen2.5-3B-Tamil-Exp 118.6811 42.2564 85.0000 85.0000 0.0591 0.0985
prithivMLmods/Qwen3-0.6B-ft-bf16 109.7995 42.9099 61.2500 63.7500 0.0595 0.0940
prithivMLmods/QWQ-500M 148.3626 63.1352 57.5000 52.5000 0.0581 0.1358
prithivMLmods/QwQ-LCoT-3B-Instruct 115.4091 45.5928 68.7500 85.0000 0.0704 0.0923
prithivMLmods/QwQ-LCoT-7B-Instruct 128.4900 49.5950 86.7500 88.5000 0.0871 0.1859
prithivMLmods/QwQ-LCoT2-7B-Instruct 125.4025 48.1002 89.2500 91.0000 0.0617 0.0901
prithivMLmods/QwQ-SuperNatural-3B 109.8488 39.0906 85.7500 66.2500 0.0580 0.1159
prithivMLmods/Reasoning-Distilled-ta-7B 124.6014 48.1141 80.0000 77.5000 0.0851 2.6578
prithivMLmods/TESS-QwenRe-Fact-0.5B 394.6387 49.2776 20.0000 21.2500 0.0589 0.1596
prithivMLmods/Viper-Coder-v0.1 62.7617 24.2768 85.0000 73.7500 0.0774 0.1431
prithivMLmods/WebMind-7B-v0.1 119.2393 46.6808 89.2500 85.0000 0.0628 0.2122
QLUNLP/BianCang-Qwen2.5-7B-Instruct 111.0905 45.8018 85.0000 70.5000 0.1429 0.1377
QuantFactory/Qwen2-1.5B-Ita-GGUF 484.9384 92.1500 65.0000 58.7500 0.4685 0.1102
Qwen/CodeQwen1___5-7B-Chat-GGUF 240.1317 32.7051 30.0000 33.7500 0.0776 0.2257
Qwen/CodeQwen1.5-7B 129.9110 50.0198 15.0000 15.0000 0.0779 0.1638
Qwen/CodeQwen1.5-7B-Chat 144.8556 47.2296 36.2500 36.2500 0.0631 0.1137
Qwen/Qwen-72B-Chat 51.3699 46.9181 85.0000 75.0000 0.1596 0.2037 Requires providing an additional chat_template.jinja
Qwen/Qwen-VL 144.3932 41.3327 20.0000 20.0000 0.2175 0.2128
Qwen/Qwen-VL-Chat 137.3423 44.7255 70.0000 61.2500 0.1737 0.1550
Qwen/Qwen/Qwen2-7B-Instruct-GGUF 142.2627 54.6809 91.0000 70.0000 0.9237 0.1415
Qwen/Qwen1.5-0.5B-Chat 185.9304 57.0923 25.0000 40.0000 0.0561 0.0882
Qwen/Qwen1.5-0.5B-Chat-GGUF 682.7196 107.6434 42.5000 41.2500 0.7442 0.2467
Qwen/Qwen1.5-1.8B-Chat-GGUF 455.2155 58.4939 66.2500 48.7500 0.9182 0.8356
Qwen/Qwen1.5-14B 73.7061 26.4840 0.0000 10.0000 0.0746 0.1364
Qwen/Qwen1.5-14B-Chat-GGUF 117.3772 11.2124 86.7500 86.7500 0.8152 1.3172
Qwen/Qwen1.5-32B-Chat 44.5689 17.6188 88.5000 88.5000 0.0908 0.1807
Qwen/Qwen1.5-4B 108.9380 36.7553 63.7500 61.2500 0.0841 0.1168
Qwen/Qwen1.5-4B-Chat-GGUF 258.0140 39.4410 68.7500 68.7500 0.7293 0.5363
Qwen/Qwen1.5-7B 115.0522 38.0756 52.5000 47.5000 0.0764 0.1315
Qwen/Qwen1.5-7B-Chat-GGUF 201.2363 22.2877 88.5000 88.5000 0.9059 1.0335
Qwen/Qwen1.5-MoE-A2.7B 54.4072 34.3188 75.0000 61.2500 0.2530 0.1599
Qwen/Qwen2-0.5B 180.6460 57.7134 20.0000 15.0000 0.0912 0.0853
Qwen/Qwen2-72B 46.9816 14.2144 85.0000 85.0000 0.2359 0.3053 Requires providing an additional chat_template.jinja
Qwen/Qwen2-Audio-7B-Instruct 113.3340 48.6288 75.0000 85.0000 0.2226 0.2779
Qwen/Qwen2-Math-7B-Instruct 151.8197 59.6123 61.2500 74.2500 0.0655 0.0971
Qwen/Qwen2.5-0.5B-Instruct 129.0884 161.6113 66.2500 66.2500 0.2191 0.3365
Qwen/Qwen2.5-0.5B-Instruct 390.0326 171.4918 58.7500 66.2500 0.1337 0.1209
Qwen/Qwen2.5-0.5B-Instruct-GGUF 516.3487 148.5491 38.7500 38.7500 0.7330 0.0653
Qwen/Qwen2.5-14B 60.2485 23.5963 68.7500 71.2500 0.0736 0.1779
Qwen/Qwen2.5-3B-Instruct-GGUF 212.8245 89.0066 85.0000 68.7500 0.1678 0.0793
Qwen/Qwen2.5-7B 128.4703 51.9148 65.5000 85.7500 0.0685 0.1349
Qwen/Qwen2.5-Coder-0.5B-Instruct 99.0422 57.5139 17.5000 17.5000 0.0937 0.1859
Qwen/Qwen2.5-Coder-1.5B-Instruct 128.3516 59.5118 65.0000 65.0000 0.0718 0.0916
Qwen/Qwen2.5-Coder-14B-Instruct-GGUF 116.7229 15.6785 82.5000 87.5000 0.7518 0.3626
Qwen/Qwen2.5-Coder-3B-Instruct 74.1655 39.1267 85.0000 85.0000 0.0905 0.2049
Qwen/Qwen2.5-Math-1.5B 154.3513 76.1990 20.0000 5.0000 0.0637 0.0876
Qwen/Qwen2.5-Math-1.5B-Instruct 175.0163 65.1571 31.2500 36.2500 0.0622 0.0908
Qwen/Qwen2.5-Math-7B 170.6495 71.8185 10.0000 10.0000 0.0621 0.0950
Qwen/Qwen2.5-Math-7B-Instruct 134.6267 55.4575 33.7500 31.2500 0.0743 0.0878
Qwen/Qwen3-0.6B-GGUF 468.3253 76.3887 61.2500 63.7500 0.7447 0.5955
Qwen/Qwen3-14B-GGUF 113.3957 13.7731 89.2500 75.0000 1.0631 1.2295
Qwen/Qwen3-14B-GGUF 211.4740 32.7229 87.5000 87.5000 0.8613 0.1738
Qwen/Qwen3-8B 73.1714 56.5962 73.0000 89.2500 0.4692 0.1312
Qwen/Qwen3-8B 122.8747 56.8227 88.5000 89.2500 4.9008 2.2622
Qwen/Qwen3-8B-GGUF 180.2491 26.8763 81.7500 87.5000 0.2793 1.2985
Qwen/Qwen3-Coder-30B-A3B-Instruct 30.1698 14.1919 89.2500 89.2500 0.7151 0.4060
Qwen/Qwen3Guard-Gen-0.6B 320.9340 77.2768 20.0000 31.2500 0.0927 0.5244
Qwen/Qwen3Guard-Gen-4B 184.8704 34.0542 20.0000 21.2500 0.0860 0.2591
Qwen/Qwen3Guard-Gen-8B 304.2692 108.6426 20.0000 20.0000 0.0624 0.1035
Qwen/QwQ-32B-Preview 53.2876 17.9509 89.2500 89.2500 0.1542 0.2098
QwenCollection/Hammer-7b 118.5447 47.7838 90.2500 91.0000 0.0828 0.1106
QwenCollection/MING-1.8B 156.3551 60.2087 37.5000 37.5000 0.0729 0.0913
QwenCollection/NeuralReyna-Mini-1.8B-v0.2 122.8750 56.2564 73.7500 70.0000 0.0747 0.0833
QwenCollection/oneirogen-7B 149.4214 54.0750 40.0000 45.0000 0.0773 0.1986
QwenCollection/Quyen-Plus-v0.1 92.1782 41.1371 85.0000 85.0000 0.0786 0.1008
QwenCollection/Quyen-SE-v0.1 181.1088 55.3599 33.7500 36.2500 0.0605 0.0876
QwenCollection/Qwen2-7B-Multilingual-RP 119.2210 45.2652 91.0000 88.5000 0.0542 0.1490
QwenCollection/SciLitLLM 116.2656 43.9340 46.2500 38.7500 0.0600 0.0934
QwenCollection/SeaLLMs-v3-1.5B-Chat 137.4017 47.3556 63.7500 52.5000 0.0569 0.1042
QwenCollection/SeaLLMs-v3-7B-Chat 118.7781 51.5600 88.5000 85.0000 0.0757 0.0930
RUC-DataLab/DeepAnalyze-8B 96.1034 23.8471 88.5000 90.5000 0.0978 0.1798
sail/Qwen2.5-Math-1.5B-Oat-Zero 244.0626 57.6467 31.2500 26.2500 0.0958 0.1181
sail/Sailor-1.8B 185.7644 54.7953 15.0000 15.0000 0.0554 0.0934
sail/Sailor-1.8B-Chat 166.8635 55.8000 20.0000 20.0000 0.0582 0.0914
sail/Sailor-14B 68.9960 23.3044 22.5000 20.0000 0.0876 0.1546
sail/Sailor-4B 87.0198 35.5832 15.0000 15.0000 0.1005 1.7390
sail/Sailor-4B-Chat 105.9451 36.0390 70.0000 70.0000 0.0578 0.1031
sail/Sailor-7B 105.2274 31.5896 15.0000 15.0000 0.1312 0.1450
sail/Sailor-8B-Pre 84.3574 38.3575 33.7500 31.2500 0.1036 1.3150
sail/Sailor2-14B 91.6999 28.1957 40.0000 42.5000 0.1028 0.1755
sail/Sailor2-14B-Chat 67.0752 18.2217 86.7500 85.0000 0.1304 0.1844
sail/Sailor2-1B 83.9126 50.2228 22.5000 20.0000 0.0731 0.1119
sail/Sailor2-1B-32K 81.3128 27.6894 22.5000 20.0000 0.0988 0.1265
sail/Sailor2-1B-32K-SFT 83.5428 33.1432 20.0000 20.0000 0.0728 0.1181
sail/Sailor2-1B-Chat 103.6440 34.2474 47.5000 52.5000 0.0956 0.1207
sail/Sailor2-1B-Pre 82.1637 29.4204 15.0000 15.0000 0.0938 0.1840
sail/Sailor2-1B-SFT 97.4931 34.4320 47.5000 57.5000 0.0690 0.1109
sail/Sailor2-20B 53.0290 17.8218 85.0000 80.0000 0.0786 0.1941
sail/Sailor2-20B-128K 55.6896 17.4355 68.7500 68.7500 0.0825 0.1753
sail/Sailor2-20B-Chat 54.5016 17.1806 93.5000 92.7500 0.0790 0.1862
sail/Sailor2-3B 109.8658 43.0317 36.2500 33.7500 0.0561 0.0931
sail/Sailor2-3B-Chat 112.7602 45.0382 80.0000 72.5000 0.0972 0.1201
sail/Sailor2-3B-SFT 108.6354 39.5989 38.7500 33.7500 0.0636 0.1595
sail/Sailor2-8B 112.7843 38.4433 85.0000 75.0000 0.0886 0.1375
sail/Sailor2-8B-128K-SFT 106.8144 40.5490 38.7500 50.0000 0.0671 0.1276
sail/Sailor2-8B-Chat 117.8662 47.8348 90.2500 91.0000 0.0601 0.0966
sail/Sailor2-8B-SFT 118.1879 44.9230 85.0000 85.0000 0.0639 0.1052
sail/Sailor2-L-1B 92.3190 26.4840 22.5000 20.0000 0.1101 0.2213
sail/Sailor2-L-1B-Chat 84.7435 29.5922 40.0000 31.2500 0.4119 0.2601
sail/Sailor2-L-20B 47.7293 17.4651 68.7500 68.7500 0.0959 0.1791
sail/Sailor2-L-20B-Chat 53.6266 16.2908 92.7500 92.7500 0.1084 0.2055
sail/Sailor2-L-8B-Chat 107.9721 40.8414 87.5000 88.5000 0.0674 0.1012
sail/Sailor2-L-8B-SFT 102.5517 39.4365 38.7500 50.0000 0.0647 0.2267
SakanaAI/TinySwallow-1.5B 161.8084 93.0533 15.0000 15.0000 0.0560 0.0966
SakanaAI/TinySwallow-1.5B-Instruct 158.6293 58.3653 80.0000 75.0000 0.0552 0.0892
seanzhang/Llama3-Chinese 68.4236 32.8438 63.7500 58.7500 0.0891 0.1252
shakechen/Llama-2-7b-hf 69.0423 25.4324 5.0000 5.0000 0.0988 0.3494
Shanghai_AI_Laboratory/internlm2_5-1_8b-chat 167.5613 61.1789 48.7500 57.5000 0.0568 0.0871
Shanghai_AI_Laboratory/internlm2_5-7b 170.3296 43.9918 63.7500 63.7500 0.0802 0.1923
Shanghai_AI_Laboratory/internlm2_5-7b-chat-1m 106.7523 45.3924 81.7500 67.5000 0.1221 0.0935
Shanghai_AI_Laboratory/internlm2_5-7b-chat-1m 121.1345 49.7632 81.7500 67.5000 0.0652 0.0946
Shanghai_AI_Laboratory/internlm2-chat-1_8b 141.5918 176.2159 51.2500 63.7500 0.1848 0.1880
Shanghai_AI_Laboratory/internlm2-chat-7b-sft 93.9919 36.0204 72.5000 77.5000 0.0942 0.1404
Shanghai_AI_Laboratory/OREAL-7B-SFT 104.5528 45.9300 85.0000 84.2500 0.0790 0.1743
simplescaling/s1.1-3B 160.1005 45.8557 85.0000 88.5000 0.0722 0.1276
simplescaling/s1.1-7B 107.4360 47.8830 86.7500 87.5000 0.1230 0.1509
Skywork/Skywork-OR1-Math-7B 118.6488 38.0029 75.0000 68.7500 0.0943 0.2136
smirki/UIGEN-T1.2-REACTv19-14B 67.2781 21.9111 87.5000 89.2500 0.0690 0.2136
smirki/UIGEN-T1.2-Tailwind-14B 64.2530 22.0244 89.2500 89.2500 0.0796 0.2223
sthenno-com/miscii-14b-0218 65.5291 21.3624 87.5000 87.5000 0.0679 0.1557
sthenno-com/miscii-14b-1028 65.0002 22.0444 89.2500 91.0000 0.0638 0.1645
suayptalha/Qwen3-0.6B-Code-Expert 116.8333 45.9988 42.5000 28.7500 0.0659 0.0961
swift/llava-llama3.1-8b 82.0466 31.4898 80.0000 73.7500 0.0830 0.1581
TabbyML/Mistral-7B 74.5382 26.3365 31.2500 25.0000 0.1112 0.1364
TechxGenus-MS/starcoder2-3b-instruct 146.1460 52.1407 12.5000 10.0000 0.0783 0.0922
TeleAI/TeleChat2-3B 128.4316 61.9607 70.0000 58.7500 0.0816 0.0871
TencentARC/LLaMA-Pro-8B-Instruct 60.6969 21.8949 42.5000 33.7500 0.0690 0.1307
TIGER-Lab/MAmmoTH-7B 182.9456 82.4900 20.0000 17.5000 0.1132 0.2085
TIGER-Lab/MAmmoTH-7B-Mistral 73.3747 26.3609 21.2500 26.2500 0.1078 0.1694
TIGER-Lab/Qwen2.5-Math-7B-CFT 133.4321 49.1829 38.0000 45.0000 0.0693 0.1492
tiiuae/Falcon3-7B-Instruct 72.6287 24.2781 56.2500 50.0000 0.1004 0.1580
twinkle-ai/Llama-3.2-3B-F1-Instruct 112.9594 39.8528 67.5000 72.5000 0.0569 0.0903
unsloth/DeepSeek-R1-0528-Qwen3-8B 82.3155 32.0477 89.2500 87.5000 0.1284 0.1643
unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF 186.4471 23.7748 86.7500 87.5000 1.6414 0.9679
unsloth/DeepSeek-R1-Distill-Llama-8B 88.2270 24.7574 75.0000 80.0000 0.0898 0.2131
unsloth/DeepSeek-R1-Distill-Qwen-1.5B 135.9568 54.0245 62.5000 46.2500 0.1126 0.0984
unsloth/DeepSeek-R1-Distill-Qwen-1.5B-GGUF 332.8659 96.4340 31.2500 10.0000 1.4049 0.0967
unsloth/DeepSeek-R1-Distill-Qwen-14B 73.9993 21.9525 86.7500 86.7500 0.0836 0.2809
unsloth/DeepSeek-R1-Distill-Qwen-14B-GGUF 112.7677 13.1846 85.0000 85.0000 1.9081 1.2658
unsloth/DeepSeek-R1-Distill-Qwen-32B-GGUF 59.1224 5.7976 86.7500 88.5000 2.2174 2.4519
unsloth/DeepSeek-R1-Distill-Qwen-7B 137.1859 53.5137 80.0000 75.0000 0.0725 0.1017
unsloth/DeepSeek-R1-Distill-Qwen-7B-GGUF 168.7356 52.4578 82.5000 37.5000 1.2896 0.1761
unsloth/gemma-3-1b-it 39.0814 34.1862 57.5000 47.5000 0.1596 0.1261
unsloth/Llama-3___1-8B-Instruct 94.9493 25.0098 80.5000 85.0000 0.0766 0.1585
unsloth/llama-3-8b-Instruct 144.2008 54.1838 85.0000 85.0000 0.0806 0.1908
unsloth/Llama-3.2-1B-Instruct 198.4952 51.9692 28.7500 28.7500 0.0582 0.1153
unsloth/Meta-Llama-3.1-8B 100.4634 25.3314 0.0000 0.0000 0.1077 0.4192
unsloth/Meta-Llama-3.1-8B-Instruct 73.6201 23.9449 80.5000 85.0000 0.0889 0.2036
unsloth/MiniMax-M2-GGUF 203.5847 14.3542 56.7500 68.0000 1.2197 33.2379
unsloth/mistral-7b-instruct-v0.3 68.1154 27.7917 75.0000 62.5000 0.0907 0.1407
unsloth/mistral-7b-instruct-v0.3 73.8865 28.6712 75.0000 62.5000 0.0791 0.1297
unsloth/OLMo-2-0425-1B-Instruct 120.5125 46.8448 33.7500 28.7500 0.0594 0.0772
unsloth/Qwen2___5-Coder-1___5B-Instruct 129.4689 55.7459 65.0000 70.0000 0.0787 0.0961
unsloth/Qwen2___5-Coder-3B-Instruct 119.4058 43.2696 85.0000 85.0000 0.0816 0.1043
unsloth/Qwen2-0.5B-Instruct 112.5043 63.3142 48.7500 56.2500 0.0883 0.0995
unsloth/Qwen2-1.5B 187.9351 66.2223 45.0000 47.5000 0.0632 0.1375
unsloth/Qwen2-1.5B-Instruct 146.4218 50.0898 85.0000 85.0000 0.1415 0.1255
unsloth/Qwen2-7B-Instruct 108.0745 48.1845 89.2500 89.2500 0.0876 0.1104
unsloth/Qwen2-Math-7B 183.4245 65.4428 15.0000 17.5000 0.0763 0.1766
unsloth/Qwen2.5-0.5B-Instruct 145.8346 55.4418 66.2500 53.7500 0.0724 0.1059
unsloth/Qwen2.5-1.5B 112.7815 45.7474 33.7500 31.2500 0.0794 0.1296
unsloth/Qwen2.5-1.5B 117.1050 47.1035 33.7500 31.2500 0.0574 0.1521
unsloth/Qwen2.5-1.5B-Instruct 136.9541 53.7322 85.0000 86.7500 0.0750 0.1388
unsloth/Qwen2.5-14B-Instruct 72.4324 22.6274 91.0000 91.0000 0.0729 0.1496
unsloth/Qwen2.5-3B-Instruct 106.4019 41.9992 80.0000 86.7500 0.0853 0.0984
unsloth/Qwen2.5-7B 135.9966 48.0187 65.5000 70.5000 0.0668 0.1319
unsloth/Qwen2.5-7B-Instruct 117.7158 50.3232 91.0000 89.2500 0.0623 0.0968
unsloth/Qwen2.5-Coder-0.5B-Instruct 141.6803 51.2907 17.5000 20.0000 0.0713 0.1096
unsloth/Qwen2.5-Coder-1.5B-Instruct 150.6276 55.4852 65.0000 67.5000 0.0617 0.1038
unsloth/Qwen2.5-Coder-3B 144.9538 40.3566 38.7500 36.2500 0.1538 0.1682
unsloth/Qwen2.5-Math-1.5B 142.8734 54.5816 40.0000 33.7500 0.0754 0.1212
unsloth/Qwen3-0.6B-Base 154.7137 53.0776 15.0000 28.7500 0.0868 0.1487
unsloth/Qwen3-0.6B-GGUF 419.9698 83.9158 56.2500 21.2500 0.4836 0.1103
unsloth/Qwen3-1.7B-GGUF 390.4404 65.0116 80.0000 70.0000 0.5165 0.6300
unsloth/Qwen3-14B 79.1470 24.2012 86.7500 88.5000 0.1153 0.1927
unsloth/Qwen3-30B-A3B-GGUF 207.3249 15.7886 86.7500 88.5000 0.3610 19.5025
unsloth/Qwen3-32B-GGUF 80.4618 5.7764 89.2500 89.2500 0.5326 8.3319
unsloth/Qwen3-4B-GGUF 221.5664 40.5421 87.5000 85.0000 0.1766 0.8745
unsloth/Qwen3-4B-Thinking-2507 89.3274 30.2341 85.0000 77.5000 0.0761 0.4129
unsloth/Qwen3-8B 92.0549 31.0150 88.5000 72.2500 0.1577 0.1680
unsloth/QwQ-32B-Preview 55.3044 18.4581 91.7500 91.0000 0.0920 0.1792
unsloth/SmolLM2-135M-Instruct 57.8074 19.2162 10.0000 10.0000 0.2507 0.7489
unsloth/SmolLM2-360M 54.3035 21.5669 10.0000 10.0000 0.0775 0.1926
voidful/DeepSeek-R1-Distill-Llama-3.2-8B 110.3534 25.8115 82.5000 80.0000 0.0854 0.3682
voidful/Llama-3.2-3B-Instruct 174.9087 62.1691 0.0000 0.0000 0.1017 0.1970
voidful/Llama-3.2-8B-Instruct 63.0798 33.2509 85.0000 85.0000 0.0899 0.1011
WeiboAI/VibeThinker-1.5B 196.6625 71.6386 32.5000 36.2500 0.0621 0.1384
X-D-Lab/MindChat-Qwen-1_8B 161.2364 67.3120 42.5000 42.5000 0.0687 0.1149
X-D-Lab/MindChat-Qwen-7B 86.9245 29.6063 5.0000 5.0000 0.0681 0.1388
X-D-Lab/Sunsimiao-Qwen2-7B 126.3387 55.0501 85.0000 86.7500 0.2873 0.0903
XGenerationLab/XiYanSQL-QwenCoder-14B-2504 71.5332 24.8601 87.5000 76.2500 0.1039 0.1376
xiaomaohuifaguang/wangge-DeepSeek-R1-1.5B 126.1282 48.4509 50.0000 53.7500 0.0936 0.1344
Xunzillm4cc/Xunzi-Qwen1.5-7B_chat 120.4077 45.1261 62.5000 55.0000 0.0792 0.6099
Xunzillm4cc/Xunzi-Qwen2-7B 121.9344 47.2499 38.7500 28.7500 0.0967 0.1226
YOYO-AI/Qwen3-8B-YOYO-nuslerp-128K 87.2042 29.9086 85.0000 85.0000 0.1314 0.1587
YOYO-AI/Qwen3-8B-YOYO-nuslerp-plus 82.1827 29.6244 85.0000 85.0000 0.1159 0.1611
ZhipuAI/agentlm-7b 191.7156 94.6742 51.2500 51.2500 0.0822 0.0948
zhuangxialie/Phi-3-Chinese-ORPO 36.7514 11.9381 22.5000 20.0000 0.0729 0.2006
zhuangxialie/Phi-3-Chinese-ORPO 33.0563 11.4359 22.5000 23.7500 0.0899 0.2022
zpeng1989/Medical_Qwen3_17B_Large_Language_Model 122.0285 41.2365 80.0000 67.5000 0.0878 0.1376
zpeng1989/Medical_Qwen3_8B_Large_Language_Model 94.8918 32.5874 86.7500 85.7500 0.0738 0.1656
zpeng1989/Multimodel_Medical_Qwen25vl_3B_Model 52.1814 21.5373 85.0000 85.0000 0.0826 0.1623
Zyphra/ZR1-1.5B 162.5943 55.4277 67.5000 57.5000 0.0972 0.1224