Compare commits: 32d5426ac6...docs-readm (2 commits)

| Author | SHA1 | Date |
|---|---|---|
| | 3434d628cd | |
| | dce5f2ed53 | |
@@ -152,7 +152,7 @@ curl -X POST http://localhost:10086/v1/chat/completions \
 | nanonets/Nanonets-OCR-s | 6.5227 | 5.1291 | |
 | nanonets/Nanonets-OCR2-1.5B-exp | 0.4982 | 0.3910 | |
 | nanonets/Nanonets-OCR2-3B | 1.5362 | 1.4019 | |
-| OpenBMB/MiniCPM-o-2_6 | 6.8743 | 3.6506 | |
+| OpenBMB/MiniCPM-o-2_6 | 6.8743 | 3.6506 | Requires torchaudio to be installed |
 | OpenBMB/MiniCPM-V-4 | 13.7100 | 3.7743 | |
 | OpenBMB/MiniCPM-V-4_5 | 31.9896 | 3.4504 | |
 | OpenDataLab/MinerU2.5-2509-1.2B | 1.5679 | 1.1599 | |
@@ -256,7 +256,7 @@ curl -X POST http://localhost:10086/v1/chat/completions \

 ### Large Language Models

-| Model name | A100 output speed | Ascend-910B output speed | A100 output quality | Output quality | A100 first-token latency (s) | First-token latency (s) | Notes |
+| Model name | A100 output speed | Ascend-910B4 output speed | A100 output quality | Ascend-910B4 output quality | A100 first-token latency (s) | Ascend-910B4 first-token latency (s) | Notes |
 | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
 | zpeng1989/Medical_DeepSeek_Large_Language_Model | 69.6809 | 20.4259 | 80.0000 | 67.5000 | 0.0778 | 0.2209 | |
 | 01ai/Yi-1.5-9B-32K | 108.2437 | 37.2895 | 22.5000 | 22.5000 | 0.0863 | 0.1484 | |
@@ -470,7 +470,7 @@ curl -X POST http://localhost:10086/v1/chat/completions \
 | Qwen/CodeQwen1___5-7B-Chat-GGUF | 240.1317 | 32.7051 | 30.0000 | 33.7500 | 0.0776 | 0.2257 | |
 | Qwen/CodeQwen1.5-7B | 129.9110 | 50.0198 | 15.0000 | 15.0000 | 0.0779 | 0.1638 | |
 | Qwen/CodeQwen1.5-7B-Chat | 144.8556 | 47.2296 | 36.2500 | 36.2500 | 0.0631 | 0.1137 | |
-| Qwen/Qwen-72B-Chat | 51.3699 | 46.9181 | 85.0000 | 75.0000 | 0.1596 | 0.2037 | |
+| Qwen/Qwen-72B-Chat | 51.3699 | 46.9181 | 85.0000 | 75.0000 | 0.1596 | 0.2037 | Requires an additional [chat_template.jinja](chat_template.jinja) |
 | Qwen/Qwen-VL | 144.3932 | 41.3327 | 20.0000 | 20.0000 | 0.2175 | 0.2128 | |
 | Qwen/Qwen-VL-Chat | 137.3423 | 44.7255 | 70.0000 | 61.2500 | 0.1737 | 0.1550 | |
 | Qwen/Qwen/Qwen2-7B-Instruct-GGUF | 142.2627 | 54.6809 | 91.0000 | 70.0000 | 0.9237 | 0.1415 | |
@@ -486,7 +486,7 @@ curl -X POST http://localhost:10086/v1/chat/completions \
 | Qwen/Qwen1.5-7B-Chat-GGUF | 201.2363 | 22.2877 | 88.5000 | 88.5000 | 0.9059 | 1.0335 | |
 | Qwen/Qwen1.5-MoE-A2.7B | 54.4072 | 34.3188 | 75.0000 | 61.2500 | 0.2530 | 0.1599 | |
 | Qwen/Qwen2-0.5B | 180.6460 | 57.7134 | 20.0000 | 15.0000 | 0.0912 | 0.0853 | |
-| Qwen/Qwen2-72B | 46.9816 | 14.2144 | 85.0000 | 85.0000 | 0.2359 | 0.3053 | |
+| Qwen/Qwen2-72B | 46.9816 | 14.2144 | 85.0000 | 85.0000 | 0.2359 | 0.3053 | Requires an additional [chat_template.jinja](chat_template.jinja) |
 | Qwen/Qwen2-Audio-7B-Instruct | 113.3340 | 48.6288 | 75.0000 | 85.0000 | 0.2226 | 0.2779 | |
 | Qwen/Qwen2-Math-7B-Instruct | 151.8197 | 59.6123 | 61.2500 | 74.2500 | 0.0655 | 0.0971 | |
 | Qwen/Qwen2.5-0.5B-Instruct | 129.0884 | 161.6113 | 66.2500 | 66.2500 | 0.2191 | 0.3365 | |
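
For readers comparing the two devices, the ratio of the A100 output speed to the Ascend output speed can be read straight off a table row. The following is a minimal sketch, not part of the repository: `parse_row` and `speed_ratio` are hypothetical helper names, the column order is assumed to match the large language model table header in this diff, and the two sample rows are copied from the last hunk.

```python
# Hypothetical helpers (not from the repository): parse benchmark rows
# from the README table and compare output speed on the two devices.
# Assumed column order: model name, A100 speed, Ascend speed,
# A100 quality, Ascend quality, A100 latency, Ascend latency, notes.

ROWS = """
| Qwen/Qwen2-72B | 46.9816 | 14.2144 | 85.0000 | 85.0000 | 0.2359 | 0.3053 | |
| Qwen/Qwen2.5-0.5B-Instruct | 129.0884 | 161.6113 | 66.2500 | 66.2500 | 0.2191 | 0.3365 | |
"""


def parse_row(line):
    """Split one markdown table row into a dict of named fields."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    return {
        "name": cells[0],
        "a100_speed": float(cells[1]),
        "ascend_speed": float(cells[2]),
    }


def speed_ratio(row):
    """A100 output speed divided by Ascend output speed for one row."""
    return row["a100_speed"] / row["ascend_speed"]


rows = [parse_row(line) for line in ROWS.strip().splitlines()]
for row in rows:
    # A ratio above 1 means the A100 run produced output faster.
    print(f"{row['name']}: {speed_ratio(row):.2f}x")
```

A ratio below 1, as for the second sample row, means the Ascend run was the faster of the two for that model.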