初始化项目,由ModelHub XC社区提供模型
Model: OrionLLM/GRM-7b Source: Original Platform
This commit is contained in:
42
README.md
Normal file
42
README.md
Normal file
@@ -0,0 +1,42 @@
|
||||
---
|
||||
base_model: Qwen/Qwen2.5-7B-Instruct
|
||||
library_name: transformers
|
||||
license: apache-2.0
|
||||
tags:
|
||||
- reasoning
|
||||
- math
|
||||
- code
|
||||
- general
|
||||
model-index:
|
||||
- name: GRM-7b
|
||||
results: []
|
||||
pipeline_tag: text-generation
|
||||
new_version: OrionLLM/GRM2-3b
|
||||
---
|
||||
<p align="center">
|
||||
<img src="https://cdn-uploads.huggingface.co/production/uploads/685ea8ff7b4139b6845ce395/YF0kEDYMGJhcM3Lbl2EOD.png" alt="logo" width="250">
|
||||
</p>
|
||||
|
||||
**GRM-7b** is a **general-purpose reasoning-focused** 7B model fine-tuned to improve **multi-domain reasoning** (math, logic, coding, and broad problem-solving). It is designed to be a strong, practical “daily driver” for **general reasoning tasks** and as a solid base for **further fine-tuning**.
|
||||
|
||||
---
|
||||
|
||||
## Key features
|
||||
|
||||
- **Dedicated reasoning behavior** for general tasks (stepwise problem solving, better consistency).
|
||||
- **Strong 7B-scale model** — practical for local inference and experimentation.
|
||||
- **Multi-domain mixture**: reasoning + code + math + (some) medical reasoning data.
|
||||
- **Fine-tune friendly**: intended as a good starting point for your own SFT/GRPO/DPO pipelines.
|
||||
|
||||
---
|
||||
|
||||
## Benchmarks
|
||||
|
||||
| Model | Data | AIME24 | AIME25 | AMC23 | MATH500 | HMMT O2/25 | LCB 06/24-01/25 | CodeElo | CodeForces | GPQA-D | JEEBench |
|
||||
| ----------------------------------------------------------------------------------------------- | ----- | ------ | ------ | ------ | ------- | ---------- | --------------- | ------- | ---------- | ------ | -------- |
|
||||
| [OpenThinker-7B](https://huggingface.co/open-thoughts/OpenThinker-7B) | ✅ | 30.7 | 22.0 | 72.5 | 82.8 | 15.7 | 26.1 | 11.1 | 14.9 | 38.6 | 45.3 |
|
||||
| **[GRM-7b](https://huggingface.co/OrionLLM/GRM-7b)** | ✅ |**69.0**|**53.3**|**93.5**| **90.0**| **42.7** | **51.7** | 31.0 |**32.2** | 53.7 |**72.4** |
|
||||
| [DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B) | ❌ | 51.3 | 38.0 | 92.0 | 88.0 | 25.0 | 34.5 | 19.9 | 21.1 | 33.2 | 50.4 |
|
||||
| [OpenR1-Distill-7B](https://huggingface.co/open-r1/OpenR1-Distill-7B) | ✅ | 57.7 | 39.7 | 87.0 | 88.0 | 25.7 | 30.7 | 30.1 | 29.3 |**58.9**| 68.7 |
|
||||
| [Llama-3.1-Nemotron-Nano-8B-v1](https://huggingface.co/nvidia/Llama-3.1-Nemotron-Nano-8B-v1) | ✅ | 62.0 | 48.0 |**94.0**| 89.4 | 26.7 | **50.9** | 30.9 |**32.9** | 52.9 | 70.7 |
|
||||
| [AceReason-Nemotron-7B](https://huggingface.co/nvidia/AceReason-Nemotron-7B) | ✅ |**71.0**| 50.7 |**93.8**| 89.8 | 33.3 | 44.3 |**32.9** |**30.9** | 52.9 | 64.3 |
|
||||
Reference in New Issue
Block a user