---
frameworks:
- Pytorch
license: Apache License 2.0
tasks:
- text-generation
base_model:
- okwinds/Miromind-M1-SFT-7B
---
This model is mirrored from Hugging Face: [miromind-ai](https://huggingface.co/miromind-ai)

#### 📖 For research related to this project, see the article from the WeChat official account "觉察流" 👇

[MiroMind-M1: How the CAMPO Algorithm Builds an Efficient, Reproducible, Full-Stack Open-Source Reasoning Model](https://mp.weixin.qq.com/s/REPzzgsUjDMikg4jIo9KRg)

#### _The author of this repository can be found here 👇🏻 (scan the QR code)_

<img src="https://www.modelscope.cn/models/okwinds/GPT-2/resolve/master/qrcode_for_jcl_258.jpg" />

---
SDK download
```bash
# Install ModelScope
pip install modelscope
```
```python
# Download the model with the ModelScope SDK
from modelscope import snapshot_download
model_dir = snapshot_download('okwinds/MiroMind-M1-RL-7B')
```
Git download
```bash
# Download the model with Git
git clone https://www.modelscope.cn/okwinds/MiroMind-M1-RL-7B.git
```
# Official MiroMind-M1-RL-7B Introduction
<div align="center">
<img src="https://www.modelscope.cn/models/okwinds/MiroMind-M1-RL-7B/resolve/master/assets/MiromindAI_H.svg" width="50%" alt="MiroMindM1" />
</div>
<!-- <hr> -->
<div align="center">

[Model](https://www.modelscope.cn/models/okwinds/MiroMind-M1-RL-7B)
[Dataset](https://www.modelscope.cn/datasets/okwinds/MiroMind-M1-RL-62K)
[Paper](https://arxiv.org/abs/2507.14683)
[GitHub](https://github.com/MiroMindAsia/MiroMind-M1)
[Website](https://miromind.ai/)

</div>

# MiroMind-M1
## 🧾 Overview
<div align="center">
<img src="https://www.modelscope.cn/models/okwinds/MiroMind-M1-RL-7B/resolve/master/assets/7b_performance_training.png" width="80%" alt="7B Model Training Performance" />
<p><i>Training performance of MiroMind-M1-RL-7B on AIME24 and AIME25.</i></p>
</div>

**MiroMind-M1** is a fully open-source series of reasoning language models built on `Qwen-2.5`, focused on advancing mathematical reasoning. It is trained through supervised fine-tuning (**SFT**) on 719K curated problems and reinforcement learning with verifiable rewards (**RLVR**) on 62K challenging examples, using a context-aware multi-stage policy optimization method (**CAMPO**). MiroMind-M1 achieves state-of-the-art performance among open-source 7B Qwen-2.5-based models on AIME24, AIME25, and MATH500, with all models (`MiroMind-M1-SFT-7B`, `MiroMind-M1-RL-7B`, `MiroMind-M1-RL-32B`), data (`MiroMind-M1-SFT-719K`, `MiroMind-M1-RL-62K`), and training setups openly released.
## 📊 Evaluation
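
The benchmark columns in this section are labelled avg@k (avg@64 for AIME24 and AIME25, avg@5 for MATH500). This card does not define the metric, so the following is only a minimal sketch under an assumption: that avg@k means sampling k answers per problem, scoring each as correct or incorrect, and reporting the mean accuracy as a percentage. The function name `avg_at_k` and the toy data are hypothetical and are not taken from the MiroMind-M1 evaluation code.

```python
from typing import Sequence


def avg_at_k(correct: Sequence[Sequence[bool]]) -> float:
    """Return avg@k in percent.

    `correct[i][j]` is True when the j-th sampled answer to problem i is right.
    Assumption: avg@k is the per-problem accuracy over its k samples,
    averaged over all problems.
    """
    if not correct:
        raise ValueError("need at least one problem")
    per_problem = []
    for samples in correct:
        if not samples:
            raise ValueError("each problem needs at least one sampled answer")
        # Fraction of the k samples for this problem that are correct.
        per_problem.append(sum(samples) / len(samples))
    # Average the per-problem fractions and express the result in percent.
    return 100.0 * sum(per_problem) / len(per_problem)


# Hypothetical toy data: 2 problems, k = 4 sampled answers each.
toy = [
    [True, True, False, True],    # 3 of 4 correct
    [False, False, True, False],  # 1 of 4 correct
]
print(avg_at_k(toy))  # 50.0
```

Averaging over many samples rather than scoring a single answer reduces the run-to-run variance that small benchmarks such as AIME would otherwise show.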
### MiroMind-M1-SFT
| Model | Initial Checkpoint | AIME24 (avg@64) | AIME25 (avg@64) | MATH500 (avg@5) |
|------------------|----------------------------|--------|--------|---------|
| DeepSeek-R1-Distill | Qwen2.5-Math-7B | 55.5 | 40.4† | 92.8 |
| OpenThoughts | Qwen2.5-7B-Instruct | 31.3 | 23.3 | 83.2 |
| Open-R1 | Qwen2.5-Math-7B-Instruct | 36.7 | 40.0 | 90.6 |
| Synthetic-1 | Qwen2.5-7B-Instruct | 30.0 | 26.6 | 85.6 |
| **MiroMind-M1-SFT-7B** | Qwen2.5-Math-7B | 60.4 | 45.0 | 94.6 |

*† The AIME25 score of DeepSeek-R1-Distill is from our own evaluation.*
### MiroMind-M1-RL
| Model | AIME24 (avg@64) | AIME25 (avg@64) | MATH500 (avg@5) |
|----------------------------------|--------|--------|---------|
| DeepSeek-R1 | 79.8 | 70.0 | – |
| DeepSeek-R1-0528 | 91.4 | 87.5 | – |
| Qwen3-8B | 76.0 | 67.3 | – |
| DeepSeek-R1-0528-Qwen3-8B | 86.0 | 76.3 | – |
| ***32B models trained from the Qwen2.5 series*** | | | |
| DeepSeek-R1-Distill-Qwen-32B | 70.8 | 52.1 | 95.8 |
| Skywork-OR1-32B-Preview | 77.1 | 68.2 | 97.5 |
| **MiroMind-M1-RL-32B** | 77.5 | 65.6 | 96.4 |
| ***7B models trained from the Qwen2.5 series*** | | | |
| DeepSeek-R1-Distill-Qwen-7B | 55.5 | 39.2 | – |
| **MiroMind-M1-SFT-7B** | 60.4 | 45.0 | 94.6 |
| Light-R1-7B-DS | 59.1 | 44.3 | – |
| Skywork-OR1-7B | 72.2 | 54.6 | – |
| **MiroMind-M1-RL-7B** | 73.4 | 57.8 | 96.7 |
## 🔗 Resources
### Models
[`MiroMind-M1-SFT-7B`](https://www.modelscope.cn/models/okwinds/MiroMind-M1-SFT-7B)<br>
[`MiroMind-M1-RL-7B`](https://www.modelscope.cn/models/okwinds/MiroMind-M1-RL-7B)<br>
[`MiroMind-M1-RL-32B`](https://www.modelscope.cn/models/okwinds/MiroMind-M1-RL-32B)<br>

### Data
[`MiroMind-M1-SFT-719K`](https://www.modelscope.cn/datasets/okwinds/MiroMind-M1-SFT-719K)<br>
[`MiroMind-M1-RL-62K`](https://www.modelscope.cn/datasets/okwinds/MiroMind-M1-RL-62K)<br>