Files
MiroMind-M1-RL-7B/README.md
ModelHub XC 93204eba2b 初始化项目,由ModelHub XC社区提供模型
Model: okwinds/MiroMind-M1-RL-7B
Source: Original Platform
2026-06-23 19:20:12 +08:00

109 lines
5.3 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
frameworks:
- Pytorch
license: Apache License 2.0
tasks:
- text-generation
base_model:
- okwinds/Miromind-M1-SFT-7B
---
本模型转载自 huggingface 【[miromind-ai](https://huggingface.co/miromind-ai)】
#### 📖 关于项目相关的研究,可阅读公众号“觉察流”文章👇</br>
《[MiroMind-M1如何用CAMPO算法打造高效且可复现的全栈开源推理模型](https://mp.weixin.qq.com/s/REPzzgsUjDMikg4jIo9KRg)》
#### _本仓库作者在此 👇🏻 扫一扫_
<img src="https://www.modelscope.cn/models/okwinds/GPT-2/resolve/master/qrcode_for_jcl_258.jpg" />
---
SDK下载
```bash
#安装ModelScope
pip install modelscope
```
```python
#SDK模型下载
from modelscope import snapshot_download
model_dir = snapshot_download('okwinds/MiroMind-M1-RL-7B')
```
Git下载
```
#Git模型下载
git clone https://www.modelscope.cn/okwinds/MiroMind-M1-RL-7B.git
```
# 官方 MiroMind-M1-RL-7B 简介
<div align="center">
<img src="https://www.modelscope.cn/models/okwinds/MiroMind-M1-RL-7B/resolve/master/assets/MiromindAI_H.svg" width="50%" alt="MiroMindM1" />
</div>
<!-- <hr> -->
<div align="center">
[![Models](https://img.shields.io/badge/Models-5EDDD2?style=for-the-badge&logo=huggingface&logoColor=ffffff&labelColor)](https://www.modelscope.cn/models/okwinds/MiroMind-M1-RL-7B)
[![Data](https://img.shields.io/badge/Data-0040A1?style=for-the-badge&logo=huggingface&logoColor=ffffff&labelColor)](https://www.modelscope.cn/datasets/okwinds/MiroMind-M1-RL-62K)
[![Paper](https://img.shields.io/badge/Paper-000000?style=for-the-badge&logo=arxiv&logoColor=white)](https://arxiv.org/abs/2507.14683)
[![Github](https://img.shields.io/badge/Code-000000?style=for-the-badge&logo=github&logoColor=white)](https://github.com/MiroMindAsia/MiroMind-M1)
[![Website](https://img.shields.io/badge/Website-000000?style=for-the-badge&logo=google-chrome&logoColor=white)](https://miromind.ai/)
</div>
# MiroMind-M1
## 🧾 Overview
<div align="center">
<img src="https://www.modelscope.cn/models/okwinds/MiroMind-M1-RL-7B/resolve/master/assets/7b_performance_training.png" width="80%" alt="7B Model Training Performance" />
<p><i>Training performance of MiroMind-M1-RL-7B on AIME24 and AIME25.</i></p>
</div>
**MiroMind-M1** is a fully open-source series of reasoning language models built on `Qwen-2.5`, focused on advancing mathematical reasoning. It is trained through supervised fine-tuning (**SFT**) on 719K curated problems and reinforcement learning with verifiable rewards (**RLVR**) on 62K challenging examples, using a context-aware multi-stage policy optimization method (**CAMPO**). MiroMind-M1 achieves state-of-the-art performance among open-source 7B Qwen-2.5-based models on AIME24, AIME25, and MATH500, with all models (`MiroMind-M1-SFT-7B`, `MiroMind-M1-RL-7B`, `MiroMind-M1-RL-32B`), data (`MiroMind-M1-SFT-719K`, `MiroMind-M1-RL-62K`), and training setups openly released.
## 📊 Evaluation
### MiroMind-M1-SFT
| Model | Initial Checkpoint | AIME24 (avg@64) | AIME25 (avg@64) | MATH500 (avg@5) |
|------------------|----------------------------|--------|--------|---------|
| DeepSeek-R1-Distill | Qwen2.5-Math-7B | 55.5 | 40.4† | 92.8 |
| OpenThoughts | Qwen2.5-7-Instruct | 31.3 | 23.3 | 83.2 |
| Open-R1 | Qwen2.5-Math-7B-Instruct | 36.7 | 40.0 | 90.6 |
| Synthetic-1 | Qwen2.5-7B-Instruct | 30.0 | 26.6 | 85.6 |
| **MiroMind-SFT-7B** | Qwen2.5-Math-7B | 60.4 | 45.0 | 94.6 |
*† means that the score of DeepSeek-R1 on AIME25 is from our evaluation.*
### MiroMind-M1-RL
| Model | AIME24 (avg@64) | AIME25 (avg@64) | MATH500 (avg@5) |
|----------------------------------|--------|--------|---------|
| DeepSeek-R1 | 79.8 | 70.0 | |
| DeepSeek-R1-0528 | 91.4 | 87.5 | |
| Qwen3-8B | 76.0 | 67.3 | |
| DeepSeek-R1-0528-Qwen3-8B | 86.0 | 76.3 | |
| <tr><td colspan="4" align="center"><em>**32B Models trained from Qwen2.5 series**</em></td></tr> |
| DeepSeek-R1-Distill-Qwen-32B | 70.8 | 52.1 | 95.8 |
| Skywork-OR1-32B-Preview | 77.1 | 68.2 | 97.5 |
| **MiroMind-M1-RL-32B** | 77.5 | 65.6 | 96.4 |
| <tr><td colspan="4" align="center"><em>**7B Models trained from Qwen2.5 series**</em></td></tr> |
| DeepSeek-R1-Distill-Qwen-7B | 55.5 | 39.2 | |
| **MiroMind-M1-SFT-7B** | 60.4 | 45.0 | 94.6 |
| Light-R1-7B-DS | 59.1 | 44.3 | |
| Skywork-OR1-7B | 72.2 | 54.6 | |
| **MiroMind-M1-RL-7B** | 73.4 | 57.8 | 96.7 |
## 🔗 Resources
### Models
[`MiroMind-M1-SFT-7B`](https://www.modelscope.cn/models/okwinds/MiroMind-M1-SFT-7B)<br>
[`MiroMind-M1-RL-7B`](https://www.modelscope.cn/models/okwinds/MiroMind-M1-RL-7B)<br>
[`MiroMind-M1-RL-32B`](https://www.modelscope.cn/models/okwinds/MiroMind-M1-RL-32B)<br>
### Data
[`MiroMind-M1-SFT-719K`](https://www.modelscope.cn/datasets/okwinds/MiroMind-M1-SFT-719K)<br>
[`MiroMind-M1-RL-62K`](https://www.modelscope.cn/datasets/okwinds/MiroMind-M1-RL-62K)<br>