OpenR1-Qwen-7B/README.md
ModelHub XC 5955bb375c Initialize project; model provided by the ModelHub XC community
Model: okwinds/OpenR1-Qwen-7B
Source: Original Platform
2026-06-03 16:07:13 +08:00

---
frameworks:
- Pytorch
license: Apache License 2.0
tasks:
- text-generation
---
# For a walkthrough of this model's paper, see the WeChat official account article 👇🏻
### <img src="https://www.modelscope.cn/datasets/okwinds/Human-Like-DPO-Dataset/resolve/master/wechat.png" width="30" height="30" align="absmiddle"> 觉察流 - [Open-R1 in depth: progress on the open-source reproduction of DeepSeek-R1](https://mp.weixin.qq.com/s/TxRaI8amE_N__1VU4XHvMg)
> <span style="color:red;font-size:16px"> Notice: this model is reposted in its entirety from [open-r1/OpenR1-Qwen-7B](https://huggingface.co/open-r1/OpenR1-Qwen-7B) on Hugging Face. <br/>More model information follows below 👇🏻, based on the original repository's description.</span>
<br/>
#### _Find the repository author here 👇🏻 scan the QR code_
<img src="https://www.modelscope.cn/models/okwinds/GPT-2/resolve/master/qrcode_for_jcl_258.jpg" />
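# How to download
### The contributor of this model has not provided a more detailed model description. Model files and weights are available on the "Model Files" page.
#### You can download the model with the git clone command below or with the ModelScope SDK
SDK download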
```bash
# Install ModelScope
pip install modelscope
```
```python
# Download the model with the SDK
from modelscope import snapshot_download
model_dir = snapshot_download('okwinds/OpenR1-Qwen-7B')
```
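The path returned by `snapshot_download` is a local directory, so it can be handed directly to `transformers`. Below is a minimal sketch of that hand-off, assuming `modelscope`, `transformers`, and `accelerate` (needed for `device_map="auto"`) are installed. The helper name `download_and_load` is hypothetical and not part of either library.

```python
from pathlib import Path


def download_and_load(model_id="okwinds/OpenR1-Qwen-7B"):
    """Download the snapshot from ModelScope, then load it from the local path.

    `download_and_load` is an illustrative helper, not a library function.
    Heavy imports are deferred so defining the helper has no side effects.
    """
    from modelscope import snapshot_download
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # snapshot_download returns the local directory holding the model files.
    model_dir = snapshot_download(model_id)

    # from_pretrained accepts a local directory as well as a hub model id.
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(
        model_dir,
        torch_dtype="auto",
        device_map="auto",
    )
    return model, tokenizer, Path(model_dir)
```

Calling `download_and_load()` triggers the full download, so run it only on a machine with enough disk space and memory for a 7B model.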
Git下载
```bash
# Download the model with Git
git clone https://www.modelscope.cn/okwinds/OpenR1-Qwen-7B.git
```
# Model introduction
# OpenR1-Qwen-7B
This is a fine-tune of [Qwen2.5-Math-7B-Instruct](https://www.modelscope.cn/models/Qwen/Qwen2.5-Math-7B-Instruct) on [okwinds/OpenR1-Math-220k](https://www.modelscope.cn/datasets/okwinds/OpenR1-Math-220k) (`default` split).
## Quick start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "open-r1/OpenR1-Qwen-7B"
device = "cuda"

# Load the weights with an automatically chosen dtype and device placement.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Find the value of $x$ that satisfies the equation $4x+5 = 6x+7$."

# The system prompt asks for step-by-step reasoning and a \boxed{} final answer.
messages = [
    {"role": "system", "content": "Please reason step by step, and put your final answer within \\boxed{}."},
    {"role": "user", "content": prompt}
]
```
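The quick start stops once `messages` is built. The remaining steps might look like the sketch below, which uses the standard `transformers` chat-template and `generate` calls. The helper names (`generate_reply`, `extract_boxed`, `run_quick_start`) and the `max_new_tokens=4096` budget are assumptions made for illustration, not settings taken from the original model card.

```python
def extract_boxed(text):
    """Return the contents of the last \\boxed{...} in `text`, or None.

    Braces are matched by depth so nested LaTeX such as \\frac{1}{2} survives.
    """
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
    return None  # braces never closed


def generate_reply(model, tokenizer, messages, max_new_tokens=4096):
    """Render the chat template, generate, and decode only the new tokens."""
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer([text], return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the model's reply is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


def run_quick_start(model_name="open-r1/OpenR1-Qwen-7B"):
    """End-to-end run; loads the full 7B model, so it is not called here."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype="auto", device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    messages = [
        {"role": "system", "content": "Please reason step by step, and put your final answer within \\boxed{}."},
        {"role": "user", "content": "Find the value of $x$ that satisfies the equation $4x+5 = 6x+7$."},
    ]
    reply = generate_reply(model, tokenizer, messages)
    return reply, extract_boxed(reply)
```

Because the system prompt asks for the final answer inside `\boxed{}`, `extract_boxed` gives a simple way to pull that answer out of a long reasoning trace. For reference, the equation in the prompt solves to $x = -1$.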
## Training
We train the model on the `default` split of [okwinds/OpenR1-Math-220k](https://www.modelscope.cn/datasets/okwinds/OpenR1-Math-220k) for 3 epochs. We use a learning rate of 5e-5 and extend the context length from 4k to 32k by increasing the RoPE frequency to 300k. Training follows a linear learning rate schedule with a 10% warmup phase. The table below compares the performance of OpenR1-Qwen-7B with DeepSeek-Distill-Qwen-7B and OpenThinker-7B, evaluated with [lighteval](https://github.com/huggingface/open-r1/tree/main?tab=readme-ov-file#evaluating-models).
You can find the training and evaluation code at: https://github.com/huggingface/open-r1/
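The learning rate schedule described above can be sketched in plain Python. This is an illustration, not the actual training code: it assumes "linear" means a linear ramp from 0 to the peak during warmup followed by a linear decay to 0, as in common linear-with-warmup schedulers. The function name and the `total_steps` value in the usage note are hypothetical, since the source does not state the number of optimizer steps.

```python
def linear_warmup_decay_lr(step, total_steps, peak_lr=5e-5, warmup_ratio=0.1):
    """Learning rate at `step` for a linear schedule with warmup.

    peak_lr=5e-5 and warmup_ratio=0.1 come from the training description;
    decay to exactly zero at `total_steps` is an assumption.
    """
    warmup_steps = int(round(total_steps * warmup_ratio))
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from the peak to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)
```

With a hypothetical `total_steps=1000`, the rate climbs over the first 100 steps, peaks at 5e-5 at step 100, and is back to half the peak at step 550.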
| Model | MATH-500 | AIME24 | AIME25 |
| --- | --- | --- | --- |
| DeepSeek-Distill-Qwen-7B | 91.6 | 43.3 | 40.0 |
| OpenR1-Qwen-7B | 90.6 | 36.7 | 40.0 |
| OpenThinker-7B | 89.6 | 30.0 | 33.3 |
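On the context extension mentioned under Training (4k to 32k by raising the RoPE frequency to 300k), the sketch below gives one way to see the effect numerically. It reads "RoPE frequency" as the RoPE base (often exposed as `rope_theta`), which is an interpretation. The head dimension of 128 and the baseline base of 10,000 are assumptions not stated in this README, and all function names are illustrative.

```python
import math

HEAD_DIM = 128              # assumption: per-head dimension
BASELINE_BASE = 10_000.0    # assumption: RoPE base before the change
EXTENDED_BASE = 300_000.0   # "300k" from the training description
CONTEXT_LEN = 32_768        # "32k" from the training description


def rope_inv_freq(base, head_dim):
    """Inverse frequencies base**(-2i/head_dim) for i = 0 .. head_dim/2 - 1."""
    return [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]


def rope_wavelengths(base, head_dim):
    """Tokens needed for each frequency pair to complete one full rotation."""
    return [2.0 * math.pi / f for f in rope_inv_freq(base, head_dim)]


def pairs_longer_than(base, head_dim, context_len):
    """Count frequency pairs whose wavelength exceeds the context length."""
    return sum(w > context_len for w in rope_wavelengths(base, head_dim))


baseline_count = pairs_longer_than(BASELINE_BASE, HEAD_DIM, CONTEXT_LEN)
extended_count = pairs_longer_than(EXTENDED_BASE, HEAD_DIM, CONTEXT_LEN)
```

A larger base slows the rotation of every frequency pair except the first. More pairs then complete less than one full turn across 32k positions, which is one common intuition for why raising the base helps with longer contexts.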