Model: okwinds/OpenR1-Qwen-7B
Source: Original Platform

2026-06-03 16:07:13 +08:00

frameworks: PyTorch
license: Apache License 2.0
tasks: text-generation

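For a walkthrough of this model's paper, see the WeChat official account article 👇🏻

觉察流 - Open-R1: An In-Depth Look at Progress on the Open-Source Reproduction of DeepSeek-R1

Notice: this model is reposted in full from open-r1/OpenR1-Qwen-7B on Hugging Face.
For more model information, see the text below 👇🏻, which is the Chinese-language version of the original dataset repository's description.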

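Download

The contributor of this model has not provided a more detailed model description. The model files and weights are available on the "Model Files" page.

You can download the model with the git clone command below or with the ModelScope SDK.

SDK download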

# Install ModelScope (shell)
pip install modelscope

# Download the model with the SDK (Python)
from modelscope import snapshot_download
model_dir = snapshot_download('okwinds/OpenR1-Qwen-7B')

Git download

# Download the model with Git (shell)
git clone https://www.modelscope.cn/okwinds/OpenR1-Qwen-7B.git

Model description

OpenR1-Qwen-7B

This is a fine-tune of Qwen2.5-Math-Instruct on okwinds/OpenR1-Math-220k (default split).

Quick start

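from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "open-r1/OpenR1-Qwen-7B"

# device_map="auto" places the weights on the available device(s)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Find the value of $x$ that satisfies the equation $4x+5 = 6x+7$."
messages = [
    {"role": "system", "content": "Please reason step by step, and put your final answer within \\boxed{}."},
    {"role": "user", "content": prompt}
]

# Render the chat template and generate a response.
# max_new_tokens=4096 is an illustrative value, not one given by the model authors.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=4096)

# Decode only the newly generated tokens, skipping the prompt
new_tokens = output_ids[0][inputs.input_ids.shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))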
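
The system prompt above asks the model to put its final answer inside \boxed{}. A minimal sketch of how that answer could be pulled out of a generated response is shown below. The helper name `extract_boxed` and the sample response string are hypothetical and are not part of the model card or the open-r1 code.

```python
def extract_boxed(text):
    """Return the contents of the last \\boxed{...} in text, or None if absent.

    Braces are counted so that nested LaTeX such as \\frac{1}{2} is kept whole.
    """
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
    return None  # unbalanced braces: no complete boxed answer


# Hypothetical response text for the quick-start prompt (4x + 5 = 6x + 7 gives x = -1)
sample = "Subtract 4x and 7 from both sides: -2 = 2x, so $x = \\boxed{-1}$."
print(extract_boxed(sample))
```

Taking the last occurrence is a design choice: reasoning traces can mention intermediate boxed values, and the final one is usually the intended answer.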

Training

We train the model on the default split of okwinds/OpenR1-Math-220k for 3 epochs. We use a learning rate of 5e-5 and extend the context length from 4k to 32k by increasing the RoPE frequency to 300k. Training follows a linear learning rate schedule with a 10% warmup phase. The table below compares the performance of OpenR1-Qwen-7B to DeepSeek-Distill-Qwen-7B and OpenThinker-7B using lighteval.
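
The two numerical settings in this paragraph can be illustrated with a short sketch. It is not the authors' training code. It assumes that "linear schedule" means a linear ramp to the peak learning rate followed by a linear decay to zero, that "RoPE frequency" refers to the RoPE base, and that the head dimension is 128. The total step count of 1000 and the comparison base of 10,000 are hypothetical values chosen only for illustration.

```python
import math


def linear_warmup_lr(step, total_steps, peak_lr=5e-5, warmup_ratio=0.10):
    """Learning rate at `step`: linear ramp-up during warmup, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


def rope_wavelengths(base, head_dim=128):
    """Wavelength in tokens of each RoPE frequency pair: 2*pi * base^(2i/d)."""
    return [2 * math.pi * base ** (2 * i / head_dim) for i in range(head_dim // 2)]


total_steps = 1000  # hypothetical; the real step count is not given in the text
for step in (0, 50, 100, 550, 1000):
    print(step, linear_warmup_lr(step, total_steps))

# A larger base stretches the slowest-rotating pair over more token positions
for base in (10_000, 300_000):  # 10_000 is an assumed comparison value
    print(base, round(max(rope_wavelengths(base))))
```

Raising the base leaves the fastest-rotating pair unchanged and lengthens the slowest ones, which is the usual motivation for increasing it when extending the context window.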

You can find the training and evaluation code at: https://github.com/huggingface/open-r1/

Model	MATH-500	AIME24	AIME25
DeepSeek-Distill-Qwen-7B	91.6	43.3	40.0
OpenR1-Qwen-7B	90.6	36.7	40.0
OpenThinker-7B	89.6	30.0	33.3
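
To make the AIME columns easier to read, the sketch below converts each percentage into an approximate number of solved problems. It assumes each AIME set contains 30 problems and that the scores are plain percent accuracy. Neither assumption is stated in the table.

```python
AIME_PROBLEMS = 30  # assumption: 30 problems per AIME set; not stated in the table

# Scores copied from the table above
scores = {
    "DeepSeek-Distill-Qwen-7B": {"AIME24": 43.3, "AIME25": 40.0},
    "OpenR1-Qwen-7B": {"AIME24": 36.7, "AIME25": 40.0},
    "OpenThinker-7B": {"AIME24": 30.0, "AIME25": 33.3},
}


def solved_count(percent, n_problems=AIME_PROBLEMS):
    """Nearest whole number of problems corresponding to a percent score."""
    return round(percent * n_problems / 100)


for model, row in scores.items():
    counts = {bench: solved_count(pct) for bench, pct in row.items()}
    print(model, counts)
```

Under these assumptions a gap of a few points corresponds to one or two problems, so small differences on the AIME columns should be read with care.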
