Model: ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1 Source: Original Platform
| license | library_name | pipeline_tag | base_model | datasets |
|---|---|---|---|---|
| apache-2.0 | transformers | text-generation | | |
This repository contains the model presented in the paper "Reinforcement Learning for Reasoning in Large Language Models with One Training Example".
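Since the metadata lists `transformers` as the library and `text-generation` as the pipeline tag, the checkpoint can presumably be loaded like any other causal language model. The following is a minimal sketch under that assumption, not the authors' documented usage: the helper name `generate_text`, the default `max_new_tokens` value, and the use of greedy decoding are illustrative choices, and no particular prompt template is implied.

```python
# Minimal loading sketch (assumption: the checkpoint works with the
# standard transformers causal-LM classes, as the metadata suggests).

MODEL_ID = "ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1"


def generate_text(prompt: str, max_new_tokens: int = 256) -> str:
    """Hypothetical helper: generate a completion for `prompt`.

    The imports are kept inside the function so that merely defining
    the helper does not require transformers to be installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize the prompt into PyTorch tensors.
    inputs = tokenizer(prompt, return_tensors="pt")

    # Greedy decoding; an illustrative choice, not a recommended setting.
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )

    # The decoded string includes the prompt followed by the completion.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("What is 12 * 13?")` would download the weights on first use and return the prompt together with the model's continuation; the exact output depends on the checkpoint and is not shown here.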
Description