GRM-1.5b/README.md

---
base_model: Qwen/Qwen2.5-1.5B-Instruct
library_name: transformers
license: apache-2.0
tags:
- reasoning
- math
- code
- general
model-index:
- name: GRM-1.5b
  results: []
pipeline_tag: text-generation
new_version: OrionLLM/GRM2-3b
---
<p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/685ea8ff7b4139b6845ce395/YF0kEDYMGJhcM3Lbl2EOD.png" alt="logo" width="250">
</p>

**GRM-1.5b** is a **general-purpose reasoning-focused** 1.5B model fine-tuned to improve **multi-domain reasoning** (math, logic, coding, and broad problem-solving). It is designed to be a strong, lightweight “daily driver” for **general reasoning tasks** and as a solid base for **further fine-tuning**.

---

## Key features

- **Dedicated reasoning behavior** for general tasks (stepwise problem solving, better consistency).
- **Small & efficient (1.5B)** — practical for local inference and experimentation.
- **Multi-domain mixture**: reasoning + code + math + (some) medical reasoning data.
- **Fine-tune friendly**: intended as a good starting point for your own SFT/GRPO/DPO pipelines.

---

## Benchmarks

| Model                                                                                                        | AIME24 | AIME25 |  AMC23 | MATH500 | HMMT O2/25 | LCB 06/24-01/25 | CodeElo | CodeForces | GPQA-D | JEEBench |
| ------------------------------------------------------------------------------------------------------------ | ------ | ------ | ------ | ------- | ---------- | --------------- | ------- | ---------- | ------ | -------- |
| **[GRM-1.5b](https://huggingface.co/OrionLLM/GRM-1.5b/)**                                                   |**52.0**|**41.7**|**87.0**|   86.4  | **27.3**   |  **39.4**       |  12.9   |  15.5      |  29.5  |  51.9    |
| [DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)           |  32.3  |  23.7  |  71.8  |   80.8  |   15.3     |    27.2         |  8.8    |  8.5       |  31.1  |  32.5    |
| [Nemotron-Research-Reasoning-Qwen-1.5B](https://huggingface.co/nvidia/Nemotron-Research-Reasoning-Qwen-1.5B) |**47.7**|  32.0  |**87.5**|   86.0  |   21.7     |    31.4         |**54.7** |**40.3**    |  41.8  |  52.6    |
| [Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B)                                                        |**52.0**|  35.3  |  83.8  | **87.2**|   23.3     |    27.7         |  20.7   |  20.0      |**49.3**|**60.7**  |
| [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct)                                  |  3.0   |  0.7   |  30.8  |   50.2  |   0.0      |    5.5          |  0.8    |   2.2      |  24.7  |  16.4    |