初始化项目,由ModelHub XC社区提供模型

Model: himanshunakrani9/TinyMathReason-1B-sft
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-30 06:33:20 +08:00
commit f77c65551f
127 changed files with 1778427 additions and 0 deletions

74
README.md Normal file
View File

@@ -0,0 +1,74 @@
---
language:
- en
license: apache-2.0
tags:
- math
- reasoning
- sft
- instruction-tuning
- llama
- pytorch
- text-generation
datasets:
- HuggingFaceFW/fineweb-edu
- open-web-math/open-web-math
- hoskinson-center/proof-pile-v2
- TIGER-Lab/MathInstruct
- meta-math/MetaMathQA
- gsm8k
metrics:
- accuracy
---
# TinyMathReason-1B-sft
TinyMathReason-1B-sft is a 1.12 Billion parameter Llama-style decoder-only transformer trained from scratch specifically for mathematical reasoning. This is the **Supervised Fine-Tuned (SFT)** variant.
## Model Description
- **Developed by:** Himanshu Nakrani
- **Model type:** Decoder-only Transformer
- **Language(s):** English, Mathematics, Code
- **License:** Apache 2.0
- **Architecture:** 22 layers, 2048 hidden dimension, 16 Attention heads, 4 KV heads (GQA), SwiGLU activation (5632 intermediate dim).
- **Parameters:** 1.12B total
- **Context Length:** 4096 tokens
## Training Details
### Pretraining (Base Model)
The base model was trained from a random initialization on Google Cloud TPU v4-32 using the [MaxText](https://github.com/google/maxtext) framework.
- **Tokens:** ~300 Billion
- **Optimizer:** AdamW (β1=0.9, β2=0.95, weight_decay=0.1)
- **Learning Rate:** 3e-4 peak, cosine decay to 3e-5
### Supervised Fine-Tuning (SFT)
This variant was trained on ~600k instruction-following mathematical examples formatted in ChatML.
- **Hardware:** 1x A100 GPU using PyTorch + TRL
- **Learning Rate:** 2e-5 (Cosine schedule)
- **Epochs:** 2
## Intended Uses & Limitations
**Intended Uses:**
- Solving step-by-step grade-school to high-school level math problems.
- Educational assistance and logic-based chain-of-thought generation.
- As a foundation for further preference optimization (e.g., DPO, GRPO).
**Limitations:**
- Being a 1B parameter model, it lacks the broad general knowledge of larger models.
- Prone to arithmetic hallucination on very large numbers.
- May fail on complex topology or advanced undergraduate mathematics.
## Citation
```bibtex
@misc{tinymathreason2026,
author = {Himanshu Nakrani},
title = {TinyMathReason-1B: A 1 Billion Parameter Mathematical Reasoning LLM Built from Scratch on TPU v4-32},
year = {2026},
publisher = {GitHub},
howpublished = {\url{https://github.com/himanshu-nakrani/TinyMathReason-1B}}
}
```