初始化项目,由ModelHub XC社区提供模型
Model: himanshunakrani9/TinyMathReason-1B-sft Source: Original Platform
This commit is contained in:
74
README.md
Normal file
74
README.md
Normal file
@@ -0,0 +1,74 @@
|
||||
---
|
||||
language:
|
||||
- en
|
||||
license: apache-2.0
|
||||
tags:
|
||||
- math
|
||||
- reasoning
|
||||
- sft
|
||||
- instruction-tuning
|
||||
- llama
|
||||
- pytorch
|
||||
- text-generation
|
||||
datasets:
|
||||
- HuggingFaceFW/fineweb-edu
|
||||
- open-web-math/open-web-math
|
||||
- hoskinson-center/proof-pile-v2
|
||||
- TIGER-Lab/MathInstruct
|
||||
- meta-math/MetaMathQA
|
||||
- gsm8k
|
||||
metrics:
|
||||
- accuracy
|
||||
---
|
||||
|
||||
# TinyMathReason-1B-sft
|
||||
|
||||
TinyMathReason-1B-sft is a 1.12 Billion parameter Llama-style decoder-only transformer trained from scratch specifically for mathematical reasoning. This is the **Supervised Fine-Tuned (SFT)** variant.
|
||||
|
||||
## Model Description
|
||||
|
||||
- **Developed by:** Himanshu Nakrani
|
||||
- **Model type:** Decoder-only Transformer
|
||||
- **Language(s):** English, Mathematics, Code
|
||||
- **License:** Apache 2.0
|
||||
- **Architecture:** 22 layers, 2048 hidden dimension, 16 Attention heads, 4 KV heads (GQA), SwiGLU activation (5632 intermediate dim).
|
||||
- **Parameters:** 1.12B total
|
||||
- **Context Length:** 4096 tokens
|
||||
|
||||
## Training Details
|
||||
|
||||
### Pretraining (Base Model)
|
||||
The base model was trained from a random initialization on Google Cloud TPU v4-32 using the [MaxText](https://github.com/google/maxtext) framework.
|
||||
- **Tokens:** ~300 Billion
|
||||
- **Optimizer:** AdamW (β1=0.9, β2=0.95, weight_decay=0.1)
|
||||
- **Learning Rate:** 3e-4 peak, cosine decay to 3e-5
|
||||
|
||||
### Supervised Fine-Tuning (SFT)
|
||||
This variant was trained on ~600k instruction-following mathematical examples formatted in ChatML.
|
||||
- **Hardware:** 1x A100 GPU using PyTorch + TRL
|
||||
- **Learning Rate:** 2e-5 (Cosine schedule)
|
||||
- **Epochs:** 2
|
||||
|
||||
## Intended Uses & Limitations
|
||||
|
||||
**Intended Uses:**
|
||||
- Solving step-by-step grade-school to high-school level math problems.
|
||||
- Educational assistance and logic-based chain-of-thought generation.
|
||||
- As a foundation for further preference optimization (e.g., DPO, GRPO).
|
||||
|
||||
**Limitations:**
|
||||
- Being a 1B parameter model, it lacks the broad general knowledge of larger models.
|
||||
- Prone to arithmetic hallucination on very large numbers.
|
||||
- May fail on complex topology or advanced undergraduate mathematics.
|
||||
|
||||
## Citation
|
||||
|
||||
```bibtex
|
||||
@misc{tinymathreason2026,
|
||||
author = {Himanshu Nakrani},
|
||||
title = {TinyMathReason-1B: A 1 Billion Parameter Mathematical Reasoning LLM Built from Scratch on TPU v4-32},
|
||||
year = {2026},
|
||||
publisher = {GitHub},
|
||||
howpublished = {\url{https://github.com/himanshu-nakrani/TinyMathReason-1B}}
|
||||
}
|
||||
```
|
||||
Reference in New Issue
Block a user