---
language: en
license: mit
base_model: BikoRiko/GPT-2.4-High-Pro
tags:
- gpt2
- math
- fine-tuned
- Pro
- Math
pipeline_tag: text-generation
---

# GPT-2.5-Math

GPT-2.5-Math is an upgraded version of **BikoRiko/GPT-2.4-High-Pro**, featuring an expanded architecture and specialized fine-tuning on mathematical reasoning.

## Model Details
- **Architecture:** GPT-2 extended with 6 additional transformer layers (approximately 0.2B parameters in total).
- **Training Hardware:** NVIDIA H100 (via Modal.com).
- **Dataset:** 5% subset of `microsoft/orca-math-word-problems-200k`.
- **Objective:** Fine-tuned to solve math word problems and logical queries.
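As a rough sanity check on the parameter figure above, the count can be worked out by hand. This is a minimal sketch that assumes the base model uses the standard GPT-2 small configuration (12 layers, hidden size 768, vocabulary 50257, 1024 positions, tied output head); the card does not state the configuration of `BikoRiko/GPT-2.4-High-Pro`, so treat these numbers as illustrative only.

```python
def gpt2_block_params(d_model: int) -> int:
    """Parameters in one GPT-2 transformer block (weights plus biases)."""
    # Attention: fused QKV projection, then output projection.
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # MLP: expand to 4 * d_model, then project back to d_model.
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two LayerNorms per block, each with a scale and a bias vector.
    layer_norms = 2 * (2 * d_model)
    return attention + mlp + layer_norms


def gpt2_total_params(n_layer: int, d_model: int = 768,
                      vocab_size: int = 50257, n_positions: int = 1024) -> int:
    """Total parameters, assuming the output head is tied to the token embeddings."""
    embeddings = vocab_size * d_model + n_positions * d_model
    final_layer_norm = 2 * d_model
    return embeddings + n_layer * gpt2_block_params(d_model) + final_layer_norm


# Assumption: a 12-layer GPT-2 small base, plus the 6 extra layers from the card.
base = gpt2_total_params(n_layer=12)
expanded = gpt2_total_params(n_layer=18)
print(f"{base:,} -> {expanded:,}")  # 124,439,808 -> 166,967,040
```

Under these assumptions the expanded model has about 167M parameters, which is consistent with the rounded ~0.2B figure.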

## Performance
The model is fine-tuned for mathematical reasoning. Although it is small at roughly 0.2B parameters, it shows early signs of logical grounding on basic word problems.
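One way to probe this behaviour is to run a word problem through the `transformers` text-generation pipeline. The sketch below makes several assumptions: that the checkpoint is published as `BikoRiko/GPT-2.5-Math`, that it loads through the standard pipeline despite the expanded architecture, and that a plain question/answer prompt suits it. The `build_prompt` and `solve` helpers are hypothetical, and the card does not document a prompt format.

```python
def build_prompt(question: str) -> str:
    """Wrap a word problem in a simple prompt (hypothetical format, not from the card)."""
    return f"Question: {question.strip()}\nAnswer:"


def solve(question: str, model_id: str = "BikoRiko/GPT-2.5-Math",
          max_new_tokens: int = 64) -> str:
    """Generate an answer greedily; the model ID is an assumption."""
    # Imported here so the prompt helper can be used without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    result = generator(build_prompt(question),
                       max_new_tokens=max_new_tokens, do_sample=False)
    return result[0]["generated_text"]


if __name__ == "__main__":
    print(solve("Sara has 3 apples and buys 4 more. How many apples does she have?"))
```

Greedy decoding (`do_sample=False`) is used here only to make outputs repeatable when comparing prompts.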

## Training Details
- **Optimizer:** AdamW
- **Precision:** Mixed Precision (torch.amp)
- **Epochs:** 3
- **Learning Rate:** 5e-5
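
The settings above could be wired together roughly as follows. This is a sketch, not the author's training script: the batch size, the use of bfloat16 autocast, and the Hugging Face style `model(input_ids=..., labels=...).loss` interface are all assumptions, and `TrainConfig`, `optimizer_steps`, and `train` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    epochs: int = 3               # from the card
    learning_rate: float = 5e-5   # from the card
    batch_size: int = 8           # assumption: not stated in the card


def optimizer_steps(num_examples: int, cfg: TrainConfig) -> int:
    """Number of optimizer updates when every batch triggers one update."""
    return cfg.epochs * math.ceil(num_examples / cfg.batch_size)


def train(model, batches, cfg: TrainConfig, device: str = "cuda") -> None:
    """Fine-tune a causal LM with AdamW under mixed precision.

    `batches` is assumed to be a re-iterable collection (a list or DataLoader)
    of dicts holding an `input_ids` tensor already sized to cfg.batch_size.
    """
    # Imported here so the config helpers work without PyTorch installed.
    import torch

    optimizer = torch.optim.AdamW(model.parameters(), lr=cfg.learning_rate)
    model.to(device).train()
    for _ in range(cfg.epochs):
        for batch in batches:
            input_ids = batch["input_ids"].to(device)
            # Assumption: bfloat16 autocast, which needs no gradient scaler.
            with torch.autocast(device_type=device, dtype=torch.bfloat16):
                # Labels equal the inputs; padding tokens are not masked in this sketch.
                loss = model(input_ids=input_ids, labels=input_ids).loss
            optimizer.zero_grad(set_to_none=True)
            loss.backward()
            optimizer.step()
```

For example, `optimizer_steps(10_000, TrainConfig())` gives 3750 updates for a hypothetical 10,000-example subset at the assumed batch size of 8.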

Initial project commit; model provided by the ModelHub XC community. Model: BikoRiko/GPT-2.5-Math. Source: Original Platform. 2026-06-11 22:53:12 +08:00