Model: BikoRiko/GPT-2.5-Math Source: Original Platform
| language | license | base_model | tags | pipeline_tag |
|---|---|---|---|---|
| en | mit | BikoRiko/GPT-2.4-High-Pro | | text-generation |
GPT-2.5-Math
GPT-2.5-Math is an upgraded version of BikoRiko/GPT-2.4-High-Pro, featuring an expanded architecture and specialized fine-tuning on mathematical reasoning.
Model Details
- Architecture: GPT-2 with 6 additional layers (Total parameters ~0.2B).
- Training Hardware: NVIDIA H100 (via Modal.com).
- Dataset: 5% subset of microsoft/orca-math-word-problems-200k.
- Objective: Fine-tuned to solve math word problems and logical queries.
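As a rough sanity check on the "~0.2B" figure, the parameter count of a GPT-2 style decoder can be worked out by hand. The sketch below assumes the base model uses the standard GPT-2 small dimensions (12 layers, hidden size 768, vocabulary 50257, 1024 positions, tied embeddings); this card does not state those values, so treat them as assumptions. The function name `gpt2_param_count` is hypothetical.

```python
def gpt2_param_count(n_layer, d_model=768, vocab_size=50257, n_positions=1024):
    """Count parameters of a GPT-2 style decoder with tied input/output embeddings.

    The default dimensions are those of GPT-2 small (an assumption here,
    not something the model card confirms).
    """
    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * d_model + n_positions * d_model
    # Fused QKV projection and the attention output projection (weights + biases).
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # Two-layer MLP with a 4x hidden expansion (weights + biases).
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two LayerNorms per block, each with a scale and a bias vector.
    layer_norms = 2 * (2 * d_model)
    per_block = attention + mlp + layer_norms
    final_layer_norm = 2 * d_model
    return embeddings + n_layer * per_block + final_layer_norm


base = gpt2_param_count(12)      # assumed 12-layer base
expanded = gpt2_param_count(18)  # 12 + 6 additional layers
print(f"{base:,} -> {expanded:,}")
```

Under these assumptions the count goes from about 124M to about 167M parameters, which is consistent with the rounded "~0.2B" above.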
Performance
The model is fine-tuned for mathematical reasoning. Despite its small size (~0.2B parameters), it shows early signs of logical grounding on basic word problems.
Training Details
- Optimizer: AdamW
- Precision: Mixed Precision (torch.amp)
- Epochs: 3
- Learning Rate: 5e-5
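The settings above can be sketched as a minimal PyTorch loop. This is an illustration, not the actual training script: the `train` helper, the toy linear model, the random batches, and the choice of bfloat16 for autocast are all assumptions, and the real run would use the GPT-2 model and tokenized math problems instead.

```python
import torch
from torch import nn


def train(model, batches, epochs=3, lr=5e-5, device="cpu"):
    """Minimal AdamW + autocast loop; returns the loss recorded at every step.

    `device` must be a plain device type such as "cpu" or "cuda".
    """
    model.to(device).train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    losses = []
    for _ in range(epochs):
        for inputs, targets in batches:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad(set_to_none=True)
            # Assumption: bfloat16 autocast. With float16 a gradient scaler
            # would normally be added around backward() and step().
            with torch.amp.autocast(device_type=device, dtype=torch.bfloat16):
                logits = model(inputs)
                # Compute the loss in float32 for numerical stability.
                loss = nn.functional.cross_entropy(logits.float(), targets)
            loss.backward()
            optimizer.step()
            losses.append(loss.item())
    return losses


# Toy stand-ins (hypothetical) so the sketch runs on CPU in well under a second.
torch.manual_seed(0)
toy_model = nn.Linear(8, 4)
toy_batches = [(torch.randn(16, 8), torch.randint(0, 4, (16,))) for _ in range(5)]
losses = train(toy_model, toy_batches)
print(len(losses))
```

On an H100, passing `device="cuda"` would run the same loop with autocast on the GPU. The batch size and any learning-rate schedule are not stated in this card, so they are left out of the sketch.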
Description