Initialize project; model provided by the ModelHub XC community
Model: BikoRiko/GPT-2.5-Math Source: Original Platform
31
README.md
Normal file
@@ -0,0 +1,31 @@
---
language: en
license: mit
base_model: BikoRiko/GPT-2.4-High-Pro
tags:
- gpt2
- math
- fine-tuned
- Pro
- Math
pipeline_tag: text-generation
---

# GPT-2.5-Math

GPT-2.5-Math is an upgraded version of **BikoRiko/GPT-2.4-High-Pro**, featuring an expanded architecture and specialized fine-tuning on mathematical reasoning.

## Model Details
- **Architecture:** GPT-2 with 6 additional layers (Total parameters ~0.2B).
- **Training Hardware:** NVIDIA H100 (via Modal.com).
- **Dataset:** 5% subset of `microsoft/orca-math-word-problems-200k`.
- **Objective:** Fine-tuned to solve math word problems and logical queries.
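
As a rough sanity check on the ~0.2B figure, the sketch below counts the parameters of a GPT-2 style decoder before and after adding 6 layers. It assumes the base model uses the standard GPT-2 small dimensions (12 layers, hidden size 768, vocabulary 50257, context length 1024), which this card does not state. `gpt2_param_count` is a hypothetical helper written for this illustration, not a library function.

```python
def gpt2_param_count(n_layer, d_model=768, vocab_size=50257, n_positions=1024):
    """Count parameters of a GPT-2 style decoder with tied input/output embeddings."""
    # Token and position embeddings (assumed GPT-2 small sizes by default).
    embeddings = vocab_size * d_model + n_positions * d_model
    # One LayerNorm has a weight and a bias vector.
    layer_norm = 2 * d_model
    # Fused QKV projection plus the attention output projection (weights + biases).
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # Feed-forward network with a 4x expansion (weights + biases).
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    block = 2 * layer_norm + attention + mlp
    # All blocks, plus the final LayerNorm after the last block.
    return embeddings + n_layer * block + layer_norm


base = gpt2_param_count(12)      # assumed 12-layer base
expanded = gpt2_param_count(18)  # 12 + 6 additional layers
print(f"{base:,} -> {expanded:,}")  # 124,439,808 -> 166,967,040
```

Under these assumptions the expanded model has roughly 0.17B parameters, which is consistent with the rounded ~0.2B quoted above.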

## Performance
The model is trained for mathematical reasoning. At roughly 0.2B parameters it is small, but it shows early signs of logical grounding on basic word problems.
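
To try the model on a word problem, something like the following sketch may work. It uses the standard `transformers` text-generation pipeline with the model ID from this card. The `Question:`/`Answer:` prompt template and the helper names (`build_prompt`, `extract_answer`, `solve`) are assumptions made for illustration; the card does not document a prompt format.

```python
def build_prompt(question: str) -> str:
    # Hypothetical template: the card does not specify a prompt format.
    return f"Question: {question.strip()}\nAnswer:"


def extract_answer(prompt: str, generated_text: str) -> str:
    # Text-generation pipelines return the prompt followed by the continuation
    # by default, so drop the prompt prefix when it is present.
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    return generated_text.strip()


def solve(question: str,
          model_id: str = "BikoRiko/GPT-2.5-Math",
          max_new_tokens: int = 64) -> str:
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    prompt = build_prompt(question)
    # Greedy decoding keeps the output deterministic for a given prompt.
    outputs = generator(prompt, max_new_tokens=max_new_tokens, do_sample=False)
    return extract_answer(prompt, outputs[0]["generated_text"])
```

Calling `solve("Tom has 3 apples and buys 5 more. How many apples does he have?")` downloads the weights on first use and returns only the generated continuation.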

## Training Details
- **Optimizer:** AdamW
- **Precision:** Mixed Precision (torch.amp)
- **Epochs:** 3
- **Learning Rate:** 5e-5