---
language:
- en
license: mit
base_model: gpt2
tags:
- text-generation
- gpt2
- cosmopedia
- educational
- synthetic-data
model_name: CosmoGPT2-Mini
datasets:
- Dhiraj45/cosmopedia-v2
metrics:
- loss
---

# CosmoGPT2-Mini 🚀

## Description

**CosmoGPT2-Mini** is a fine-tuned version of the classic [GPT-2](https://huggingface.co/gpt2) model. It has been trained on a subset of the **Cosmopedia v2** dataset, which consists of synthetic textbooks, blog posts, and educational content.

The goal of this model is to adapt GPT-2 to generate more informative, educational-style text than the base model does.

## Model Details

- **Developed by:** younes MA
- **Model type:** Causal Language Model
- **Base Model:** GPT-2 (Small)
- **Language:** English
- **Training Precision:** `bfloat16` (optimized for stability and speed)

## Training Data

The model was trained on **30,000 samples** from the `Dhiraj45/cosmopedia-v2` dataset, a collection of high-quality synthetic data covering a range of academic and general-knowledge topics.
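
The card does not say how the subset was drawn. Assuming it is simply the first 30,000 rows of the train split, it could be loaded with the `datasets` library like this:

```python
from datasets import load_dataset

# Assumption: the 30k-sample subset is the first 30,000 rows of the train split.
subset = load_dataset("Dhiraj45/cosmopedia-v2", split="train[:30000]")
print(subset)  # shows the columns and row count
```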

## Training Hyperparameters

- **Epochs:** 1
- **Max Steps:** 1000
- **Batch Size:** 2 (with Gradient Accumulation Steps: 8, i.e. an effective batch size of 16)
- **Learning Rate:** 5e-5
- **Optimizer:** AdamW (fused)
- **Precision:** `bf16`
- **Max Sequence Length:** 512 tokens
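
These settings map directly onto `transformers.TrainingArguments`. The training script itself is not published, so this is only a sketch of what the configuration may have looked like (`output_dir` is a placeholder):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="cosmogpt2-mini",      # hypothetical output path
    num_train_epochs=1,
    max_steps=1000,                   # overrides epochs when both are set
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,    # effective batch size: 2 x 8 = 16
    learning_rate=5e-5,
    optim="adamw_torch_fused",        # fused AdamW
    bf16=True,                        # bfloat16 mixed precision
)
# The 512-token max sequence length is applied at tokenization time,
# e.g. tokenizer(..., truncation=True, max_length=512).
```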

## How to use

You can use this model directly with a pipeline for text generation:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="iko-01/CosmoGPT2-Mini")

prompt = "The concept of gravity can be explained as"
result = generator(prompt, max_length=100, num_return_sequences=1)

print(result[0]['generated_text'])
```
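
Small GPT-2 checkpoints are prone to repetitive output (see the limitations below), so it often helps to pass sampling options through the pipeline. The values here are illustrative, not tuned:

```python
result = generator(
    prompt,
    max_new_tokens=80,        # prefer max_new_tokens over max_length
    do_sample=True,
    temperature=0.8,          # illustrative, untuned values
    top_p=0.95,
    no_repeat_ngram_size=3,   # suppresses verbatim n-gram repetition
)
print(result[0]['generated_text'])
```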

## Intended Use & Limitations

- **Intended Use:** Experimental purposes, educational text generation, and studying fine-tuning on synthetic data.
- **Limitations:** As a small model (GPT-2) trained on a limited subset (30k samples), it may still hallucinate or produce repetitive text. It is not intended for production use or as a source of reliable academic advice.

## Training Results

The model was trained on a T4 GPU (or equivalent) using the settings above.

- **Final Training Loss:** 2.837890
- **Evaluation Loss:** 2.686130
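
Assuming these are mean per-token cross-entropy losses (the usual causal-LM objective), they can be read as perplexities:

```python
import math

# Perplexity = exp(mean cross-entropy loss), assuming per-token loss in nats.
print(math.exp(2.837890))  # ≈ 17.1 (train)
print(math.exp(2.686130))  # ≈ 14.7 (eval)
```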

---

**Note:** This model is part of a training experiment using the Cosmopedia dataset.