---
language:
- en
license: mit
base_model: gpt2
tags:
- text-generation
- gpt2
- cosmopedia
- educational
- synthetic-data
model_name: CosmoGPT2-Mini
datasets:
- Dhiraj45/cosmopedia-v2
metrics:
- loss
---

# CosmoGPT2-Mini 🚀

## Description

**CosmoGPT2-Mini** is a fine-tuned version of the classic [GPT-2](https://huggingface.co/gpt2) model. It has been trained on a subset of the **Cosmopedia v2** dataset, which consists of synthetic textbooks, blog posts, and educational content.

The goal of this model is to adapt GPT-2's capabilities to generate more informative and educational-style text compared to the base model.
## Model Details

- **Developed by:** younes MA
- **Model type:** Causal Language Model
- **Base Model:** GPT-2 (Small)
- **Language:** English
- **Training Precision:** `bfloat16` (optimized for stability and speed)
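
The bfloat16 precision listed above can be matched at load time via the standard `transformers` `torch_dtype` argument. This is a sketch, not the author's script, and it assumes the repo id from this card's metadata is reachable via `from_pretrained`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the checkpoint in bfloat16 to match the training precision noted above.
# The repo id is an assumption based on this card's metadata.
model = AutoModelForCausalLM.from_pretrained(
    "iko-01/CosmoGPT2-Mini", torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained("iko-01/CosmoGPT2-Mini")
```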

## Training Data

The model was trained on **30,000 samples** from the `Dhiraj45/cosmopedia-v2` dataset. This dataset is known for its high-quality synthetic data covering various academic and general-knowledge topics.

## Training Hyperparameters

- **Epochs:** 1
- **Max Steps:** 1000
- **Batch Size:** 2 (with gradient accumulation steps: 8, for an effective batch size of 16)
- **Learning Rate:** 5e-5
- **Optimizer:** AdamW (fused)
- **Precision:** `bf16`
- **Max Sequence Length:** 512 tokens
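
As a sketch (not the author's actual training script), these settings map onto Hugging Face `TrainingArguments` field names roughly as follows; the 512-token max sequence length is applied at tokenization time rather than here:

```python
# Hypothetical Trainer-style configuration mirroring the list above.
training_config = dict(
    num_train_epochs=1,
    max_steps=1000,                 # hard stop after 1000 optimizer steps
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=5e-5,
    bf16=True,                      # bfloat16 mixed precision
    optim="adamw_torch_fused",      # fused AdamW
)

# Gradients from 8 micro-batches of 2 are accumulated before each update.
effective_batch = (
    training_config["per_device_train_batch_size"]
    * training_config["gradient_accumulation_steps"]
)
print(effective_batch)  # 16
```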

## How to use

You can use this model directly with a pipeline for text generation:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="iko-01/CosmoGPT2-Mini")

prompt = "The concept of gravity can be explained as"
result = generator(prompt, max_length=100, num_return_sequences=1)

print(result[0]["generated_text"])
```
## Intended Use & Limitations

- **Intended Use:** Experimental purposes, educational text generation, and studying fine-tuning on synthetic data.
- **Limitations:** Since this is a small model (GPT-2 Small) trained on a limited subset (30k samples), it may still generate hallucinations or repetitive text. It is not intended for production-level academic advice.
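
One common mitigation for the repetitive-text tendency noted above is to pass decoding penalties to `generate`. These are standard `transformers` generation parameters; the specific values here are illustrative, not tuned:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "iko-01/CosmoGPT2-Mini"  # repo id assumed from this card's metadata
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The concept of gravity can be explained as", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=80,
    do_sample=True,
    top_p=0.9,
    no_repeat_ngram_size=3,               # block verbatim 3-gram repetition
    repetition_penalty=1.2,               # discourage reusing recent tokens
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
)
decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(decoded)
```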

## Training Results

The model was trained on a T4 GPU (or equivalent) using optimized settings.

- **Final Training Loss:** 2.837890
- **Evaluation Loss:** 2.686130
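
Since loss is the only reported metric, a quick derived number: for a causal language model, perplexity is simply the exponential of the cross-entropy loss:

```python
import math

# Perplexity of a causal LM is exp(cross-entropy loss).
eval_loss = 2.686130
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # ≈ 14.67
```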

---

**Note:** This model is part of a training experiment using the Cosmopedia dataset.