Initialize project; model provided by the ModelHub XC community

Model: Abhinav-Anand/My-Brain-Hurts-Help Source: Original Platform
2026-05-20 22:05:56 +08:00
commit 887d90df49
10 changed files with 300509 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,81 @@
+---
+license: mit
+base_model: distilbert/distilgpt2
+tags:
+  - fine-tuned
+  - eli5
+  - explain-like-im-5
+  - education
+  - simple-explanations
+  - distilgpt2
+language:
+  - en
+pipeline_tag: text-generation
+---
+
+# DistilGPT2-MyBrainHurts (Full Fine-tune)
+
+## Overview
+A **fully fine-tuned** version of DistilGPT2 (82M parameters) specialized in explaining
+complex topics in simple, child-friendly language ("Explain Like I'm 5" style).
+Unlike a LoRA adapter, which leaves the base weights frozen, all model weights were
+updated during training, so the checkpoint loads on its own without the base model.
+
+## Key Features
+- **Ultra-small**: Only ~312 MB total
+- **Specialized**: All 82M parameters tuned for simple explanations
+- **25 topics**: Trained on science, nature, technology, and everyday phenomena
+- **Child-friendly**: Uses analogies and simple vocabulary
+
+## Topics Covered
+Gravity, Internet, Sky color, Photosynthesis, Electricity, Dinosaurs, Moon, Rain,
+Sleep, Magnets, Clouds, Leaf colors, Volcanoes, Oceans, Airplanes, Robots, Seasons,
+Sound, Stars, Computers, DNA, Bacteria, Rainbows, Ice cream melting, Thunder & Lightning
+
+## Usage
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model = AutoModelForCausalLM.from_pretrained("Ringkvist/DistilGPT2-MyBrainHurts")
+tokenizer = AutoTokenizer.from_pretrained("Ringkvist/DistilGPT2-MyBrainHurts")
+
+prompt = "Explain black holes like I'm 5:"
+inputs = tokenizer(prompt, return_tensors="pt")
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=150,
+    do_sample=True,  # sampling must be enabled for temperature/top_p to take effect
+    temperature=0.7, top_p=0.9,
+    repetition_penalty=1.2,
+)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
+
+## Training Details
+- **Method**: Full fine-tuning (all parameters)
+- **Base model**: [distilbert/distilgpt2](https://huggingface.co/distilbert/distilgpt2) (82M params)
+- **Dataset**: 25 hand-crafted ELI5 explanations
+- **Epochs**: 20
+- **Learning rate**: 5e-5 with cosine schedule
+- **Batch size**: 2 per step, with 4 gradient-accumulation steps (effective batch size 8)
+- **Hardware**: Apple Silicon Mac (CPU/MPS)
+
+## Full Fine-tune vs LoRA
+| Aspect | Full Fine-tune | LoRA |
+|--------|---------------|------|
+| Modified params | ALL (82M) | ~0.5% |
+| Upload size | Full model (~312 MB) | Small adapter (~1-2 MB) |
+| Base model needed | No | Yes |
+| Specialization | Can change every weight | Limited to low-rank updates |
+| Training time | Longer | Shorter |
+| Risk of forgetting | Higher | Lower |
+
+## Limitations
+- Small model (82M params) limits output quality
+- Trained on only 25 examples, so it may not generalize to topics outside the training set
+- Full fine-tuning means some base capabilities may be reduced (catastrophic forgetting)
+- Best used as a demonstration/educational project
+
+## Base Model
+- [distilbert/distilgpt2](https://huggingface.co/distilbert/distilgpt2) - 82M parameter distilled GPT-2
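
The batch-size and learning-rate entries under Training Details in the README above can be made concrete with a small calculation. The following is a minimal sketch, not the author's training script: it assumes a plain cosine decay with no warmup, and `TOTAL_STEPS` is a hypothetical value, since the README does not state the number of optimizer steps.

```python
import math

# Values taken from the Training Details list in the README.
MICRO_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 4
BASE_LR = 5e-5

# Hypothetical: the README does not give the total number of optimizer steps.
TOTAL_STEPS = 80


def effective_batch_size(micro_batch: int, accum_steps: int) -> int:
    """Number of examples that contribute to each optimizer update."""
    return micro_batch * accum_steps


def cosine_lr(step: int, total_steps: int, base_lr: float) -> float:
    """Cosine decay from base_lr at step 0 down to 0 at total_steps (no warmup assumed)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(MICRO_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(step, cosine_lr(step, TOTAL_STEPS, BASE_LR))
```

Under these assumptions the learning rate starts at the listed 5e-5, reaches half that value at the midpoint, and decays to zero at the final step. A real run may differ if the schedule includes warmup or a nonzero floor.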