ModelHub XC 887d90df49 Initialize project; model provided by the ModelHub XC community

Model: Abhinav-Anand/My-Brain-Hurts-Help
Source: Original Platform

2026-05-20 22:05:56 +08:00

.gitattributes
config.json
generation_config.json
merges.txt
model.safetensors
README.md
special_tokens_map.json
tokenizer_config.json
tokenizer.json
vocab.json

README.md


DistilGPT2-MyBrainHurts (Full Fine-tune)

Overview

A fully fine-tuned version of DistilGPT2 (82M parameters) that explains complex topics in simple, child-friendly language ("Explain Like I'm 5" style). Unlike a LoRA adapter, which trains a small set of added weights on top of a frozen base model, this model had all of its weights updated during training, so it is a standalone checkpoint that loads without a separate base model.

Key Features

Ultra-small: Only ~312 MB total
Specialized: All 82M parameters tuned for simple explanations
25 topics: Trained on science, nature, technology, and everyday phenomena
Child-friendly: Uses analogies and simple vocabulary
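
The parameter count and file size quoted above can be checked with a little arithmetic. The sketch below assumes the standard DistilGPT2 configuration (vocabulary of 50,257, 1,024 positions, hidden size 768, 6 layers), tied input and output embeddings, and float32 storage. None of these values are stated in this README, so compare them against config.json before relying on the result. The function name is illustrative.

```python
def gpt2_param_count(vocab=50257, n_positions=1024, n_embd=768, n_layer=6):
    """Count parameters of a GPT-2 style decoder with tied output embeddings."""
    # Token embeddings plus learned position embeddings.
    embeddings = vocab * n_embd + n_positions * n_embd
    layer_norm = 2 * n_embd  # weight + bias
    # Fused QKV projection plus the attention output projection (with biases).
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # Two-layer MLP with a 4x hidden expansion (with biases).
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    block = 2 * layer_norm + attention + mlp
    # Add the final layer norm after the last block.
    return embeddings + n_layer * block + layer_norm


params = gpt2_param_count()
size_mib = params * 4 / 2**20  # assumes 4 bytes per float32 weight
print(f"{params:,} parameters, about {size_mib:.0f} MiB")
# prints: 81,912,576 parameters, about 312 MiB
```

Under these assumptions the result is consistent with the 82M and ~312 MB figures; the file on disk also carries a small metadata header, so its exact size will differ slightly.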

Topics Covered

Gravity, Internet, Sky color, Photosynthesis, Electricity, Dinosaurs, Moon, Rain, Sleep, Magnets, Clouds, Leaf colors, Volcanoes, Oceans, Airplanes, Robots, Seasons, Sound, Stars, Computers, DNA, Bacteria, Rainbows, Ice cream melting, Thunder & Lightning

Usage

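from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo ID as given in this README; the hosting page lists the model as
# Abhinav-Anand/My-Brain-Hurts-Help, so adjust if this ID does not resolve.
model_id = "Ringkvist/DistilGPT2-MyBrainHurts"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = "Explain black holes like I'm 5:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=150,
    do_sample=True,  # ensures temperature and top_p take effect
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.2,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 defines no pad token
)
# The decoded text includes the prompt followed by the generated explanation.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))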

Training Details

Method: Full fine-tuning (all parameters)
Base model: distilbert/distilgpt2 (82M params)
Dataset: 25 hand-crafted ELI5 explanations
Epochs: 20
Learning rate: 5e-5 with cosine schedule
Batch size: 2 per step, with 4 gradient accumulation steps (effective batch size 8)
Hardware: Apple Silicon Mac (CPU/MPS)
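
To make the numbers above concrete, here is a rough sketch of how the effective batch size, the number of optimizer updates, and a cosine learning-rate curve follow from these settings. It assumes one optimizer update per effective batch (rounding up) and a cosine decay to zero with no warmup. Neither detail is stated here, and the actual trainer may count steps differently, so treat the step totals as approximate. The `cosine_lr` helper is hypothetical.

```python
import math

num_examples = 25
per_device_batch = 2
grad_accum_steps = 4
epochs = 20
peak_lr = 5e-5

effective_batch = per_device_batch * grad_accum_steps
# Assumption: one optimizer update per effective batch, rounding up.
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * epochs


def cosine_lr(step, total=total_steps, peak=peak_lr):
    """Cosine decay from the peak rate to zero, with no warmup (an assumption)."""
    progress = min(step, total) / total
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"effective batch: {effective_batch}, optimizer updates: {total_steps}")
for step in (0, total_steps // 2, total_steps):
    print(f"step {step:3d}: lr = {cosine_lr(step):.2e}")
```

With only 25 examples, each epoch amounts to a handful of updates, so 20 epochs means the model sees every example many times. That is one reason overfitting and forgetting are realistic concerns for a run like this.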

Full Fine-tune vs LoRA

Aspect	Full Fine-tune	LoRA
Trainable params	All (82M)	~0.5% (adapter weights only)
Upload size	Full model (~312 MB)	Small adapter (~1-2 MB)
Base model needed at load time	No	Yes
Specialization	Potentially deeper (every weight can change)	Typically lighter (base weights stay frozen)
Training time	Longer	Shorter
Risk of forgetting	Higher	Lower
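
For a sense of where the LoRA column's figures come from, the sketch below estimates adapter size for one hypothetical setup: rank 16 applied only to the fused attention projection (`c_attn`) in each of 6 layers. The rank, target module, and base parameter count are assumptions for illustration, not settings taken from this project; other choices would move the percentage up or down.

```python
base_params = 81_912_576  # standard DistilGPT2 count; assumed to match this model
n_layer, n_embd = 6, 768
rank = 16  # hypothetical LoRA rank

# LoRA adds two matrices per adapted layer: A (rank x in) and B (out x rank).
in_features, out_features = n_embd, 3 * n_embd  # c_attn is the fused QKV projection
lora_params = n_layer * rank * (in_features + out_features)

fraction = lora_params / base_params
adapter_mb = lora_params * 4 / 1e6  # assumes float32 adapter weights
print(f"{lora_params:,} adapter params ({fraction:.2%} of base), ~{adapter_mb:.1f} MB")
# prints: 294,912 adapter params (0.36% of base), ~1.2 MB
```

This lands in the same range as the table's rough figures: well under 1% of the weights and an adapter file of about a megabyte.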

Limitations

Small model (82M params), which limits output quality
Trained on only 25 examples, so it may not generalize to topics outside the training set
Full fine-tuning may reduce some of the base model's general capabilities (catastrophic forgetting)
Best used as a demonstration or educational project

Base Model

distilbert/distilgpt2 - 82M parameter distilled GPT-2