language, license, base_model, tags, datasets, pipeline_tag
| language | license | base_model | tags | datasets | pipeline_tag | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
apache-2.0 | HuggingFaceTB/SmolLM2-135M-Instruct |
|
|
text-generation |
chatOP — SmolLM2-135M Fine-tuned
A fine-tuned version of SmolLM2-135M-Instruct trained to act as a friendly study assistant for CS and ML concepts.
Model Details
| Base model | HuggingFaceTB/SmolLM2-135M-Instruct |
| Model type | Causal Language Model |
| Fine-tuning method | SFT + LoRA |
| Language | English |
| License | Apache 2.0 |
Training Details
Dataset
- Name: HuggingFaceTB/smoltalk
- Split used:
all - Format: Conversational chat messages (system, user, assistant)
Hyperparameters
| Parameter | Value |
|---|---|
| Learning rate | 2e-4 |
| Training steps | 200 |
| Batch size | 2 |
| Packing | True |
| Optimizer | AdamW |
| LR scheduler | Linear |
| Precision | fp16 |
LoRA Config
| Parameter | Value |
|---|---|
Rank (r) |
16 |
Alpha (lora_alpha) |
16 |
| Dropout | 0.05 |
| Target modules | all-linear |
| Bias | none |
| Trainable parameters | ~1.7M / 135M (1.26%) |
Hardware
- GPU: NVIDIA Tesla T4 (Google Colab)
- Training time: ~15–20 minutes
Example Outputs
Prompt: What is machine learning?
Machine learning is a field of AI where systems learn from data to make predictions or decisions without being explicitly programmed for each task.
Prompt: Explain transformers in simple terms.
A Transformer is a neural network architecture that uses attention mechanisms to understand relationships between all words in a sentence simultaneously, rather than reading them one by one like older models did.
Prompt: What is the difference between LoRA and full fine-tuning?
Full fine-tuning updates all parameters of a model which requires a lot of memory. LoRA instead freezes the base model and adds small trainable matrices to each layer, training only ~1% of parameters while achieving similar results.
Limitations
- Small model (135M parameters) — responses may be less detailed than larger models
- Trained for only 200 steps — longer training would improve quality
- Best for short explanations — may struggle with very complex multi-step reasoning
- Not suitable for factual lookups — may hallucinate specific facts or numbers
Training Framework
Built as a learning project while studying the HuggingFace LLM Course — specifically Chapter 11 (Fine-tuning LLMs).
Libraries used:
- 🤗 Transformers
- 🤗 PEFT
- TRL (SFTTrainer)
- 🤗 Datasets
- Accelerate
Author
Made by puravky — undergrad student exploring ML and AI.