language, license, base_model, tags, datasets, pipeline_tag
language license base_model tags datasets pipeline_tag
en
apache-2.0 HuggingFaceTB/SmolLM2-135M-Instruct
llm
fine-tuned
lora
sft
text-generation
student-project
HuggingFaceTB/smoltalk
text-generation

chatOP — SmolLM2-135M Fine-tuned

A fine-tuned version of SmolLM2-135M-Instruct trained to act as a friendly study assistant for CS and ML concepts.

Model Details

Base model HuggingFaceTB/SmolLM2-135M-Instruct
Model type Causal Language Model
Fine-tuning method SFT + LoRA
Language English
License Apache 2.0

Training Details

Dataset

Hyperparameters

Parameter Value
Learning rate 2e-4
Training steps 200
Batch size 2
Packing True
Optimizer AdamW
LR scheduler Linear
Precision fp16

LoRA Config

Parameter Value
Rank (r) 16
Alpha (lora_alpha) 16
Dropout 0.05
Target modules all-linear
Bias none
Trainable parameters ~1.7M / 135M (1.26%)

Hardware

  • GPU: NVIDIA Tesla T4 (Google Colab)
  • Training time: ~1520 minutes

Example Outputs

Prompt: What is machine learning?

Machine learning is a field of AI where systems learn from data to make predictions or decisions without being explicitly programmed for each task.

Prompt: Explain transformers in simple terms.

A Transformer is a neural network architecture that uses attention mechanisms to understand relationships between all words in a sentence simultaneously, rather than reading them one by one like older models did.

Prompt: What is the difference between LoRA and full fine-tuning?

Full fine-tuning updates all parameters of a model which requires a lot of memory. LoRA instead freezes the base model and adds small trainable matrices to each layer, training only ~1% of parameters while achieving similar results.

Limitations

  • Small model (135M parameters) — responses may be less detailed than larger models
  • Trained for only 200 steps — longer training would improve quality
  • Best for short explanations — may struggle with very complex multi-step reasoning
  • Not suitable for factual lookups — may hallucinate specific facts or numbers

Training Framework

Built as a learning project while studying the HuggingFace LLM Course — specifically Chapter 11 (Fine-tuning LLMs).

Libraries used:

  • 🤗 Transformers
  • 🤗 PEFT
  • TRL (SFTTrainer)
  • 🤗 Datasets
  • Accelerate

Author

Made by puravky — undergrad student exploring ML and AI.

Description
Model synced from source: puravky/chatOP
Readme 764 KiB
Languages
Jinja 100%