---
language:
- en
license: apache-2.0
base_model: HuggingFaceTB/SmolLM2-135M-Instruct
tags:
- llm
- fine-tuned
- lora
- sft
- text-generation
- student-project
datasets:
- HuggingFaceTB/smoltalk
pipeline_tag: text-generation
---

# chatOP: SmolLM2-135M Fine-tuned

A fine-tuned version of [SmolLM2-135M-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct) trained to act as a friendly study assistant for CS and ML concepts.

## Model Details

| | |
|---|---|
| **Base model** | HuggingFaceTB/SmolLM2-135M-Instruct |
| **Model type** | Causal Language Model |
| **Fine-tuning method** | SFT + LoRA |
| **Language** | English |
| **License** | Apache 2.0 |

## Training Details

### Dataset

- **Name:** [HuggingFaceTB/smoltalk](https://huggingface.co/datasets/HuggingFaceTB/smoltalk)
- **Split used:** `all`
- **Format:** Conversational chat messages (system, user, assistant)

### Hyperparameters

| Parameter | Value |
|---|---|
| Learning rate | 2e-4 |
| Training steps | 200 |
| Batch size | 2 |
| Packing | True |
| Optimizer | AdamW |
| LR scheduler | Linear |
| Precision | fp16 |

### LoRA Config

| Parameter | Value |
|---|---|
| Rank (`r`) | 16 |
| Alpha (`lora_alpha`) | 16 |
| Dropout | 0.05 |
| Target modules | all-linear |
| Bias | none |
| Trainable parameters | ~1.7M / 135M (1.26%) |

### Hardware

- **GPU:** NVIDIA Tesla T4 (Google Colab)
- **Training time:** ~15–20 minutes

## Example Outputs

**Prompt:** What is machine learning?

> Machine learning is a field of AI where systems learn from data to make predictions or decisions without being explicitly programmed for each task.

**Prompt:** Explain transformers in simple terms.

> A Transformer is a neural network architecture that uses attention mechanisms to understand relationships between all words in a sentence simultaneously, rather than reading them one by one like older models did.

**Prompt:** What is the difference between LoRA and full fine-tuning?
> Full fine-tuning updates all parameters of a model which requires a lot of memory. LoRA instead freezes the base model and adds small trainable matrices to each layer, training only ~1% of parameters while achieving similar results.

## Limitations

- Small model (135M parameters): responses may be less detailed than those of larger models
- Trained for only 200 steps: longer training would likely improve quality
- Best for short explanations: may struggle with complex multi-step reasoning
- Not suitable for factual lookups: may hallucinate specific facts or numbers

## Training Framework

Built as a learning project while studying the [Hugging Face LLM Course](https://huggingface.co/learn/llm-course), specifically Chapter 11 (Fine-tuning LLMs).

**Libraries used:**

- 🤗 Transformers
- 🤗 PEFT
- TRL (SFTTrainer)
- 🤗 Datasets
- Accelerate

## Author

Made by [puravky](https://huggingface.co/puravky), an undergrad student exploring ML and AI.
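
## Appendix: What the LoRA rank and alpha mean

The LoRA Config section lists a rank of 16 and an alpha of 16. As a rough illustration of what those two values control, the sketch below counts trainable parameters for a single linear layer and builds the adapted weight `W + (alpha / r) * B @ A`. It is not the training code for this model: the 576-wide layer shape and the helper names `matmul`, `lora_param_counts` and `effective_weight` are assumptions made for the example, and the math is written in plain Python rather than with PEFT.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_param_counts(d_in, d_out, r):
    """Return (full, lora) trainable-parameter counts for one linear layer.

    Full fine-tuning trains the whole d_out x d_in weight matrix.
    LoRA trains A (r x d_in) and B (d_out x r) instead.
    """
    return d_in * d_out, r * (d_in + d_out)


def effective_weight(W, A, B, alpha, r):
    """Return W + (alpha / r) * B @ A, the weight the adapted layer applies."""
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


# Values taken from the LoRA Config table.
R, ALPHA = 16, 16

# Hypothetical square layer: 576 is an assumed width, not read from the model.
full, lora = lora_param_counts(d_in=576, d_out=576, r=R)
print(f"full: {full}, lora: {lora}, ratio: {lora / full:.2%}")

# Tiny rank-1 check: B starts at zero, so the adapter changes nothing
# until training updates it.
W = [[1.0, 2.0], [3.0, 4.0]]   # frozen base weight (2 x 2)
A = [[0.5, -0.5]]              # trainable, 1 x 2
B = [[0.0], [0.0]]             # trainable, 2 x 1, zero-initialised
assert effective_weight(W, A, B, alpha=1, r=1) == W
```

With alpha equal to the rank, as in this card, the scale factor `alpha / r` is 1, so the low-rank update is added to the frozen weight unscaled. The per-layer ratio printed here depends on the assumed layer shape and will not match the whole-model percentage in the table.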