---
language: en
license: apache-2.0
library_name: transformers
tags:
- tinyllama
- peft
- merged
- lora
- fine-tuned
- pytorch
- text-generation
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
datasets:
- arif-butt/arifbutt_dataset
pipeline_tag: text-generation
---

# 🦙 TinyLlama PEFT Merged

> **A fully merged, production-ready TinyLlama model fine-tuned with PEFT LoRA**

---

## 📌 Quick Facts

| Attribute | Value |
|-----------|-------|
| **Model Size** | 2.2 GB |
| **Parameters** | 1.1 Billion |
| **Format** | PyTorch Safetensors |
| **Precision** | FP16 |
| **Context** | 2048 tokens |
| **Training Framework** | PEFT + TRL |
| **Inference** | No PEFT required |

---

## 🚀 Quick Start

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the merged model
tokenizer = AutoTokenizer.from_pretrained("arif-butt/tinyllama-peft-merged")
model = AutoModelForCausalLM.from_pretrained(
    "arif-butt/tinyllama-peft-merged",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Generate
prompt = "Q: What courses does Arif teach?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## 📦 What's Inside

```
tinyllama-peft-merged/
├── model.safetensors          # 2.2 GB, merged weights
├── config.json                # Model architecture
├── generation_config.json     # Default generation settings
├── tokenizer.json             # Vocabulary (1.76 MB)
├── tokenizer_config.json      # Tokenizer settings
└── special_tokens_map.json    # Special tokens
```

No adapter files. No PEFT needed. Just load and go.

## 🔧 Generation Settings

```python
# Reuses `model`, `tokenizer`, and `inputs` from the Quick Start snippet
outputs = model.generate(
    **inputs,
    max_new_tokens=150,
    temperature=0.7,
    top_p=0.95,
    do_sample=True,
    repetition_penalty=1.1,
    pad_token_id=tokenizer.eos_token_id,
)
```

## 💬 Prompt Format

```
Q: Your question here?
A:
```

Example:

```
Q: What is deep learning?
A: Deep learning is a subset of machine learning...
```

Q: What is Python?
A: Python is a high-level, interpreted programming language known for its simple, readable syntax. It supports multiple programming paradigms including object-oriented, imperative, and functional programming.

Q: Explain gradient descent
A: Gradient descent is an optimization algorithm used to minimize the loss function in machine learning models. It works by iteratively moving parameters in the direction of the negative gradient.

Q: Name Arif's courses
A: Dr. Muhammad Arif Butt teaches Python Programming, Data Structures & Algorithms, Machine Learning, and Deep Learning courses.

```
Epoch 1: ████████████████░░░░ 0.8
Epoch 2: ████████████████████ 0.4
Epoch 3: ████████████████████ 0.05
```
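The `Q:`/`A:` prompt format shown in this card can be wrapped in two small helpers. This is only a minimal sketch: `build_prompt` and `extract_answer` are hypothetical names, not part of this repository or of `transformers`, and cutting the text at the next `Q:` line is an assumption about how one might trim a generation that runs on into a new question.

```python
def build_prompt(question: str) -> str:
    # Format a question in the Q:/A: layout used by this model card.
    return f"Q: {question.strip()}\nA:"


def extract_answer(decoded: str, prompt: str) -> str:
    # Assumption: the decoded text normally starts with the prompt, as it does
    # when decoding the full sequence returned by generate().
    if decoded.startswith(prompt):
        text = decoded[len(prompt):]
    else:
        # Fallback: keep everything after the first "A:" marker.
        _, _, text = decoded.partition("A:")
    # Assumption: anything after a following "Q:" line is an unwanted new question.
    return text.split("\nQ:", 1)[0].strip()


prompt = build_prompt("What is deep learning?")
decoded = (
    prompt
    + " Deep learning is a subset of machine learning."
    + "\nQ: What is Python?\nA: ..."
)
print(extract_answer(decoded, prompt))
# prints: Deep learning is a subset of machine learning.
```

With the model loaded as in Quick Start, the same helper could be applied to `tokenizer.decode(outputs[0], skip_special_tokens=True)`. Decoding may not reproduce the prompt character for character, which is why the sketch includes the `A:` fallback.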