---
library_name: transformers
license: apache-2.0
datasets:
- sahil2801/CodeAlpaca-20k
base_model:
- meta-llama/Llama-3.2-1B
---

# 🧠 Llama-3.2-1B Code Solver (QLoRA Fine-Tuned)

A lightweight yet powerful **code-focused language model** fine-tuned from **Meta Llama-3.2-1B** using **QLoRA (4-bit)** on the **CodeAlpaca-20K** dataset. Designed for **efficient code generation, reasoning, and problem solving** on limited GPU resources.

> 🚀 Trained on a single Tesla P100 GPU
> ⚡ Optimized for Kaggle, Colab, and low-VRAM environments
> 🧩 Ideal for research, education, and rapid prototyping

---

## 🔍 Model Overview

| Attribute | Value |
|-----------|-------|
| **Base Model** | `meta-llama/Llama-3.2-1B` |
| **Model Type** | Decoder-only causal language model |
| **Fine-Tuning Method** | QLoRA (4-bit quantization + LoRA) |
| **LoRA Rank** | 16 |
| **Task Domain** | Code generation & code reasoning |
| **Training Samples** | 10,000 |
| **Training Time** | ~5 hours |
| **Hardware** | NVIDIA Tesla P100 |
| **Precision** | 4-bit (NF4) |
| **Frameworks** | Hugging Face Transformers, PEFT, bitsandbytes |

---

## 🎯 What This Model Is Good At

- 🧑‍💻 Code generation (Python-focused, but generalizable)
- 🧠 Step-by-step coding reasoning
- 🧪 Algorithmic problem solving
- 📘 Educational coding assistance
- ⚙️ Running efficiently on **low-VRAM GPUs**

---

## 📚 Training Dataset

### **CodeAlpaca-20K**

A high-quality instruction-tuning dataset in the Alpaca format, specialized for coding tasks.

- **Total dataset size**: 20,000 samples
- **Used for training**: 10,000 samples (50%)
- **Data format**:

```json
{
  "instruction": "Describe the coding task",
  "input": "Optional context or input code",
  "output": "Expected code solution"
}
```

- **Task types**:
  - Algorithm implementation
  - Code completion
  - Debugging
  - Function writing
  - Problem solving

---

## 🏗️ Training Methodology

This model was fine-tuned using **QLoRA**, which makes it practical to adapt large language models on limited hardware. A configuration sketch follows the table in the next section.

### Key Techniques Used

* **4-bit quantization (NF4)** via bitsandbytes
* **LoRA adapters** applied to the attention layers
* **Frozen base model weights**
* **Low-rank updates only**

### Why QLoRA?

* 🔻 Drastically reduces GPU memory usage
* ⚡ Enables training on consumer-grade GPUs
* 📈 Maintains strong downstream performance

---

## ⚙️ Training Configuration

| Parameter | Value |
| --------------------- | ----------------------- |
| Max Sequence Length | 1024 |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 32 |
| LoRA Dropout | 0.05 |
| Optimizer | AdamW |
| Learning Rate | 2e-4 |
| Batch Size | Small (GPU-constrained) |
| Gradient Accumulation | Enabled |
| Quantization | 4-bit (NF4) |

---
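The original training script is not part of this card. The sketch below reconstructs the setup above with Transformers, PEFT, and bitsandbytes. The `target_modules` list, batch size, accumulation steps, and epoch count are assumptions (the card only states "attention layers", "small", and "enabled"); the remaining values mirror the table.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = "meta-llama/Llama-3.2-1B"

# 4-bit NF4 quantization as stated above; float16 compute is assumed because
# the Tesla P100 does not support bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)  # freezes the quantized base weights

# LoRA on the attention projections; the exact module names are an assumption,
# since the card only says the adapters target "attention layers".
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter weights train

# Batch size, accumulation steps, and epoch count are placeholders for the
# card's "small (GPU-constrained)" and "enabled" entries.
args = TrainingArguments(
    output_dir="llama-3.2-1b-code-solver",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    learning_rate=2e-4,
    optim="adamw_torch",
    fp16=True,
    logging_steps=50,
)
```

Training then proceeds with a standard `Trainer` (or TRL's `SFTTrainer`) over the 10,000 CodeAlpaca samples, with inputs truncated to the 1024-token maximum sequence length.

---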
## 🚀 Usage

### Load the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "YOUR_USERNAME/llama-3.2-1b-code-solver"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    # The bare load_in_4bit kwarg is deprecated in recent transformers
    # releases; pass a BitsAndBytesConfig instead.
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)
```

If this repository hosts only the LoRA adapter rather than merged weights, see the appendix at the end of this card.

### Example Inference

```python
prompt = "Write a Python function to check if a number is prime."

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## 🧪 Evaluation Notes

* This model is **instruction-tuned**, not benchmark-optimized
* No formal benchmarks (HumanEval / MBPP) were run
* Best evaluated through **qualitative code generation**

## ⚠️ Limitations

* 1B parameters → limited long-context reasoning
* Not optimized for natural-language chat
* May hallucinate on complex or ambiguous prompts
* English-centric training data

## 🧭 Intended Use

✅ **Allowed**

* Research and experimentation
* Coding assistants
* Educational tools
* Prototyping LLM systems

## 🙏 Acknowledgements

* **Meta AI** for Llama 3.2
* **CodeAlpaca** dataset creators
* The **Hugging Face** ecosystem
* **QLoRA & PEFT** authors
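## 🧩 Appendix: Loading as a PEFT Adapter

QLoRA training produces small LoRA adapter weights rather than a full checkpoint. If this repository hosts only the adapter (an `adapter_config.json` plus adapter weights) instead of merged weights, it can be attached to the 4-bit base model with PEFT. This is a minimal sketch under that assumption; the repository name is a placeholder.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_id = "meta-llama/Llama-3.2-1B"
adapter_id = "YOUR_USERNAME/llama-3.2-1b-code-solver"  # placeholder adapter repo

# Load the base model in 4-bit NF4, matching the training-time quantization.
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    device_map="auto",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    ),
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA adapter
model.eval()
```

To publish standalone merged weights instead, load the base model unquantized and call `merge_and_unload()` on the PEFT model to fold the adapters into the base layers.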