Model: hareeswar/Distilled-Qwen-1.5B-Coder Source: Original Platform
| language | license | pipeline_tag | base_model |
|---|---|---|---|
| | apache-2.0 | text-generation | Qwen/Qwen2.5-Coder-1.5B-Instruct |
LoRA Distillation Evaluation Report
1. Executive Summary
This report summarizes the final evaluation of the reasoning distillation pipeline. Fine-tuning a 1.5B-parameter base model with LoRA on the Chain-of-Thought (CoT) outputs of a 7B-parameter teacher model raised the average pass rate from 64.5% to 79.8%, an absolute improvement of 15.3 percentage points.
2. Model Comparison
| Model | Role | Average Pass Rate |
|---|---|---|
| Qwen2.5-Coder-7B (Teacher) | Dataset Generator | 96.9% |
| Qwen2.5-Coder-1.5B (Base) | Baseline Coder | 64.5% |
| Qwen2.5-Coder-1.5B (Distilled/LoRA) | Distilled Agent | 79.8% |
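The arithmetic behind the comparison can be checked with a short script. This is a minimal sketch that uses only the three pass rates from the table; the "gap closed" figure (how much of the base-to-teacher gap the distilled model recovers) is a derived illustration, not a metric reported by the evaluation, and the function and key names are hypothetical.

```python
# Average pass rates (percent), copied from the comparison table.
PASS_RATES = {
    "teacher": 96.9,
    "base": 64.5,
    "distilled": 79.8,
}


def summarize(rates):
    """Return the gain in percentage points, the relative gain, and the
    share of the base-to-teacher gap closed by the distilled model."""
    base = rates["base"]
    distilled = rates["distilled"]
    teacher = rates["teacher"]

    absolute_gain = distilled - base                    # percentage points
    relative_gain = absolute_gain / base * 100          # percent of baseline
    gap_closed = absolute_gain / (teacher - base) * 100  # derived, illustrative

    return {
        "absolute_gain_pp": round(absolute_gain, 1),
        "relative_gain_pct": round(relative_gain, 1),
        "gap_closed_pct": round(gap_closed, 1),
    }


summary = summarize(PASS_RATES)
print(summary)
```

With the table's values this gives a gain of 15.3 percentage points, roughly 23.7% relative to the baseline, and a little under half of the gap to the teacher.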
3. Key Observations & Analysis
The Base Model's Weakness
Before fine-tuning, the 1.5B base model tended to rush into code generation, producing brittle algorithms that failed edge cases. It occasionally "cheated" by leaning on built-in Python functions (e.g., calling .sort(), which is O(n log n), on problems with an O(log n) requirement), and its structural logic failed on complex Dynamic Programming problems and boundary checks.
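To make the complexity mismatch concrete, here is a hypothetical example of the kind of shortcut described above; it is not taken from the evaluation set. Both functions return the same answers on already-sorted input, so a purely functional test would pass either one, but only the first meets an O(log n) requirement.

```python
from bisect import bisect_left


def find_target_log_n(sorted_nums, target):
    """Binary search on already-sorted input: O(log n) comparisons."""
    i = bisect_left(sorted_nums, target)
    if i < len(sorted_nums) and sorted_nums[i] == target:
        return i
    return -1


def find_target_with_sort(sorted_nums, target):
    """Same answers on sorted input, but the redundant sort costs
    O(n log n) and the membership test and index lookup are O(n)."""
    ordered = sorted(sorted_nums)  # violates the O(log n) requirement
    return ordered.index(target) if target in ordered else -1


nums = [1, 3, 5, 7, 9]
print(find_target_log_n(nums, 7), find_target_with_sort(nums, 7))
print(find_target_log_n(nums, 4), find_target_with_sort(nums, 4))
```

This is why a pass rate based only on output correctness can overstate quality: the complexity constraint has to be checked separately.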
The LoRA Model's Strength (Distilled Reasoning)
Injecting [REASONING] tokens during Supervised Fine-Tuning (SFT) led the LoRA-adapted 1.5B model to adopt a "think-before-acting" paradigm.
- It achieved near-perfect scores (95%+) on complex algorithmic edge cases.
- It demonstrated active problem deconstruction before writing Python code.
- Overall Delta: +10 problems fully solved, raising the average pass rate from the 64.5% baseline to 79.8%.
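One way the [REASONING] tokens might be laid out in an SFT example is sketched below. The report does not give the actual template, so the closing tag, the prompt/completion field names, and both helper functions are assumptions made purely for illustration.

```python
REASONING_OPEN = "[REASONING]"    # tag named in the report
REASONING_CLOSE = "[/REASONING]"  # assumed closing tag, not given in the report


def build_sft_example(problem, teacher_reasoning, teacher_code):
    """Build one training record: the teacher's reasoning is wrapped in
    tags and placed before the code the student should learn to emit."""
    completion = (
        f"{REASONING_OPEN}\n{teacher_reasoning.strip()}\n{REASONING_CLOSE}\n"
        f"{teacher_code.strip()}\n"
    )
    return {"prompt": problem.strip(), "completion": completion}


def split_completion(completion):
    """Split a completion into (reasoning, code).

    Returns an empty reasoning string if the tags are missing or misordered.
    """
    start = completion.find(REASONING_OPEN)
    end = completion.find(REASONING_CLOSE)
    if start == -1 or end == -1 or end < start:
        return "", completion.strip()
    reasoning = completion[start + len(REASONING_OPEN):end].strip()
    code = completion[end + len(REASONING_CLOSE):].strip()
    return reasoning, code


example = build_sft_example(
    "Return the index of target in a sorted list, or -1.",
    "The input is sorted, so binary search gives O(log n).",
    "def search(nums, target):\n    ...",
)
reasoning, code = split_completion(example["completion"])
print(reasoning)
print(code)
```

Keeping the reasoning in a delimited span makes it straightforward to strip it at inference time or to check whether the student produced any reasoning before the code.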