Initialize project; model provided by the ModelHub XC community
Model: hareeswar/Distilled-Qwen-1.5B-Coder Source: Original Platform
37
README.md
Normal file
@@ -0,0 +1,37 @@
---
language:
- en
license: apache-2.0
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-Coder-1.5B-Instruct
---

# LoRA Distillation Evaluation Report

## 1. Executive Summary
This report presents the final evaluation metrics for the reasoning distillation pipeline. Fine-tuning a 1.5B-parameter base model on the Chain-of-Thought (CoT) outputs of a 7B-parameter teacher model raised the average pass rate from 64.5% to 79.8%, an absolute improvement of **+15.3** percentage points in autonomous coding performance.

---

## 2. Model Comparison

| Model | Role | Average Pass Rate |
| :--- | :--- | :---: |
| **Qwen2.5-Coder-7B (Teacher)** | Dataset Generator | **96.9%** |
| **Qwen2.5-Coder-1.5B (Base)** | Baseline Coder | **64.5%** |
| **Qwen2.5-Coder-1.5B (Distilled/LoRA)** | Distilled Agent | **79.8%** |

---

## 3. Key Observations & Analysis

### The Base Model's Weakness
Before any fine-tuning, the 1.5B base model tended to rush into code generation, producing brittle algorithms that failed on edge cases. It occasionally "cheated" by leaning on built-in Python functions (e.g., calling `.sort()`, which runs in O(n log n), on problems that required an O(log n) solution), and its structural logic broke down on complex Dynamic Programming problems and boundary checks.
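To illustrate the kind of shortcut described above, here is a hypothetical task that is not taken from the evaluation set: find the insertion index of a target in an already-sorted list in O(log n). The function names are assumptions made for this sketch. The first version leans on `.sort()` and a scan, so it returns the right answer but misses the complexity requirement; the second uses a binary search.

```python
def insert_index_with_sort(values, target):
    """Shortcut: correct result, but the sort alone costs O(n log n)."""
    ordered = list(values)
    ordered.sort()  # unnecessary work: the input is already sorted
    for i, v in enumerate(ordered):  # the linear scan adds another O(n)
        if v >= target:
            return i
    return len(ordered)


def insert_index_binary(values, target):
    """Binary search over the sorted input: O(log n) comparisons."""
    lo, hi = 0, len(values)
    while lo < hi:
        mid = (lo + hi) // 2
        if values[mid] < target:
            lo = mid + 1  # target belongs to the right of mid
        else:
            hi = mid  # target belongs at mid or to its left
    return lo
```

Both functions return the same index for any sorted input, so a check on outputs alone would not tell them apart.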

### The LoRA Model's Strength (Distilled Reasoning)
By injecting `[REASONING]` tokens during Supervised Fine-Tuning (SFT), the LoRA adapter pushed the 1.5B model to adopt a "think-before-acting" approach.
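The report does not specify the exact training format. As a minimal sketch, assuming each teacher output is split into a reasoning trace and a final solution, one SFT example might be laid out as below. The function name, the `prompt`/`completion` keys, the `Problem:` header, and the closing `[/REASONING]` marker are all hypothetical; only the `[REASONING]` marker comes from the report.

```python
def format_sft_example(problem: str, reasoning: str, solution: str) -> dict:
    """Build one prompt/completion pair with the reasoning placed before the code.

    Hypothetical layout: the closing [/REASONING] marker and the dict keys
    are assumptions, not the format used by the original pipeline.
    """
    prompt = f"Problem:\n{problem.strip()}\n"
    completion = (
        f"[REASONING]\n{reasoning.strip()}\n[/REASONING]\n"
        f"{solution.strip()}\n"
    )
    return {"prompt": prompt, "completion": completion}


# Illustrative inputs; the solution body is deliberately left as a stub.
example = format_sft_example(
    problem="Return the index of target in a sorted list, or -1.",
    reasoning="The list is sorted, so binary search gives O(log n).",
    solution="def search(nums, target):\n    ...",
)
print(example["completion"])
```

With this layout the reasoning always precedes the code in the completion, so the student is trained to emit its plan first.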
* The distilled model achieved near-perfect scores (95%+) on complex algorithmic edge cases.
* It broke problems down explicitly before writing Python code.
* **Overall Delta:** 10 more problems fully solved, raising the average pass rate from the 64.5% baseline to 79.8%.
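The headline figures can be re-derived from the reported average pass rates. The sketch below is illustrative only: the relative gain and the "gap closed" ratio are derived quantities that this report does not itself state.

```python
# Average pass rates (percent) copied from the Model Comparison table.
teacher = 96.9
base = 64.5
distilled = 79.8

# Absolute gain in percentage points.
absolute_gain = distilled - base

# Relative gain over the baseline score.
relative_gain = absolute_gain / base * 100

# Share of the teacher-to-base gap closed by distillation
# (a derived metric, not one reported above).
gap_closed = absolute_gain / (teacher - base) * 100

print(f"absolute gain: {absolute_gain:+.1f} points")
print(f"relative gain: {relative_gain:+.1f}%")
print(f"gap to teacher closed: {gap_closed:.1f}%")
```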

---