Initialize project; model provided by the ModelHub XC community
Model: khazarai/Qwen3-4B-Qwen3.6-plus-Reasoning-Distilled-GGUF Source: Original Platform
---
tags:
- gguf
- llama.cpp
- unsloth
license: apache-2.0
datasets:
- khazarai/qwen3.6-plus-high-reasoning-500x
language:
- en
base_model:
- khazarai/Qwen3-4B-Qwen3.6-plus-Reasoning-Distilled
pipeline_tag: text-generation
metrics:
- accuracy
---

# Qwen3-4B-Qwen3.6-plus-Reasoning-Distilled-GGUF : GGUF

## Model: khazarai/Qwen3-4B-Qwen3.6-plus-Reasoning-Distilled

- **Success Rate**: 75.64%

## Model: Qwen/Qwen3-4B-Thinking-2507

- **Success Rate**: 73.73%

Both success rates were measured on the same benchmark:

- **Benchmark**: khazarai/Multi-Domain-Reasoning-Benchmark
- **Total Questions**: 100

This is a reasoning-distilled variant of Qwen/Qwen3-4B-Thinking-2507, fine-tuned with QLoRA via Unsloth to reproduce the reasoning style of the larger Qwen3.6-plus teacher model.
The distillation targets the "rambling" and "uncertainty" that smaller models often show on complex tasks, replacing them with concise, structured, and actionable solution paths.

## Reasoning Comparison: Base vs. Distilled

The main improvement in this model is qualitative: its reasoning is better structured. Below is a summary of the differences observed when solving complex graph problems (e.g., Shortest Path with Edge Reversals):

**Base Model (Qwen3-4B-Thinking)**:

- Style: Stream-of-consciousness, exploratory, and verbose.
- Behavior: The model often talks to itself ("Hmm, interesting", "Wait, no"), struggles to interpret problem constraints correctly on the first try, and enters loops of self-correction. It resembles a student working the problem out aloud.
- Output: Low signal-to-noise ratio; solution paths are often buried under paragraphs of hesitation.

**Distilled Model (Qwen3-4B-Qwen3.6-plus-Reasoning-Distilled)**:

- Style: Structured, professional, and report-oriented.
- Behavior: The model analyzes the problem immediately, separates concerns (Input, Output, Constraints), and formulates a concrete algorithm plan (e.g., State-Space Dijkstra). It proceeds with confidence and avoids logical dead ends.
- Output: Provides a clean breakdown: Problem Analysis -> Intuition -> Algorithm -> Complexity Analysis -> Pseudocode.

**Verdict**: The distilled model turns the raw potential of the base model into structured output that is easier to act on in engineering work.

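
To make the "State-Space Dijkstra" plan mentioned above more concrete, here is a minimal Python sketch. The exact problem statement is not given in this README, so the rules below are assumptions chosen purely for illustration: edges are directed and non-negatively weighted, an edge may be traversed backward at `reverse_factor` times its weight, and at most `max_reversals` backward traversals are allowed. The function name and both parameters are hypothetical. The key idea is that the search state is the pair (node, reversals used), not the node alone.

```python
import heapq
from collections import defaultdict


def shortest_path_with_reversals(edges, source, target,
                                 max_reversals=1, reverse_factor=2):
    """Return the minimum cost from source to target, or None if unreachable.

    Assumed rules (hypothetical, for illustration only):
      * edges is a list of directed (u, v, w) triples with w >= 0;
      * following an edge forward costs w;
      * following it backward costs reverse_factor * w and consumes
        one of the max_reversals allowed reversals.
    """
    # node -> list of (neighbor, step_cost, reversals_consumed)
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w, 0))
        adj[v].append((u, reverse_factor * w, 1))

    # Best known cost for each state (node, reversals_used).
    best = {(source, 0): 0}
    heap = [(0, source, 0)]
    while heap:
        cost, node, used = heapq.heappop(heap)
        if cost > best.get((node, used), float("inf")):
            continue  # stale heap entry, a cheaper one was already processed
        if node == target:
            return cost  # states are popped in non-decreasing cost order
        for nxt, step, consumed in adj[node]:
            new_used = used + consumed
            if new_used > max_reversals:
                continue  # reversal budget exhausted
            new_cost = cost + step
            if new_cost < best.get((nxt, new_used), float("inf")):
                best[(nxt, new_used)] = new_cost
                heapq.heappush(heap, (new_cost, nxt, new_used))
    return None


example_edges = [(0, 1, 3), (3, 1, 1), (2, 3, 4), (0, 2, 2)]
print(shortest_path_with_reversals(example_edges, 0, 3))                   # 5
print(shortest_path_with_reversals(example_edges, 0, 3, max_reversals=0))  # 6
```

In the example, the forward-only route 0 -> 2 -> 3 costs 6, while reversing the edge 3 -> 1 (at double cost) gives 0 -> 1 -> 3 for a total of 5. With V nodes, E edges, and a reversal budget of k, the state graph has V(k+1) states, so the running time under these assumptions is O((V + E) k log(V k)).
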
## Model Specifications

- **Base Model**: Qwen/Qwen3-4B-Thinking-2507
- **Model Type**: Reasoning Distillation (QLoRA)
- **Framework**: Unsloth
- **Fine-tuning Method**: QLoRA (PEFT)
- **Teacher Model**: Qwen3.6-plus
- **Distillation Dataset**: khazarai/qwen3.6-plus-high-reasoning-500x
  - Total Tokens: 1,739,249
  - Max Sequence Length: 6,500 tokens

## Provided Quants

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar-sized non-IQ quants)

| Type | Size (GB) | Notes |
|:-----|----------:|:------|
| Q4_K_1 | 2.3 | |
| Q6_K | 3.3 | very good quality |
| Q8_0 | 4.2 | fast, best quality |
| bf16 | 8.0 | 16 bpw, overkill |

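
The "16 bpw" note on the bf16 row allows a rough estimate of the effective bits per weight (bpw) of the other files from their listed sizes. The following is a small sketch, assuming the bf16 file stores every weight at exactly 16 bits and ignoring metadata overhead; the helper name `estimate_bits_per_weight` is hypothetical, and because the sizes are rounded to 0.1 GB the results are only approximate.

```python
# File sizes in GB, copied from the quant table.
QUANT_SIZES_GB = {"Q4_K_1": 2.3, "Q6_K": 3.3, "Q8_0": 4.2, "bf16": 8.0}


def estimate_bits_per_weight(sizes_gb, reference="bf16", reference_bpw=16.0):
    """Estimate bits per weight for each file relative to a reference file.

    Assumption: the reference file uses exactly reference_bpw bits for every
    weight, so size ratios translate directly into bits-per-weight ratios.
    """
    reference_size = sizes_gb[reference]
    return {
        name: round(size / reference_size * reference_bpw, 1)
        for name, size in sizes_gb.items()
    }


print(estimate_bits_per_weight(QUANT_SIZES_GB))
# {'Q4_K_1': 4.6, 'Q6_K': 6.6, 'Q8_0': 8.4, 'bf16': 16.0}
```

The same size ratio gives a quick feel for the memory saved: under these assumptions the Q8_0 file is roughly half the size of bf16, and the smallest file is under a third.
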

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter:
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9