GPT-5-Distill-Qwen3-4B-Inst…/README.md

---
tags:
- gguf
- llama.cpp
license: apache-2.0
datasets:
- Jackrong/ShareGPT-Qwen3-235B-A22B-Instuct-2507
language:
- en
- zh
base_model:
- Qwen/Qwen3-4B-Instruct-2507
---


# GPT-5-Distill-Qwen3-4B-Instruct-2507

![Base Model](https://img.shields.io/badge/Base_Model-Qwen3--4B--Instruct-0088CC?style=flat)
![Distillation](https://img.shields.io/badge/Distillation-GPT--5_Responses-8A2BE2?style=flat)
![Language](https://img.shields.io/badge/Language-English_%7C_Chinese-blue?style=flat)
![Context](https://img.shields.io/badge/Context-32K_Tokens-success?style=flat)
![Format](https://img.shields.io/badge/Format-GGUF_%7C_llama.cpp-yellow?style=flat)
![License](https://img.shields.io/badge/License-Apache_2.0-green?style=flat)

<img src="https://cdn-uploads.huggingface.co/production/uploads/66309bd090589b7c65950665/sk5gVFD15S0UNMek3gU0o.png" width="800"/>

<img src="https://cdn-uploads.huggingface.co/production/uploads/66309bd090589b7c65950665/vGzi5hSHJJ72ysJuM5EAv.png" width="800"/>

<img src="https://cdn-uploads.huggingface.co/production/uploads/66309bd090589b7c65950665/j39PSDVoQmK4EI9pLANpa.png" width="800"/>

**Model Type**: Instruction-tuned conversational LLM
  Supports LoRA adapters and full-finetuned models for inference
- **Base Model**: `Qwen/Qwen3-4B-Instruct-2507`
- **Parameters**:  4B
- **Training Method**:
  - Supervised Fine-Tuning (SFT) on ShareGPT data
  - Knowledge distillation from LMSYS GPT-5 responses
- **Supported Languages**: Chinese, English, mixed inputs/outputs
- **Max Context Length**: Up to **32K tokens** (`max_seq_length = 32768`)

This model is trained on ShareGPT-Qwen3 instruction datasets and distilled toward the conversational style and quality of GPT-5. It aims to achieve high-quality, natural-sounding dialogues with low computational overhead—perfect for lightweight applications without sacrificing responsiveness.

---

## 2. Intended Use Cases

### ✅ Recommended:

- Casual chat in Chinese/English
- General knowledge explanations & reasoning guidance
- Code suggestions and simple debugging tips
- Writing assistance: editing, summarizing, rewriting
- Role-playing conversations (with well-designed prompts)

### ⚠️ Not Suitable For:

- High-risk decision-making:
  - Medical diagnosis, mental health support
  - Legal advice, financial investment recommendations
- Real-time factual tasks (e.g., news, stock updates)
- Authoritative judgment on sensitive topics

> **Note**: Outputs are for reference only and not intended as the sole basis for critical decisions.

---

## 3. Training Data & Distillation Process

### Key Datasets:

#### (1) ds1: ShareGPT-Qwen3 Instruction Dataset
- Source: `Jackrong/ShareGPT-Qwen3-235B-A22B-Instuct-2507`
- Purpose:
  - Provides diverse instruction-response pairs
  - Supports multi-turn dialogues and context awareness
- Processing:
  - Cleaned for quality and relevance
  - Standardized into `instruction`, `input`, `output` format

#### (2) ds2: LMSYS GPT-5 Teacher Response Data
- Source: `ytz20/LMSYS-Chat-GPT-5-Chat-Response`
- Filtering:
  - Only kept samples with `flaw == "normal"`
  - Removed hallucinations and inconsistent responses
- Purpose:
  - Distillation target for conversational quality
  - Enhances clarity, coherence, and fluency

### Training Flow:

1. Prepare unified Chat-formatted dataset
2. Fine-tune base Qwen3-4B-Instruct-2507 via SFT
3. Conduct knowledge distillation using GPT-5's normal responses as teacher outputs
4. Balance style imitation with semantic fidelity to ensure robustness

> ⚖️ **Note**: This work is based on publicly available, non-sensitive datasets and uses them responsibly under fair use principles.

---

## 4. Key Features Summary

| Feature | Description |
|--------|-------------|
| **Lightweight** | ~4B parameter model – fast inference, low resource usage |
| **Distillation-Style Responses** | Mimics GPT-5’s conversational fluency and helpfulness |
| **Highly Conversational** | Excellent for chatbot-style interactions with rich dialogue flow |
| **Multilingual Ready** | Seamless support for Chinese and English |

---

## 5. Acknowledgements

We thank:
- LMSYS team for sharing GPT-5 response data
- Jackrong for the ShareGPT-Qwen3 dataset
- Qwen team for releasing `Qwen3-4B-Instruct`

This project is an open research effort aimed at making high-quality conversational AI accessible with smaller models.

---