120 lines
4.3 KiB
Markdown
120 lines
4.3 KiB
Markdown
|
|
---
|
|||
|
|
tags:
|
|||
|
|
- gguf
|
|||
|
|
- llama.cpp
|
|||
|
|
license: apache-2.0
|
|||
|
|
datasets:
|
|||
|
|
- Jackrong/ShareGPT-Qwen3-235B-A22B-Instuct-2507
|
|||
|
|
language:
|
|||
|
|
- en
|
|||
|
|
- zh
|
|||
|
|
base_model:
|
|||
|
|
- Qwen/Qwen3-4B-Instruct-2507
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
|
|||
|
|
# GPT-5-Distill-Qwen3-4B-Instruct-2507
|
|||
|
|
|
|||
|
|

|
|||
|
|

|
|||
|
|

|
|||
|
|

|
|||
|
|

|
|||
|
|

|
|||
|
|
|
|||
|
|
<img src="https://cdn-uploads.huggingface.co/production/uploads/66309bd090589b7c65950665/sk5gVFD15S0UNMek3gU0o.png" width="800"/>
|
|||
|
|
|
|||
|
|
<img src="https://cdn-uploads.huggingface.co/production/uploads/66309bd090589b7c65950665/vGzi5hSHJJ72ysJuM5EAv.png" width="800"/>
|
|||
|
|
|
|||
|
|
<img src="https://cdn-uploads.huggingface.co/production/uploads/66309bd090589b7c65950665/j39PSDVoQmK4EI9pLANpa.png" width="800"/>
|
|||
|
|
|
|||
|
|
**Model Type**: Instruction-tuned conversational LLM
|
|||
|
|
Supports LoRA adapters and full-finetuned models for inference
|
|||
|
|
- **Base Model**: `Qwen/Qwen3-4B-Instruct-2507`
|
|||
|
|
- **Parameters**: 4B
|
|||
|
|
- **Training Method**:
|
|||
|
|
- Supervised Fine-Tuning (SFT) on ShareGPT data
|
|||
|
|
- Knowledge distillation from LMSYS GPT-5 responses
|
|||
|
|
- **Supported Languages**: Chinese, English, mixed inputs/outputs
|
|||
|
|
- **Max Context Length**: Up to **32K tokens** (`max_seq_length = 32768`)
|
|||
|
|
|
|||
|
|
This model is trained on ShareGPT-Qwen3 instruction datasets and distilled toward the conversational style and quality of GPT-5. It aims to achieve high-quality, natural-sounding dialogues with low computational overhead—perfect for lightweight applications without sacrificing responsiveness.
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 2. Intended Use Cases
|
|||
|
|
|
|||
|
|
### ✅ Recommended:
|
|||
|
|
|
|||
|
|
- Casual chat in Chinese/English
|
|||
|
|
- General knowledge explanations & reasoning guidance
|
|||
|
|
- Code suggestions and simple debugging tips
|
|||
|
|
- Writing assistance: editing, summarizing, rewriting
|
|||
|
|
- Role-playing conversations (with well-designed prompts)
|
|||
|
|
|
|||
|
|
### ⚠️ Not Suitable For:
|
|||
|
|
|
|||
|
|
- High-risk decision-making:
|
|||
|
|
- Medical diagnosis, mental health support
|
|||
|
|
- Legal advice, financial investment recommendations
|
|||
|
|
- Real-time factual tasks (e.g., news, stock updates)
|
|||
|
|
- Authoritative judgment on sensitive topics
|
|||
|
|
|
|||
|
|
> **Note**: Outputs are for reference only and not intended as the sole basis for critical decisions.
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 3. Training Data & Distillation Process
|
|||
|
|
|
|||
|
|
### Key Datasets:
|
|||
|
|
|
|||
|
|
#### (1) ds1: ShareGPT-Qwen3 Instruction Dataset
|
|||
|
|
- Source: `Jackrong/ShareGPT-Qwen3-235B-A22B-Instuct-2507`
|
|||
|
|
- Purpose:
|
|||
|
|
- Provides diverse instruction-response pairs
|
|||
|
|
- Supports multi-turn dialogues and context awareness
|
|||
|
|
- Processing:
|
|||
|
|
- Cleaned for quality and relevance
|
|||
|
|
- Standardized into `instruction`, `input`, `output` format
|
|||
|
|
|
|||
|
|
#### (2) ds2: LMSYS GPT-5 Teacher Response Data
|
|||
|
|
- Source: `ytz20/LMSYS-Chat-GPT-5-Chat-Response`
|
|||
|
|
- Filtering:
|
|||
|
|
- Only kept samples with `flaw == "normal"`
|
|||
|
|
- Removed hallucinations and inconsistent responses
|
|||
|
|
- Purpose:
|
|||
|
|
- Distillation target for conversational quality
|
|||
|
|
- Enhances clarity, coherence, and fluency
|
|||
|
|
|
|||
|
|
### Training Flow:
|
|||
|
|
|
|||
|
|
1. Prepare unified Chat-formatted dataset
|
|||
|
|
2. Fine-tune base Qwen3-4B-Instruct-2507 via SFT
|
|||
|
|
3. Conduct knowledge distillation using GPT-5's normal responses as teacher outputs
|
|||
|
|
4. Balance style imitation with semantic fidelity to ensure robustness
|
|||
|
|
|
|||
|
|
> ⚖️ **Note**: This work is based on publicly available, non-sensitive datasets and uses them responsibly under fair use principles.
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 4. Key Features Summary
|
|||
|
|
|
|||
|
|
| Feature | Description |
|
|||
|
|
|--------|-------------|
|
|||
|
|
| **Lightweight** | ~4B parameter model – fast inference, low resource usage |
|
|||
|
|
| **Distillation-Style Responses** | Mimics GPT-5’s conversational fluency and helpfulness |
|
|||
|
|
| **Highly Conversational** | Excellent for chatbot-style interactions with rich dialogue flow |
|
|||
|
|
| **Multilingual Ready** | Seamless support for Chinese and English |
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 5. Acknowledgements
|
|||
|
|
|
|||
|
|
We thank:
|
|||
|
|
- LMSYS team for sharing GPT-5 response data
|
|||
|
|
- Jackrong for the ShareGPT-Qwen3 dataset
|
|||
|
|
- Qwen team for releasing `Qwen3-4B-Instruct`
|
|||
|
|
|
|||
|
|
This project is an open research effort aimed at making high-quality conversational AI accessible with smaller models.
|
|||
|
|
|
|||
|
|
---
|