Files
ModelHub XC c5070a8da9 初始化项目,由ModelHub XC社区提供模型
Model: Jackrong/GPT-5-Distill-llama3.1-8B-Instruct
Source: Original Platform
2026-06-10 12:18:12 +08:00

75 lines
4.1 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
base_model: meta-llama/Llama-3.1-8B-Instruct
library_name: transformers
model_name: GPT-5-Distill-llama3.1-8B-Instruct
tags:
- unsloth
- llama-3
- llama
- text-generation
- distillation
- gpt-5
license: llama3.1
language:
- en
- zh
---
# GPT-5-Distill-llama3.1-8B-Instruct
![Unsloth](https://img.shields.io/badge/Unsloth-Fine--Tuning-blue?style=flat&logo=unsloth)
![Llama-3](https://img.shields.io/badge/Model-Llama--3.1-green?style=flat)
![Distillation](https://img.shields.io/badge/Technique-Knowledge%20Distillation-orange?style=flat)
## Model Summary
<img src="https://cdn-uploads.huggingface.co/production/uploads/66309bd090589b7c65950665/PNNVeEd1bKdL3F7oXCj5M.png" width="800" />
**GPT-5-Distill-llama3.1-8B-Instruct** is a fine-tuned version of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct), designed to distill the capabilities of high-performance models (labeled as GPT-5 in source datasets) into a more efficient 8B parameter footprint.
This model was trained using **Unsloth** on a curated mix of approximately **164,000 high-quality instruction-response pairs**, focusing on complex reasoning and "normal" flaw-level responses.
* **Base Model:** `meta-llama/Llama-3.1-8B-Instruct`
* **Architecture:** Llama 3.1 (8B parameters)
* **Language:** English (Primary)
* **Context Window:** 32,768 tokens
* **Fine-tuning Framework:** [Unsloth](https://github.com/unslothai/unsloth) (QLoRA)
## ✨ Key Advantages of GPT-5 Distillation
This model represents a shift towards **"Super-Knowledge Distillation"**, where a smaller, efficient student model learns from a significantly more capable teacher.
* **🚀 Frontier-Level Reasoning**: By training on dataset samples attributed to GPT-5, the model acquires complex reasoning patterns, nuance, and problem-solving strategies that are typically absent in standard datasets or smaller models.
* **⚡ Efficient Intelligence**: Users can experience high-fidelity, coherent, and detailed responses on consumer hardware (e.g., single GPUs) without the latency, privacy concerns, or cost of querying giant proprietary APIs.
* **💎 High-Purity Signal**: The strict filtering for `flaw == "normal"` ensures the model is fine-tuned only on the highest confidence, error-free responses. This minimizes "hallucination inheritance" and aligns the model with safe, helpful behaviors.
* **🎯 Enhanced Nuance & Tone**: Unlike standard finetunes that often sound robotic, this model mimics the more natural, conversational, and adaptive tone found in next-generation frontier models.
## 📚 Training Data
The model was trained on a high-quality blend of two datasets, totaling **163,896 samples**:
1. **Chat-GPT-5-Chat-Response (160k samples)**
* Filtered specifically for normal entries to ensure high-quality, safe, and coherent responses.
* This dataset serves as the primary distillation source, aiming to mimic the response patterns of advanced large language models.
2. **ShareGPT-Qwen3-235B-A22B-Instuct-2507 (3.9k samples)**
* "This dataset consists of approximately **3.9k examples**, with an average of about **5 rounds of dialogue** per scenario, designed to enhance the models instruction-following ability and task-completion efficiency.
All data was formatted using the standard **Llama-3 Chat Template**.
## ⚙️ Training Details
* **Hardware:** NVIDIA H100
* **Sequence Length:** 32,768 tokens (Long Context Support)
* **Batch Size:** 4 per device (Effective Batch Size: 32 via Gradient Accumulation)
* **Learning Rate:** 2e-5
* **Scheduler:** Linear
* **Optimizer:** AdamW 8-bit
* **LoRA Rank (r):** 32
* **LoRA Alpha:** 32
* **Target Modules:** `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
## 🛡️ License & Limitations
* **License:** This model is subject to the **Llama 3.1 Community License**.
* **Limitations:** While this model is distilled from high-capability sources, it is still an 8B parameter model. It may hallucinate facts or struggle with extremely complex reasoning tasks compared to the original teacher models. The "GPT-5" naming refers to the source dataset labels and does not imply access to unreleased OpenAI weights.