---
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-8B/blob/main/LICENSE
pipeline_tag: text-generation
base_model:
- Qwen/Qwen3-8B
---
# Qwen3-8B-Instruct-2512-SFT
**NOTE: This model is the Instruct-aligned variant and does not generate ``<think></think>`` blocks in its output.
Additionally, you no longer need to specify `enable_thinking=False`.**
Among the models in this release, the 8B and 14B SFT and DFT variants are obtained via full-parameter fine-tuning, while the 32B models are trained with LoRA due to hardware resource constraints.<br>
The training data is a Chinese distillation dataset based on Qwen3-235B-2507, available at: [**Chinese-Qwen3-235B-2507-Distill-data-110k**](https://www.modelscope.cn/datasets/swift/Chinese-Qwen3-235B-2507-Distill-data-110k)
<br>
For details and code on model training and quantization, please see the [Training and Quantization Guide](https://www.modelscope.cn/learn/3000).
<br>
Here is the list of models released in this version:<br>
<table border="1" cellpadding="6" cellspacing="0" style="border-collapse: collapse; font-family: sans-serif;">
<thead>
<tr style="background-color: #f0f0f0;">
<th rowspan="2">Model</th>
<th colspan="2">4-bit AWQ</th>
<th rowspan="2">8-bit FP8</th>
<th colspan="2">GPTQ</th>
<th colspan="2">NVIDIA FP4</th>
<th colspan="2">Weight-Activation</th>
</tr>
<tr style="background-color: #f8f8f8;">
<th>AWQ</th>
<th>AWQ-asym</th>
<th>INT4</th>
<th>INT8</th>
<th>NVFP4</th>
<th>NVFP4-A16</th>
<th>W4A16</th>
<th>W8A8</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT">Qwen3-8B-Instruct-2512-DFT</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-AWQ">AWQ</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-awq-asym">awq-asym</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-FP8">FP8</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-GPTQ-Int4">GPTQ(int4)</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-GPTQ-Int8">GPTQ(int8)</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-NVFP4">NVFP4</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-NVFP4A16">NVFP4A16</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-W4A16">W4A16</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-W8A8">W8A8</a></td>
</tr>
<tr>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT">Qwen3-8B-Instruct-2512-SFT</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-AWQ">AWQ</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-awq-asym">awq-asym</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-FP8">FP8</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-GPTQ-Int4">GPTQ(int4)</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-GPTQ-Int8">GPTQ(int8)</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-NVFP4">NVFP4</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-NVFP4A16">NVFP4A16</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-W4A16">W4A16</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-W8A8">W8A8</a></td>
</tr>
<tr>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT">Qwen3-14B-Instruct-2512-DFT</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-AWQ">AWQ</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-awq-asym">awq-asym</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-FP8">FP8</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-GPTQ-Int4">GPTQ(int4)</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-GPTQ-Int8">GPTQ(int8)</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-NVFP4">NVFP4</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-NVFP4A16">NVFP4A16</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-W4A16">W4A16</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-W8A8">W8A8</a></td>
</tr>
<tr>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT">Qwen3-14B-Instruct-2512-SFT</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-AWQ">AWQ</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-awq-asym">awq-asym</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-FP8">FP8</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-GPTQ-Int4">GPTQ(int4)</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-GPTQ-Int8">GPTQ(int8)</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-NVFP4">NVFP4</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-NVFP4A16">NVFP4A16</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-W4A16">W4A16</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-W8A8">W8A8</a></td>
</tr>
<tr>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT">Qwen3-32B-Instruct-2512-DFT</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-AWQ">AWQ</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-awq-asym">awq-asym</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-FP8">FP8</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-GPTQ-Int4">GPTQ(int4)</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-GPTQ-Int8">GPTQ(int8)</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-NVFP4">NVFP4</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-NVFP4A16">NVFP4A16</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-W4A16">W4A16</a></td>
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-W8A8">W8A8</a></td>
</tr>
</tbody>
</table>
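
The links in the table above follow a regular naming pattern, so a repository ID can be assembled from the model size, the training method, and the quantization suffix. The sketch below is only an illustration of that pattern: the helper names `repo_id` and `modelscope_url` are hypothetical, and the suffix and row lists are copied from the link targets in the table rather than from any official API.

```python
# Quantization suffixes as they appear in the table's link targets.
# The empty string stands for the unquantized model.
QUANT_SUFFIXES = {
    "", "AWQ", "awq-asym", "FP8", "GPTQ-Int4", "GPTQ-Int8",
    "NVFP4", "NVFP4A16", "W4A16", "W8A8",
}

# (size, method) pairs that have a row in the table.
RELEASED = {
    ("8B", "DFT"), ("8B", "SFT"),
    ("14B", "DFT"), ("14B", "SFT"),
    ("32B", "DFT"),
}


def repo_id(size: str, method: str, quant: str = "") -> str:
    """Build a repository ID such as 'JunHowie/Qwen3-8B-Instruct-2512-SFT-FP8'."""
    if (size, method) not in RELEASED:
        raise ValueError(f"no table row for {size} {method}")
    if quant not in QUANT_SUFFIXES:
        raise ValueError(f"unknown quantization suffix: {quant!r}")
    name = f"JunHowie/Qwen3-{size}-Instruct-2512-{method}"
    return f"{name}-{quant}" if quant else name


def modelscope_url(size: str, method: str, quant: str = "") -> str:
    """Return the ModelScope page URL for the given variant."""
    return "https://www.modelscope.cn/models/" + repo_id(size, method, quant)
```

For example, `modelscope_url("8B", "SFT", "GPTQ-Int4")` reproduces the GPTQ(int4) link in the Qwen3-8B-Instruct-2512-SFT row, while a pair without a row in the table, such as `("32B", "SFT")`, raises `ValueError`.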
### 【Dependencies】
```
vllm>=0.10.2
transformers>=4.56.1
```
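
To confirm that an environment satisfies the minimum versions listed above, a small check along the following lines may help. It is a minimal sketch that uses only the standard library; the function names are hypothetical, and the comparison is simplified in that it looks only at the leading numeric part of a version string, so pre-release tags such as `rc1` are ignored.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions taken from the dependency list above.
MINIMUMS = {"vllm": "0.10.2", "transformers": "4.56.1"}


def release_tuple(ver: str) -> tuple:
    """Extract the leading numeric components, e.g. '4.56.1.dev0' -> (4, 56, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", ver)
    if match is None:
        raise ValueError(f"cannot parse version: {ver!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare versions numerically, padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_dependencies(minimums=MINIMUMS) -> dict:
    """Return a short status string per package."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = f"not installed (need >={minimum})"
            continue
        status = "ok" if meets_minimum(installed, minimum) else f"too old (need >={minimum})"
        report[name] = f"{installed}: {status}"
    return report
```

Calling `check_dependencies()` returns one entry per package. The comparison is numeric rather than lexical, so `0.9.9` is correctly treated as older than `0.10.2`.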