119 lines
7.4 KiB
Markdown
119 lines
7.4 KiB
Markdown
---
|
||
library_name: transformers
|
||
license: apache-2.0
|
||
license_link: https://huggingface.co/Qwen/Qwen3-8B/blob/main/LICENSE
|
||
pipeline_tag: text-generation
|
||
base_model:
|
||
- Qwen/Qwen3-8B
|
||
---
|
||
# Qwen3-8B-Instruct-2512-SFT
|
||
|
||
|
||
**NOTE:This model is the Instruct-aligned variant, and it will not generate ``<think></think>`` blocks in its outputs.
|
||
Additionally, there is no need to specify enable_thinking=False anymore.**
|
||
|
||
|
||
|
||
Among them, the 8B and 14B SFT and DFT variants are obtained via full-parameter fine-tuning, while the 32B models are trained using LoRA due to hardware resource constraints.<br>
|
||
The dataset used is the Chinese Distillation Dataset based on Qwen3-235B-2507.available at:[**Chinese-Qwen3-235B-2507-Distill-data-110k**](https://www.modelscope.cn/datasets/swift/Chinese-Qwen3-235B-2507-Distill-data-110k)
|
||
<br>
|
||
For details and code regarding model training and quantization, please see[Training and Quantization Guide](https://www.modelscope.cn/learn/3000)
|
||
<br>
|
||
Here is the list of models released in this version:<br>
|
||
|
||
<table border="1" cellpadding="6" cellspacing="0" style="border-collapse: collapse; font-family: sans-serif;">
|
||
<thead>
|
||
<tr style="background-color: #f0f0f0;">
|
||
<th rowspan="2">Model</th>
|
||
<th colspan="2">4-bit AWQ</th>
|
||
<th rowspan="2">8-bit FP8</th>
|
||
<th colspan="2">GPTQ</th>
|
||
<th colspan="2">NVIDIA FP4</th>
|
||
<th colspan="2">Weight-Activation</th>
|
||
</tr>
|
||
<tr style="background-color: #f8f8f8;">
|
||
<th>AWQ</th>
|
||
<th>AWQ-asym</th>
|
||
<th>INT4</th>
|
||
<th>INT8</th>
|
||
<th>NVFP4</th>
|
||
<th>NVFP4-A16</th>
|
||
<th>W4A16</th>
|
||
<th>W8A8</th>
|
||
</tr>
|
||
</thead>
|
||
<tbody>
|
||
<tr>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT">Qwen3-8B-Instruct-2512-DFT</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-AWQ">AWQ</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-awq-asym">awq-asym</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-FP8">FP8</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-GPTQ-Int4">GPTQ(int4)</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-GPTQ-Int8">GPTQ(int8)</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-NVFP4">NVFP4</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-NVFP4A16">NVFP4A16</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-W4A16">W4A16</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-DFT-W8A8">W8A8</a></td>
|
||
</tr>
|
||
<tr>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT">Qwen3-8B-Instruct-2512-SFT</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-AWQ">AWQ</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-awq-asym">awq-asym</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-FP8">FP8</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-GPTQ-Int4">GPTQ(int4)</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-GPTQ-Int8">GPTQ(int8)</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-NVFP4">NVFP4</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-NVFP4A16">NVFP4A16</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-W4A16">W4A16</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-8B-Instruct-2512-SFT-W8A8">W8A8</a></td>
|
||
</tr>
|
||
<tr>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT">Qwen3-14B-Instruct-2512-DFT</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-AWQ">AWQ</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-awq-asym">awq-asym</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-FP8">FP8</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-GPTQ-Int4">GPTQ(int4)</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-GPTQ-Int8">GPTQ(int8)</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-NVFP4">NVFP4</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-NVFP4A16">NVFP4A16</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-W4A16">W4A16</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-DFT-W8A8">W8A8</a></td>
|
||
</tr>
|
||
<tr>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT">Qwen3-14B-Instruct-2512-SFT</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-AWQ">AWQ</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-awq-asym">awq-asym</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-FP8">FP8</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-GPTQ-Int4">GPTQ(int4)</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-GPTQ-Int8">GPTQ(int8)</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-NVFP4">NVFP4</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-NVFP4A16">NVFP4A16</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-W4A16">W4A16</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-14B-Instruct-2512-SFT-W8A8">W8A8</a></td>
|
||
</tr>
|
||
<tr>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT">Qwen3-32B-Instruct-2512-DFT</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-AWQ">AWQ</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-awq-asym">awq-asym</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-FP8">FP8</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-GPTQ-Int4">GPTQ(int4)</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-GPTQ-Int8">GPTQ(int8)</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-NVFP4">NVFP4</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-NVFP4A16">NVFP4A16</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-W4A16">W4A16</a></td>
|
||
<td><a href="https://www.modelscope.cn/models/JunHowie/Qwen3-32B-Instruct-2512-DFT-W8A8">W8A8</a></td>
|
||
</tr>
|
||
</tbody>
|
||
</table>
|
||
|
||
|
||
|
||
### 【Dependencies】
|
||
```
|
||
vllm>=0.10.2
|
||
transformers>=4.56.1
|
||
```
|
||
|
||
|
||
|