---
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-8B/blob/main/LICENSE
pipeline_tag: text-generation
base_model:
- Qwen/Qwen3-8B
---
# Qwen3-8B-Instruct-2512-SFT
**NOTE:This model is the Instruct-aligned variant, and it will not generate ``
The dataset used is the Chinese Distillation Dataset based on Qwen3-235B-2507.available at:[**Chinese-Qwen3-235B-2507-Distill-data-110k**](https://www.modelscope.cn/datasets/swift/Chinese-Qwen3-235B-2507-Distill-data-110k)
For details and code regarding model training and quantization, please see[Training and Quantization Guide](https://www.modelscope.cn/learn/3000)
Here is the list of models released in this version:
| Model | 4-bit AWQ | 8-bit FP8 | GPTQ | NVIDIA FP4 | Weight-Activation | ||||
|---|---|---|---|---|---|---|---|---|---|
| AWQ | AWQ-asym | INT4 | INT8 | NVFP4 | NVFP4-A16 | W4A16 | W8A8 | ||
| Qwen3-8B-Instruct-2512-DFT | AWQ | awq-asym | FP8 | GPTQ(int4) | GPTQ(int8) | NVFP4 | NVFP4A16 | W4A16 | W8A8 |
| Qwen3-8B-Instruct-2512-SFT | AWQ | awq-asym | FP8 | GPTQ(int4) | GPTQ(int8) | NVFP4 | NVFP4A16 | W4A16 | W8A8 |
| Qwen3-14B-Instruct-2512-DFT | AWQ | awq-asym | FP8 | GPTQ(int4) | GPTQ(int8) | NVFP4 | NVFP4A16 | W4A16 | W8A8 |
| Qwen3-14B-Instruct-2512-SFT | AWQ | awq-asym | FP8 | GPTQ(int4) | GPTQ(int8) | NVFP4 | NVFP4A16 | W4A16 | W8A8 |
| Qwen3-32B-Instruct-2512-DFT | AWQ | awq-asym | FP8 | GPTQ(int4) | GPTQ(int8) | NVFP4 | NVFP4A16 | W4A16 | W8A8 |