Files
ReasoningQAT-Qwen3-1.7B-3bit/README.md
ModelHub XC c01c94eefe 初始化项目,由ModelHub XC社区提供模型
Model: kusakana/ReasoningQAT-Qwen3-1.7B-3bit
Source: Original Platform
2026-05-10 08:49:08 +08:00

35 lines
1.0 KiB
Markdown

---
license: apache-2.0
base_model: Qwen/Qwen3-1.7B
tags:
- quantization
- reasoning
- qat
language:
- en
---
# ReasoningQAT-Qwen3-1.7B-3bit
This model is a **3-bit pseudo-quantized** version of [Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B), trained with Quantization-Aware Training (QAT) for reasoning tasks.
## Details
- **Base model:** Qwen3-1.7B
- **Quantization:** W3G128 (3-bit weights, group size 128)
- **Format:** Pseudo-quantized (stored in FP16; weights lie on 3-bit quantization grids)
- **Method:** ReasoningQAT — QAT combining knowledge distillation with teacher-confidence-weighted DFT loss, trained end-to-end on reasoning data
## Citation
```bibtex
@inproceedings{
okoshi2026towards,
title={Towards Quantization-Aware Training for Ultra-Low-Bit Reasoning {LLM}s},
author={Yasuyuki Okoshi and Hikari Otsuka and Daichi Fujiki and Masato Motomura},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=Azsd2qyK6C}
}
```