---
license: apache-2.0
base_model: Qwen/Qwen3-1.7B
tags:
- quantization
- reasoning
- qat
language:
- en
---

# ReasoningQAT-Qwen3-1.7B-3bit

This model is a **3-bit pseudo-quantized** version of [Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B), trained with Quantization-Aware Training (QAT) for reasoning tasks.

## Details

- **Base model:** Qwen3-1.7B
- **Quantization:** W3G128 (3-bit weights, group size 128)
- **Format:** Pseudo-quantized (stored in FP16; weights lie on 3-bit quantization grids)
- **Method:** ReasoningQAT: QAT combining knowledge distillation with teacher-confidence-weighted DFT loss, trained end-to-end on reasoning data

## Citation

```bibtex
@inproceedings{okoshi2026towards,
  title={Towards Quantization-Aware Training for Ultra-Low-Bit Reasoning {LLM}s},
  author={Yasuyuki Okoshi and Hikari Otsuka and Daichi Fujiki and Masato Motomura},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=Azsd2qyK6C}
}
```
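## Example: W3G128 pseudo-quantization

To make the "W3G128" and "pseudo-quantized" entries under Details concrete, here is a minimal sketch of group-wise fake quantization in plain Python. It assumes an asymmetric min/max uniform quantizer, which this card does not confirm; the quantizer actually used by ReasoningQAT may differ. The function names `fake_quantize_group` and `fake_quantize` are hypothetical and are not part of any released code.

```python
def fake_quantize_group(group, bits=3):
    """Round one group of weights onto a uniform grid with 2**bits levels.

    Assumption: asymmetric min/max quantization. The model card does not
    state which quantizer ReasoningQAT uses.
    """
    levels = 2 ** bits - 1              # 7 steps, i.e. 8 representable values
    lo, hi = min(group), max(group)
    if hi == lo:                        # constant group: already on a grid
        return list(group)
    scale = (hi - lo) / levels          # spacing between adjacent grid points
    # Quantize to an integer index, then map straight back to a float.
    return [lo + round((w - lo) / scale) * scale for w in group]


def fake_quantize(weights, bits=3, group_size=128):
    """Fake-quantize consecutive groups of `group_size` weights independently."""
    out = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        out.extend(fake_quantize_group(group, bits))
    return out


if __name__ == "__main__":
    import random

    random.seed(0)
    row = [random.gauss(0.0, 0.02) for _ in range(256)]   # two groups of 128
    quantized = fake_quantize(row)
    for g in range(2):
        distinct = set(quantized[g * 128:(g + 1) * 128])
        print(f"group {g}: {len(distinct)} distinct values (at most 8)")
```

Each group of 128 weights gets its own grid, so every value in the output is still an ordinary float but takes one of at most 8 values within its group. This matches the card's description of FP16 storage with weights lying on 3-bit grids; it also means the checkpoint itself is not bit-packed.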