Model: kusakana/ReasoningQAT-Qwen3-1.7B-2bit Source: Original Platform
| license | base_model | tags | language |
|---|---|---|---|
| apache-2.0 | Qwen/Qwen3-1.7B | | |
ReasoningQAT-Qwen3-1.7B-2bit
This model is a 2-bit pseudo-quantized version of Qwen3-1.7B, trained with Quantization-Aware Training (QAT) for reasoning tasks.
Details
- Base model: Qwen3-1.7B
- Quantization: W2G128 (2-bit weights, group size 128)
- Format: Pseudo-quantized (stored in FP16; weights lie on 2-bit quantization grids)
- Method: ReasoningQAT, a QAT approach that combines knowledge distillation with a teacher-confidence-weighted DFT loss and is trained end-to-end on reasoning data
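
To make the W2G128 pseudo-quantized format concrete, here is a minimal NumPy sketch of group-wise 2-bit rounding. It assumes an asymmetric min/max uniform quantizer per group of 128 weights; the card does not say which quantizer the authors use, and the function name `pseudo_quantize_w2g128` is hypothetical. The point is only to show what "stored in FP16; weights lie on 2-bit quantization grids" means: each group holds at most four distinct values.

```python
import numpy as np

def pseudo_quantize_w2g128(w, bits=2, group_size=128):
    """Round each group of `group_size` weights onto a uniform 2**bits-level grid.

    Assumption: asymmetric min/max scaling per group. The model card does not
    state the actual quantizer, so treat this as an illustration only.
    """
    levels = 2 ** bits - 1                          # 3 steps, i.e. 4 grid points for 2-bit
    flat = w.astype(np.float32).reshape(-1, group_size)
    w_min = flat.min(axis=1, keepdims=True)
    w_max = flat.max(axis=1, keepdims=True)
    # Guard against constant groups, where max == min.
    scale = np.maximum(w_max - w_min, 1e-8) / levels
    # Integer codes in {0, 1, 2, 3}.
    codes = np.clip(np.round((flat - w_min) / scale), 0, levels)
    # Map the codes back to real values and store them in FP16 (pseudo-quantized).
    dequant = codes * scale + w_min
    return dequant.reshape(w.shape).astype(np.float16)

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 256)).astype(np.float16)   # toy weight matrix
w_q = pseudo_quantize_w2g128(w)

# Every group of 128 consecutive weights takes at most 4 distinct values.
max_unique = max(len(np.unique(g)) for g in w_q.reshape(-1, 128))
print(w_q.dtype, max_unique)
```

The same per-group uniqueness check could in principle be run on a weight matrix from the released checkpoint to inspect its grid structure, assuming the groups run along the input dimension as in this sketch.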
Citation
@inproceedings{
okoshi2026towards,
title={Towards Quantization-Aware Training for Ultra-Low-Bit Reasoning {LLM}s},
author={Yasuyuki Okoshi and Hikari Otsuka and Daichi Fujiki and Masato Motomura},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=Azsd2qyK6C}
}