Files
ModelHub XC 615dd026f0 初始化项目,由ModelHub XC社区提供模型
Model: kusakana/ReasoningQAT-Qwen3-1.7B-2bit
Source: Original Platform
2026-05-10 09:25:37 +08:00

35 lines
1.0 KiB
Markdown

---
license: apache-2.0
base_model: Qwen/Qwen3-1.7B
tags:
- quantization
- reasoning
- qat
language:
- en
---
# ReasoningQAT-Qwen3-1.7B-2bit
This model is a **2-bit pseudo-quantized** version of [Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B), trained with Quantization-Aware Training (QAT) for reasoning tasks.
## Details
- **Base model:** Qwen3-1.7B
- **Quantization:** W2G128 (2-bit weights, group size 128)
- **Format:** Pseudo-quantized (stored in FP16; weights lie on 2-bit quantization grids)
- **Method:** ReasoningQAT — QAT combining knowledge distillation with teacher-confidence-weighted DFT loss, trained end-to-end on reasoning data
## Citation
```bibtex
@inproceedings{
okoshi2026towards,
title={Towards Quantization-Aware Training for Ultra-Low-Bit Reasoning {LLM}s},
author={Yasuyuki Okoshi and Hikari Otsuka and Daichi Fujiki and Masato Motomura},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=Azsd2qyK6C}
}
```