Initialize project; model provided by the ModelHub XC community

Model: kusakana/ReasoningQAT-Qwen3-1.7B-2bit
Source: Original Platform
ModelHub XC
2026-05-10 09:25:37 +08:00
commit 615dd026f0
12 changed files with 151859 additions and 0 deletions

README.md Normal file

@@ -0,0 +1,34 @@
---
license: apache-2.0
base_model: Qwen/Qwen3-1.7B
tags:
- quantization
- reasoning
- qat
language:
- en
---
# ReasoningQAT-Qwen3-1.7B-2bit
This model is a **2-bit pseudo-quantized** version of [Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B), trained with Quantization-Aware Training (QAT) for reasoning tasks.
## Details
- **Base model:** Qwen3-1.7B
- **Quantization:** W2G128 (2-bit weights, group size 128)
- **Format:** Pseudo-quantized (stored in FP16; weights lie on 2-bit quantization grids)
- **Method:** ReasoningQAT — QAT combining knowledge distillation with teacher-confidence-weighted DFT loss, trained end-to-end on reasoning data
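To make "pseudo-quantized" and "W2G128" concrete, here is a minimal NumPy sketch of group-wise 2-bit quantize-dequantize. It is an illustration under stated assumptions, not the ReasoningQAT implementation: the asymmetric min/max grid, the helper name `fake_quantize_groupwise`, and the random test weights are all hypothetical choices, since this card does not specify the quantizer. The point it shows is that each group of 128 weights ends up with at most 4 distinct values while the tensor is still stored as FP16.

```python
import numpy as np

def fake_quantize_groupwise(w, bits=2, group_size=128):
    """Quantize-dequantize weights group by group (hypothetical helper).

    Assumes an asymmetric min/max grid per group; the actual grid used by
    the model is not described in this card.
    """
    levels = 2 ** bits - 1                      # 3 steps -> 4 grid points for 2-bit
    orig_shape = w.shape
    # Each row of `groups` holds one group of `group_size` consecutive weights.
    groups = w.astype(np.float32).reshape(-1, group_size)
    w_min = groups.min(axis=1, keepdims=True)
    w_max = groups.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / levels
    scale = np.where(scale == 0, 1.0, scale)    # guard against constant groups
    # Integer codes in {0, 1, 2, 3}: what a packed 2-bit format would store.
    codes = np.clip(np.round((groups - w_min) / scale), 0, levels)
    # Map codes back to real values: the pseudo-quantized weights.
    dequant = codes * scale + w_min
    return (dequant.reshape(orig_shape).astype(np.float16),
            codes.reshape(orig_shape).astype(np.uint8))

# Random stand-in weights; the element count must be divisible by group_size.
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 256)).astype(np.float32)
fake_q, codes = fake_quantize_groupwise(weights)

# Every 128-weight group takes at most 4 distinct FP16 values.
max_unique = max(len(np.unique(g)) for g in fake_q.reshape(-1, 128))
print(fake_q.dtype, max_unique)
```

Because the stored tensors are ordinary FP16 values, a checkpoint in this format takes the same space as an unquantized FP16 model; the memory savings of 2-bit weights only materialize once the values are packed into integer codes plus per-group scale and offset.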
## Citation
```bibtex
@inproceedings{okoshi2026towards,
title={Towards Quantization-Aware Training for Ultra-Low-Bit Reasoning {LLM}s},
author={Yasuyuki Okoshi and Hikari Otsuka and Daichi Fujiki and Masato Motomura},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=Azsd2qyK6C}
}
```