Initialize project; model provided by the ModelHub XC community
Model: kusakana/ReasoningQAT-Qwen3-1.7B-3bit Source: Original Platform
34
README.md
Normal file
@@ -0,0 +1,34 @@
---
license: apache-2.0
base_model: Qwen/Qwen3-1.7B
tags:
- quantization
- reasoning
- qat
language:
- en
---

# ReasoningQAT-Qwen3-1.7B-3bit

This model is a **3-bit pseudo-quantized** version of [Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B), trained with Quantization-Aware Training (QAT) for reasoning tasks.

## Details

- **Base model:** Qwen3-1.7B
- **Quantization:** W3G128 (3-bit weights, group size 128)
- **Format:** Pseudo-quantized (stored in FP16; weights lie on 3-bit quantization grids)
- **Method:** ReasoningQAT — QAT combining knowledge distillation with teacher-confidence-weighted DFT loss, trained end-to-end on reasoning data
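
To make the "W3G128" and "pseudo-quantized" entries above concrete, here is a minimal NumPy sketch of group-wise 3-bit pseudo-quantization. It is an illustration under stated assumptions, not the ReasoningQAT implementation: this card does not say whether the grid is symmetric or asymmetric or how scales are chosen, so the sketch assumes an asymmetric min/max grid per group of 128 weights. The function name `pseudo_quantize` is hypothetical.

```python
import numpy as np


def pseudo_quantize(weights, bits=3, group_size=128):
    """Round weights to a per-group uniform grid and return them as FP16.

    Assumption: asymmetric min/max grid per group. The actual grid used
    by the model is not specified in this README.
    """
    w = np.asarray(weights, dtype=np.float32)
    if w.size % group_size != 0:
        raise ValueError("weight count must be a multiple of group_size")
    groups = w.reshape(-1, group_size)
    levels = 2 ** bits - 1  # 7 steps between the 8 grid points for 3 bits
    w_min = groups.min(axis=1, keepdims=True)
    w_max = groups.max(axis=1, keepdims=True)
    # Guard against constant groups, where max == min.
    scale = np.maximum(w_max - w_min, 1e-8) / levels
    # Integer codes in [0, 7]; a real 3-bit format would store these.
    codes = np.clip(np.round((groups - w_min) / scale), 0, levels)
    # Pseudo-quantization: map the codes straight back to floats.
    dequantized = codes * scale + w_min
    return dequantized.reshape(w.shape).astype(np.float16)


rng = np.random.default_rng(0)
weight = rng.normal(size=(4, 256)).astype(np.float32)  # toy weight matrix
fake_quant = pseudo_quantize(weight)

# Each group of 128 values should use at most 2**3 = 8 distinct values.
distinct_per_group = [len(np.unique(g)) for g in fake_quant.reshape(-1, 128)]
print(fake_quant.dtype, max(distinct_per_group))
```

Because the result stays in FP16, a tensor produced this way has the same shape and dtype as an unquantized one; only the set of values each group can take is restricted.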

## Citation
```bibtex
@inproceedings{okoshi2026towards,
  title={Towards Quantization-Aware Training for Ultra-Low-Bit Reasoning {LLM}s},
  author={Yasuyuki Okoshi and Hikari Otsuka and Daichi Fujiki and Masato Motomura},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=Azsd2qyK6C}
}
```