---
base_model: Qwen/Qwen3-1.7B
pipeline_tag: text-generation
tags:
- qwen3
- edgerazor
- quantization
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-1.7B/blob/main/LICENSE
---

EdgeRazor for Lightweight LLMs

arXiv: EdgeRazor · GitHub: EdgeRazor · PyPI: EdgeRazor

Model Overview

Model Bit-Widths

| Mixed-Precision Recipe | Bit-Width | This Repo | GGUF Type |
|---|---|---|---|
| 100% 4-bit + 0% 1.58-bit | 4 | ✔️ | Q4_0 |
| 50% 4-bit + 50% 1.58-bit | 2.79 | ✖️ | Not supported |
| 12.5% 4-bit + 87.5% 1.58-bit | 1.88 | ✖️ | Not supported |
| 0% 4-bit + 100% 1.58-bit | 1.58 | ✔️ | TQ1_0, TQ2_0 |
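
The fractional bit-widths in the table are consistent with a simple weighted average of the two precisions. Assuming that is how they are computed (the README does not state this explicitly), with $p$ denoting the fraction of weights kept at 4-bit, the arithmetic works out as:

$$
\begin{aligned}
b(p) &= 4p + 1.58\,(1 - p) \\
b(0.5) &= 4(0.5) + 1.58(0.5) = 2.00 + 0.79 = 2.79 \\
b(0.125) &= 4(0.125) + 1.58(0.875) = 0.50 + 1.3825 \approx 1.88
\end{aligned}
$$

The two endpoints, $b(1) = 4$ and $b(0) = 1.58$, are the recipes this repo ships as GGUF files.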

Get Started

Use llama.cpp to run efficient inference on edge devices.

See the cli.sh script in this repository for basic usage.
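
As a rough illustration of what a basic llama.cpp invocation can look like, here is a minimal sketch. It is not the content of cli.sh. The model file name is a hypothetical placeholder (substitute an actual GGUF file from this repo), the prompt and token count are arbitrary, and the sketch assumes a `llama-cli` binary from a llama.cpp build is on your `PATH`. Only the long-standing `-m`, `-p`, and `-n` options are used; other flags vary between llama.cpp versions.

```shell
#!/bin/sh
# Hypothetical file name: replace with a GGUF file actually present in this repo.
MODEL_PATH="${MODEL_PATH:-./Qwen3-1.7B-EdgeRazor-Q4_0.gguf}"
# Arbitrary illustrative prompt and generation length.
PROMPT="${PROMPT:-Explain quantization in one sentence.}"
N_PREDICT="${N_PREDICT:-128}"

# Print the llama-cli invocation so it can be inspected before running.
llama_cmd() {
  printf "llama-cli -m '%s' -p '%s' -n %s\n" "$MODEL_PATH" "$PROMPT" "$N_PREDICT"
}

llama_cmd

# Run inference only when both the binary and the model file are present.
if command -v llama-cli >/dev/null 2>&1 && [ -f "$MODEL_PATH" ]; then
  llama-cli -m "$MODEL_PATH" -p "$PROMPT" -n "$N_PREDICT"
else
  echo "llama-cli or model file not found; skipping inference." >&2
fi
```

Setting `MODEL_PATH`, `PROMPT`, or `N_PREDICT` in the environment overrides the defaults, which makes it easy to switch between the Q4_0 and ternary (TQ1_0, TQ2_0) files without editing the script.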

Model list:

Citation

If you find our project useful in your research, please consider citing our paper ✏️:

@article{zhangsh-edgerazor,
  title={{EdgeRazor}: A Lightweight Framework for Large Language Models via Mixed-Precision Quantization-Aware Distillation},
  author={Shu-Hao Zhang and Le-Tong Huang and Xiang-Sheng Deng and Xin-Yi Zou and Chen Wu and Nan Li and Shao-Qun Zhang},
  year={2026},
  journal={arXiv preprint arXiv:2605.04062}
}

Model synced from source: zhangsq-nju/Qwen3-1.7B-EdgeRazor-GGUF