---
base_model: meta-llama/Llama-3.2-1B-Instruct
language:
- en
library_name: transformers
license: llama3.2
tags:
- llama-3
- llama
- meta
- facebook
- transformers
---

Quantizing Llama-3.2-1B

Eric Hartford

I am creating several quants of Llama-3.2-1B to test vLLM's Marlin kernels.

- https://huggingface.co/QuixiAI/Llama-3.2-1B
- https://huggingface.co/QuixiAI/Llama-3.2-1B-FP8-Dynamic
- https://huggingface.co/QuixiAI/Llama-3.2-1B-MXFP4
- https://huggingface.co/QuixiAI/Llama-3.2-1B-NVFP4A16
- https://huggingface.co/QuixiAI/Llama-3.2-1B-W4A16-AWQ
- https://huggingface.co/QuixiAI/Llama-3.2-1B-W4A16-GPTQ
- https://huggingface.co/QuixiAI/Llama-3.2-1B-W8A16-GPTQ
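
As a rough illustration of how the checkpoints above might be exercised, here is a minimal sketch. It is not `quant.py`, and it is not necessarily how these models were tested. The helper names (`scheme_of`, `weight_activation_bits`, `smoke_test`) are hypothetical. It assumes the common `W<bits>A<bits>` naming convention (weight bits, then activation bits), and that vLLM is installed on a machine with a supported GPU. Which kernel vLLM actually picks (Marlin or otherwise) depends on the vLLM version and the hardware, so check the vLLM startup log rather than assuming.

```python
import re

# Repository IDs copied from the list above.
REPOS = [
    "QuixiAI/Llama-3.2-1B",
    "QuixiAI/Llama-3.2-1B-FP8-Dynamic",
    "QuixiAI/Llama-3.2-1B-MXFP4",
    "QuixiAI/Llama-3.2-1B-NVFP4A16",
    "QuixiAI/Llama-3.2-1B-W4A16-AWQ",
    "QuixiAI/Llama-3.2-1B-W4A16-GPTQ",
    "QuixiAI/Llama-3.2-1B-W8A16-GPTQ",
]

BASE_NAME = "Llama-3.2-1B"


def scheme_of(repo_id: str) -> str:
    """Return the suffix after the base model name, or 'unquantized' if there is none."""
    name = repo_id.split("/", 1)[1]
    suffix = name[len(BASE_NAME):].lstrip("-")
    return suffix or "unquantized"


def weight_activation_bits(scheme: str):
    """Parse names of the form 'W<bits>A<bits>...'; return None for other schemes."""
    match = re.match(r"W(\d+)A(\d+)", scheme)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def smoke_test(repo_id: str, prompt: str = "The capital of France is") -> str:
    """Generate a few tokens with vLLM (assumes vLLM and a supported GPU are available)."""
    # Imported lazily so the helpers above work without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(model=repo_id)
    params = SamplingParams(max_tokens=16, temperature=0.0)
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text


if __name__ == "__main__":
    for repo in REPOS:
        scheme = scheme_of(repo)
        print(repo, scheme, weight_activation_bits(scheme))
```

To try one checkpoint end to end, call `smoke_test("QuixiAI/Llama-3.2-1B-W4A16-GPTQ")` in a fresh process, since each `LLM` instance holds GPU memory for its lifetime.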

The script I used to quant this:
[quant.py](quant.py)