Model: QuixiAI/Llama-3.2-1B
Source: Original Platform
2026-06-02 19:33:12 +08:00

---
base_model: meta-llama/Llama-3.2-1B-Instruct
language:
- en
library_name: transformers
license: llama3.2
tags:
- llama-3
- llama
- meta
- facebook
- transformers
---
Quantizing Llama-3.2-1B
Eric Hartford
I am creating several quants of Llama-3.2-1B to test the Marlin kernels in vLLM:
- https://huggingface.co/QuixiAI/Llama-3.2-1B
- https://huggingface.co/QuixiAI/Llama-3.2-1B-FP8-Dynamic
- https://huggingface.co/QuixiAI/Llama-3.2-1B-MXFP4
- https://huggingface.co/QuixiAI/Llama-3.2-1B-NVFP4A16
- https://huggingface.co/QuixiAI/Llama-3.2-1B-W4A16-AWQ
- https://huggingface.co/QuixiAI/Llama-3.2-1B-W4A16-GPTQ
- https://huggingface.co/QuixiAI/Llama-3.2-1B-W8A16-GPTQ
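
One way to smoke-test these checkpoints is to load each in vLLM and generate a few tokens. The sketch below is not part of this repo: `smoke_test`, the prompt, and the `max_model_len` value are illustrative assumptions. vLLM picks the kernel itself from the checkpoint's quantization config and the GPU, so this does not force Marlin; check the vLLM startup log to see which kernel was selected.

```python
# Repo IDs copied from the list above.
BASE = "QuixiAI/Llama-3.2-1B"
SUFFIXES = [
    "",
    "-FP8-Dynamic",
    "-MXFP4",
    "-NVFP4A16",
    "-W4A16-AWQ",
    "-W4A16-GPTQ",
    "-W8A16-GPTQ",
]
VARIANTS = [BASE + suffix for suffix in SUFFIXES]


def smoke_test(repo_id: str, prompt: str = "The capital of France is") -> str:
    """Load one checkpoint in vLLM and return a short greedy completion.

    Hypothetical helper for illustration; requires vLLM and a supported GPU.
    """
    # Imported lazily so VARIANTS can be used without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(model=repo_id, max_model_len=2048)  # assumed context cap
    params = SamplingParams(temperature=0.0, max_tokens=16)  # greedy decoding
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text
```

Calling `smoke_test(VARIANTS[1])` would exercise the FP8-Dynamic checkpoint. Running one variant per process is probably safest, since GPU memory from a previous `LLM` instance is not always released cleanly.
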
The script I used to quantize this model:
[quant.py](quant.py)