Model: QuixiAI/Llama-3.2-1B
| base_model | language | library_name | license | tags |
|---|---|---|---|---|
| meta-llama/Llama-3.2-1B-Instruct | | transformers | llama3.2 | |
Quantizing Llama-3.2-1B
Eric Hartford
I am creating several quants of Llama-3.2-1B to test the Marlin kernels in vLLM:
- https://huggingface.co/QuixiAI/Llama-3.2-1B
- https://huggingface.co/QuixiAI/Llama-3.2-1B-FP8-Dynamic
- https://huggingface.co/QuixiAI/Llama-3.2-1B-MXFP4
- https://huggingface.co/QuixiAI/Llama-3.2-1B-NVFP4A16
- https://huggingface.co/QuixiAI/Llama-3.2-1B-W4A16-AWQ
- https://huggingface.co/QuixiAI/Llama-3.2-1B-W4A16-GPTQ
- https://huggingface.co/QuixiAI/Llama-3.2-1B-W8A16-GPTQ
The script I used to quantize this model is quant.py.
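
As a rough illustration (this is not part of quant.py), a smoke test for the checkpoints listed above could look like the sketch below. It assumes vLLM is installed and a supported GPU is available. The helper names `quant_repo_ids` and `smoke_test`, the prompt, and the token limit are hypothetical choices made for this example. Whether vLLM actually selects a Marlin kernel depends on the GPU and the vLLM version, so check the startup log rather than assuming it.

```python
# Repo names copied from the list of quants above.
BASE_REPO = "QuixiAI/Llama-3.2-1B"
QUANT_SUFFIXES = [
    "FP8-Dynamic",
    "MXFP4",
    "NVFP4A16",
    "W4A16-AWQ",
    "W4A16-GPTQ",
    "W8A16-GPTQ",
]


def quant_repo_ids(base=BASE_REPO, suffixes=QUANT_SUFFIXES):
    """Return the unquantized repo followed by each quantized variant."""
    return [base] + [f"{base}-{suffix}" for suffix in suffixes]


def smoke_test(repo_id, prompt="The capital of France is", max_tokens=16):
    """Load one checkpoint in vLLM and return the generated continuation.

    vLLM is imported lazily so that quant_repo_ids() works without it.
    """
    from vllm import LLM, SamplingParams

    # The quantization method is read from the checkpoint's own config.
    llm = LLM(model=repo_id)
    # temperature=0.0 gives greedy decoding, which makes runs comparable.
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text
```

Calling `smoke_test("QuixiAI/Llama-3.2-1B-W4A16-GPTQ")` loads a single checkpoint. Running each checkpoint in its own process is the safer choice, because vLLM reserves most of the GPU memory by default and a second engine in the same process may not fit.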