Initialize project; model provided by the ModelHub XC community
Model: QuixiAI/Llama-3.2-1B Source: Original Platform
31
README.md
Normal file
@@ -0,0 +1,31 @@
---
base_model: meta-llama/Llama-3.2-1B-Instruct
language:
- en
library_name: transformers
license: llama3.2
tags:
- llama-3
- llama
- meta
- facebook
- transformers
---

Quantizing Llama-3.2-1B

Eric Hartford

I am creating several quants of Llama-3.2-1B for the purposes of testing vLLM Marlin.

- https://huggingface.co/QuixiAI/Llama-3.2-1B
- https://huggingface.co/QuixiAI/Llama-3.2-1B-FP8-Dynamic
- https://huggingface.co/QuixiAI/Llama-3.2-1B-MXFP4
- https://huggingface.co/QuixiAI/Llama-3.2-1B-NVFP4A16
- https://huggingface.co/QuixiAI/Llama-3.2-1B-W4A16-AWQ
- https://huggingface.co/QuixiAI/Llama-3.2-1B-W4A16-GPTQ
- https://huggingface.co/QuixiAI/Llama-3.2-1B-W8A16-GPTQ

The script I used to quant this:
[quant.py](quant.py)
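
A minimal sketch of how the repositories listed in this README could be enumerated and smoke-tested with vLLM. This is not taken from `quant.py`. The helper names `variant_repo_ids` and `smoke_test`, the prompt, and the token limit are hypothetical and chosen for illustration only. Whether vLLM actually selects a Marlin kernel for a given checkpoint depends on the vLLM version, the GPU, and the quantization format, so the sketch makes no claim about that.

```python
BASE_REPO = "QuixiAI/Llama-3.2-1B"

# Suffixes copied from the repository list in this README.
# The empty string stands for the unquantized checkpoint.
VARIANT_SUFFIXES = [
    "",
    "FP8-Dynamic",
    "MXFP4",
    "NVFP4A16",
    "W4A16-AWQ",
    "W4A16-GPTQ",
    "W8A16-GPTQ",
]


def variant_repo_ids(base=BASE_REPO, suffixes=VARIANT_SUFFIXES):
    """Return the Hugging Face repo id for each variant (hypothetical helper)."""
    return [f"{base}-{suffix}" if suffix else base for suffix in suffixes]


def smoke_test(repo_id, prompt="The capital of France is", max_tokens=16):
    """Load one checkpoint with vLLM and return a short greedy completion.

    Hypothetical helper. It assumes vLLM is installed and a supported GPU is
    available; the import is deferred so the rest of the file runs without vLLM.
    """
    from vllm import LLM, SamplingParams

    llm = LLM(model=repo_id)
    params = SamplingParams(max_tokens=max_tokens, temperature=0.0)
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text


if __name__ == "__main__":
    # Only list the repo ids here; call smoke_test(repo_id) on a GPU machine.
    for repo_id in variant_repo_ids():
        print(repo_id)
```

Loading each variant in a fresh process is probably the safer choice, since each `LLM` instance reserves GPU memory for its weights and KV cache.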