---
license: llama2
language:
- en
library_name: transformers
tags:
- facebook
- meta
- pytorch
- llama
- llama-2
- 4bit
- gptq
base_model: meta-llama/LlamaGuard-7b
inference: false
---

# Quantized version of meta-llama/LlamaGuard-7b

## Model Description

The model [meta-llama/LlamaGuard-7b](https://huggingface.co/meta-llama/LlamaGuard-7b) was quantized to 4-bit with group_size 128 and act-order enabled, using the AutoGPTQ integration in transformers (https://huggingface.co/blog/gptq-integration).
## Evaluation

To evaluate the quantized model and compare it with the full-precision model, I performed binary classification on the "toxicity" label of the ~5k-sample test set of lmsys/toxic-chat.

📊 Full Precision Model:

Average Precision Score: 0.3625

📊 4-bit Quantized Model:

Average Precision Score: 0.3450
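The metric above can be reproduced with scikit-learn's `average_precision_score`. This is a toy sketch with made-up labels and scores standing in for the toxic-chat test set; `y_true` would be the binary "toxicity" labels and `y_score` the model's predicted probability of the toxic class:

```python
import numpy as np
from sklearn.metrics import average_precision_score

# Toy stand-in for the toxic-chat evaluation data (not the real labels/scores).
y_true = np.array([0, 0, 1, 1, 0, 1])       # binary "toxicity" labels
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9])  # predicted toxic probability

ap = average_precision_score(y_true, y_score)
print(f"Average Precision Score: {ap:.4f}")
```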