---
base_model: meta-llama/Llama-3.2-3B
language:
- en
license: llama3.2
tags:
- uncensored
- gguf
- llama-cpp
- sft
pipeline_tag: text-generation
---

# Llama-3.2-3B-Uncensored

This repository contains the raw **Safetensors** weights for an uncensored variant of Llama-3.2-3B. This model is optimized for edge deployment and fast inference while completely bypassing standard RLHF refusal mechanisms.

## Why Uncensored?
Consumer AI models are heavily filtered, which often blocks legitimate academic research, complex creative writing, and sovereign data analysis. Using orthogonalization and abliteration techniques, we erased the "refusal" vectors in this model.

We kept this model entirely open and uncensored so that researchers, legal tech developers, and sovereign AI builders have a blank-slate reasoning engine that obeys the user, not a cloud provider's safety policy.
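As a rough illustration of what orthogonalization means in this context, the sketch below projects a single direction out of a weight matrix so that the matrix's outputs have no component along it. This is a generic linear-algebra example under stated assumptions, not the procedure used to produce these weights: the matrix `W`, the direction `r`, and the helper names are all hypothetical, and how a direction is identified is not covered.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def orthogonalize(W, r):
    """Return W' = W - r (r^T W), so every output W' x is orthogonal to r.

    W is a list of rows (d_out x d_in); r is a vector of length d_out.
    """
    r = normalize(r)
    d_out, d_in = len(W), len(W[0])
    # r^T W: the component of each column of W along r.
    proj = [sum(r[i] * W[i][j] for i in range(d_out)) for j in range(d_in)]
    return [[W[i][j] - r[i] * proj[j] for j in range(d_in)] for i in range(d_out)]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


# Hypothetical toy values, for illustration only.
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
r = [1.0, 1.0, 0.0]
W_ablated = orthogonalize(W, r)

# Any output of the modified matrix should have (numerically) zero component along r.
y = matvec(W_ablated, [0.5, -1.5])
component_along_r = sum(a * b for a, b in zip(normalize(r), y))
print(abs(component_along_r) < 1e-9)
```

Rows of `W` where `r` is zero are left unchanged, which is why the projection only affects the subspace spanned by the chosen direction.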
## Format Note: Safetensors vs GGUF
This specific repository hosts the multi-part `.safetensors` files (as seen in the Files tab).

* If you are looking for the **Ollama-ready GGUF version**, please navigate to the `prawinin/Llama-3.2-3B-Uncensored-Q8_0-GGUF` repository.
* The weights in *this* repository are meant for developers building custom pipelines or doing further fine-tuning.
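For custom pipelines, it can help to confirm that every part of a multi-part checkpoint is present before loading. The sketch below assumes the `model-XXXXX-of-YYYYY.safetensors` naming that Transformers uses for sharded checkpoints; the helper `missing_shards` and the file names in the example are hypothetical and are not read from this repository.

```python
import re

# Assumed shard naming convention: model-00001-of-00002.safetensors
SHARD_PATTERN = re.compile(r"^model-(\d{5})-of-(\d{5})\.safetensors$")


def missing_shards(filenames):
    """Return the sorted shard indices that are absent from `filenames`."""
    found = set()
    total = None
    for name in filenames:
        match = SHARD_PATTERN.match(name)
        if match is None:
            continue  # skip config, tokenizer, and index files
        index, count = int(match.group(1)), int(match.group(2))
        if total is not None and count != total:
            raise ValueError(f"inconsistent shard count in {name!r}")
        total = count
        found.add(index)
    if total is None:
        raise ValueError("no sharded .safetensors files found")
    return sorted(set(range(1, total + 1)) - found)


# Hypothetical listing, for illustration only.
example_files = [
    "config.json",
    "model-00001-of-00002.safetensors",
    "model-00002-of-00002.safetensors",
    "model.safetensors.index.json",
    "tokenizer.json",
]
print(missing_shards(example_files))  # prints []
```

An empty list means all parts are accounted for; any returned index points at a shard that still needs to be downloaded.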
## How to Use (Python / Transformers)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "prawinin/Llama-3.2-3B-Uncensored"

# Load the tokenizer and the bfloat16 weights.
# device_map="auto" lets Accelerate place the model on the available device(s).
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

prompt = "Explain the physiological effects of severe sleep deprivation on the human brain."
messages = [
    {"role": "user", "content": prompt}
]

# Render the conversation with the tokenizer's chat template and open the assistant turn.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Move the inputs to the model's device rather than hard-coding "cuda",
# so the snippet also runs when device_map="auto" selects a different device.
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate up to 512 new tokens and print the decoded sequence (prompt included).
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```