---
license: apache-2.0
base_model: h2oai/h2o-danube-1.8b-chat
tags:
- uncensored
- abliterated
- gguf
- h2o
- conversational
pipeline_tag: text-generation
---

# h2o-danube-1.8b-uncensored
Uncensored variant of [h2oai/h2o-danube-1.8b-chat](https://huggingface.co/h2oai/h2o-danube-1.8b-chat).
## Method
1. **Abliteration** (strength=0.2) — refusal direction removed from all layers
2. **LoRA fine-tune** on `Guilherme34/uncensor` (2 epochs, r=16, alpha=32)
3. **Re-abliteration** (strength=0.35) — stronger pass to remove residual refusals
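
The two abliteration passes above both amount to projecting a "refusal direction" out of the model's weight matrices, scaled by a strength factor. A minimal sketch, assuming a precomputed unit-normalizable direction vector; the function and variable names are illustrative, not taken from the actual pipeline:

```python
import torch

def abliterate(weight: torch.Tensor, refusal_dir: torch.Tensor,
               strength: float = 0.2) -> torch.Tensor:
    """Remove a fraction `strength` of the refusal-direction component
    from a weight matrix (strength=1.0 removes it completely)."""
    v = refusal_dir / refusal_dir.norm()  # unit refusal direction
    # W <- W - strength * v v^T W  (scaled projection along v)
    return weight - strength * torch.outer(v, v) @ weight
```

At strength=0.2 only a fifth of the refusal component is removed per matrix, which is consistent with the stronger 0.35 pass being needed after fine-tuning to clean up residual refusals.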
## Eval Results

| Split | Refused |
|-------|---------|
| Harmful (64 prompts) | 1/64 |
| Harmless (64 prompts) | 0/64 |
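
Counts like these are typically produced by prompting the model and flagging responses that open with a refusal phrase. A minimal sketch of such a check; the marker list and function names are assumptions, not the actual eval harness used here:

```python
# Common refusal openers; a real harness would use a longer, tuned list.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "as an ai", "i'm sorry")

def is_refusal(response: str) -> bool:
    """Flag a response whose opening looks like a refusal."""
    head = response.strip().lower()[:80]
    return any(marker in head for marker in REFUSAL_MARKERS)

def count_refusals(responses: list[str]) -> int:
    """Number of responses flagged as refusals."""
    return sum(is_refusal(r) for r in responses)
```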
## Usage

```bash
llama-cli -m h2o_danube_1.8b_uncensored.Q4_K_M.gguf -p "Your prompt here"
```
## Training Config

| Parameter | Value |
|-----------|-------|
| Base Model | h2oai/h2o-danube-1.8b-chat |
| Fine-tune Dataset | Guilherme34/uncensor |
| Epochs | 2 |
| LoRA r | 16 |
| LoRA alpha | 32 |
| Learning Rate | 0.0002 |
| Abliteration Strength | 0.2 |
| Re-abliteration Strength | 0.35 |
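
The LoRA rows in the table map onto a low-rank weight update: the frozen base projection plus a trainable `B @ A` term scaled by `alpha / r` (here 32 / 16 = 2). A minimal PyTorch sketch under those numbers; the function and tensor names are illustrative:

```python
import torch

def lora_forward(x: torch.Tensor, W: torch.Tensor,
                 A: torch.Tensor, B: torch.Tensor,
                 r: int = 16, alpha: int = 32) -> torch.Tensor:
    """Frozen base projection x W^T plus the trainable low-rank
    update x A^T B^T, scaled by alpha / r as in standard LoRA."""
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T
```

Because `B` is zero-initialized in LoRA, the fine-tune starts exactly at the (already abliterated) base model's behavior.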
## Credits

- Abliteration technique: [andyrdt/refusal_direction](https://github.com/andyrdt/refusal_direction)
- Weight editing: [Sumandora/remove-refusals-with-transformers](https://github.com/Sumandora/remove-refusals-with-transformers)