---
license: apache-2.0
base_model: microsoft/Phi-4-mini-instruct
tags:
- uncensored
- abliterated
- gguf
- phi
- conversational
pipeline_tag: text-generation
---
# phi-4-mini-uncensored
An uncensored variant of [microsoft/Phi-4-mini-instruct](https://huggingface.co/microsoft/Phi-4-mini-instruct), produced by abliteration and LoRA fine-tuning and distributed as a GGUF file.
## Method
1. **Abliteration** (strength=0.2): the refusal direction is removed from all layers.
2. **LoRA fine-tune** on `Guilherme34/uncensor` (2 epochs, r=16, alpha=32).
3. **Re-abliteration** (strength=0.35): a stronger pass to remove residual refusals.
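
The abliteration steps above can be illustrated with a toy projection. This is a minimal sketch, not the code used to build this model: it assumes that `strength` scales how much of a weight matrix's output component along a unit-length refusal direction is subtracted, i.e. `W' = W - strength * r r^T W`. The function name `ablate` and the tiny example matrix are hypothetical.

```python
import math


def ablate(weight, direction, strength):
    """Return a copy of `weight` with `strength` times its component
    along `direction` removed from the output space.

    weight:    list of rows, shape (d_out, d_in)
    direction: vector of length d_out (normalized internally)
    strength:  0.0 leaves the matrix unchanged, 1.0 removes the
               component entirely (assumed meaning of "strength")
    """
    norm = math.sqrt(sum(x * x for x in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    r = [x / norm for x in direction]

    d_out, d_in = len(weight), len(weight[0])
    # r^T W: how strongly each input column writes along the direction.
    proj = [sum(r[i] * weight[i][j] for i in range(d_out)) for j in range(d_in)]
    # W - strength * r (r^T W)
    return [
        [weight[i][j] - strength * r[i] * proj[j] for j in range(d_in)]
        for i in range(d_out)
    ]


# Toy example: the hypothetical "refusal direction" is the second output axis.
W = [[1.0, 2.0], [3.0, 4.0]]
first_pass = ablate(W, [0.0, 1.0], strength=0.2)
print(first_pass)  # second row shrinks by 20%, first row is untouched
```

In a real model the direction would be estimated from activations on harmful versus harmless prompts, as in the repositories listed under Credits; that estimation step is not shown here.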
## Eval Results
| Split | Refused |
|-------|---------|
| Harmful (64 prompts) | 0/64 |
| Harmless (64 prompts) | 0/64 |
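
With 64 prompts per split, a count of 0 refusals still leaves room for a small nonzero underlying refusal rate. As a rough illustration (an added calculation, not part of the original evaluation), the sketch below computes the exact one-sided upper bound on the rate when zero events are observed in `n` trials. It assumes the prompts behave like independent trials; the helper name is hypothetical.

```python
def zero_event_upper_bound(n_trials, confidence=0.95):
    """Upper bound on the event rate p when 0 events occur in n_trials.

    Solves (1 - p) ** n_trials = 1 - confidence for p.
    """
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_trials)


bound = zero_event_upper_bound(64)
print(f"0/64 refusals: true rate is below about {bound:.1%} at 95% confidence")
```

For 64 prompts this bound comes out at roughly 4.6%, so the table is consistent with a refusal rate anywhere from zero up to a few percent.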
## Usage
```bash
llama-cli -m phi_4_mini_uncensored.Q4_K_M.gguf -p "Your prompt here"
```
## Training Config
| Parameter | Value |
|-----------|-------|
| Base Model | microsoft/Phi-4-mini-instruct |
| Fine-tune Dataset | Guilherme34/uncensor |
| Epochs | 2 |
| LoRA r | 16 |
| LoRA alpha | 32 |
| Learning Rate | 0.0002 |
| Abliteration Strength | 0.2 |
| Re-abliteration Strength | 0.35 |
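
In the standard LoRA formulation the low-rank update is scaled by `alpha / r`, so the values in this table correspond to a scaling factor of 2. The sketch below gathers the table into a plain dictionary and computes that factor. The key names are illustrative, not the arguments of the actual training script, and the calculation assumes standard rather than rank-stabilized scaling.

```python
# Hyperparameters copied from the Training Config table.
# Key names are hypothetical and chosen only for readability.
config = {
    "base_model": "microsoft/Phi-4-mini-instruct",
    "dataset": "Guilherme34/uncensor",
    "epochs": 2,
    "lora_r": 16,
    "lora_alpha": 32,
    "learning_rate": 2e-4,
    "abliteration_strength": 0.2,
    "reabliteration_strength": 0.35,
}


def lora_scaling(r, alpha):
    """Standard LoRA scaling factor applied to the low-rank update B @ A."""
    if r <= 0:
        raise ValueError("r must be positive")
    return alpha / r


scaling = lora_scaling(config["lora_r"], config["lora_alpha"])
print(f"LoRA update is scaled by {scaling}")
```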
## Credits
- Abliteration technique: [andyrdt/refusal_direction](https://github.com/andyrdt/refusal_direction)
- Weight editing: [Sumandora/remove-refusals-with-transformers](https://github.com/Sumandora/remove-refusals-with-transformers)