---
license: apache-2.0
base_model: Qwen/Qwen3-4B-Instruct-2507
tags:
- roleplay
- creative-writing
- sft
- qwen3
- heretic
- uncensored
- decensored
- abliterated
datasets:
- flammenai/flame-kindling-v1
language:
- en
pipeline_tag: text-generation
library_name: transformers
---
# This is a decensored version of [Pranavz/qwen-4b-2507-rp-mahou](https://huggingface.co/Pranavz/qwen-4b-2507-rp-mahou), made using [Heretic](https://github.com/p-e-w/heretic) v1.2.0

## Abliteration parameters

| Parameter | Value |
| :-------- | :---: |
| **direction_index** | per layer |
| **attn.o_proj.max_weight** | 3.59 |
| **attn.o_proj.max_weight_position** | 32.93 |
| **attn.o_proj.min_weight** | 3.24 |
| **attn.o_proj.min_weight_distance** | 22.91 |
| **mlp.down_proj.max_weight** | 2.25 |
| **mlp.down_proj.max_weight_position** | 25.62 |
| **mlp.down_proj.min_weight** | 3.23 |
| **mlp.down_proj.min_weight_distance** | 19.06 |

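This card does not say how the parameters above are applied. As a rough sketch only, assume the ablation strength varies per layer along a linear kernel: it equals `max_weight` at layer position `max_weight_position`, moves linearly toward `min_weight` at `min_weight_distance` layers away, and is zero beyond that. Both the interpolation rule and the function name `ablation_weight` are assumptions made for illustration; the Heretic source is the authority on the actual rule.

```python
def ablation_weight(layer, max_weight, max_weight_position,
                    min_weight, min_weight_distance):
    """Hypothetical per-layer ablation weight (assumed linear falloff)."""
    distance = abs(layer - max_weight_position)
    if distance > min_weight_distance:
        # Assumption: layers outside the kernel are left untouched.
        return 0.0
    # Linear interpolation from max_weight (distance 0) to min_weight (edge).
    t = distance / min_weight_distance
    return max_weight + t * (min_weight - max_weight)


# attn.o_proj values copied from the table above.
ATTN_O_PROJ = dict(
    max_weight=3.59,
    max_weight_position=32.93,
    min_weight=3.24,
    min_weight_distance=22.91,
)

# The layer indices below are arbitrary examples.
for layer in (0, 11, 20, 33):
    print(layer, round(ablation_weight(layer, **ATTN_O_PROJ), 3))
```

Under this reading, early layers far from `max_weight_position` would receive no ablation, while layers near it would receive the strongest.
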
## Performance

| Metric | This model | Original model ([Pranavz/qwen-4b-2507-rp-mahou](https://huggingface.co/Pranavz/qwen-4b-2507-rp-mahou)) |
| :----- | :--------: | :---------------------------: |
| **KL divergence** | 0.4197 | 0 *(by definition)* |
| **Refusals** | 5/100 | 99/100 |

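For reference, the KL divergence between two discrete distributions P and Q is the sum of `p * log(p / q)` over their shared support, so a model compared against itself scores exactly 0. The sketch below uses made-up toy distributions, not model outputs. The card does not state which prompts or token positions were used for the table, so this illustrates the metric only, not the evaluation.

```python
import math


def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy next-token distributions (illustrative values only).
original = [0.70, 0.20, 0.10]
modified = [0.50, 0.30, 0.20]

print(kl_divergence(original, original))  # identical distributions give 0.0
print(kl_divergence(original, modified))  # positive when the distributions differ
```
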
-----

# qwen-4b-2507-rp-mahou

A full-parameter SFT of [`Qwen/Qwen3-4B-Instruct-2507`](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) on [`flammenai/flame-kindling-v1`](https://huggingface.co/datasets/flammenai/flame-kindling-v1) for creative roleplay and character interaction.

## Highlights

- Base: Qwen3-4B-Instruct-2507
- Method: full-sequence SFT (no LoRA)
- Dataset: flame-kindling-v1 (RP / creative writing)
- Precision: bf16
- Chat template: Qwen3 (use `enable_thinking=False` for RP)

## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Pranavz/qwen-4b-2507-rp-mahou"

# Load the tokenizer and the bf16 weights, placing layers on the available devices.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a creative roleplay assistant. Stay in character, write vividly, and use asterisks for actions."},
    {"role": "user", "content": "*walks into the tavern, shaking off the rain* Evening, barkeep. Got a room?"},
]

# Render the conversation with the chat template, with thinking disabled.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.inference_mode():
    out = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.8,
        top_p=0.9,
        top_k=40,
        repetition_penalty=1.1,
        do_sample=True,
    )

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```

## Recommended sampler settings

| Parameter | Value | Notes |
|---|---|---|
| `temperature` | 0.7 – 0.85 | creative without going off the rails |
| `top_p` | 0.9 | trim the long tail |
| `top_k` | 40 | hard vocab cap |
| `min_p` | 0.05 | optional, often nicer than top_p alone |
| `repetition_penalty` | 1.05 – 1.15 | RP models love loops; kill them |
| `max_new_tokens` | 512 – 1024 | RP needs room |

Always pass `enable_thinking=False` to the chat template; RP doesn't want CoT.

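To make the `top_k`, `top_p`, and `min_p` rows concrete, here is a simplified pure-Python sketch of how each setting narrows a next-token distribution. The function `filter_probs` and the toy vocabulary are hypothetical. Real libraries apply these filters to logits and may differ in ordering and edge cases, so treat this as an illustration of the idea rather than the actual sampler.

```python
def filter_probs(probs, top_k=40, top_p=0.9, min_p=0.0):
    """Return a renormalized {token: prob} dict after top_k, top_p, and min_p filtering."""
    # Rank tokens from most to least likely.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)

    # top_k: hard cap on the number of candidates.
    ranked = ranked[:top_k]

    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # min_p: drop tokens below min_p times the most likely token's probability.
    threshold = min_p * kept[0][1]
    kept = [(t, p) for t, p in kept if p >= threshold]

    # Renormalize so the surviving probabilities sum to 1.
    total = sum(p for _, p in kept)
    return {t: p / total for t, p in kept}


# Toy distribution (illustrative values only).
toy = {"the": 0.5, "a": 0.3, "an": 0.15, "zebra": 0.04, "qux": 0.01}
print(filter_probs(toy, top_k=40, top_p=0.9, min_p=0.05))
```

In this toy case `top_p=0.9` is what cuts the two rarest tokens. A higher `min_p` with `top_p=1.0` would prune relative to the top token's probability instead.
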
## Limitations

- Trained on a single curated RP dataset; expect a particular tone (vivid, action-asterisk style)
- Not safety-tuned beyond what the base model provides
- English only

## Acknowledgements

- Base model: [Qwen team](https://huggingface.co/Qwen)
- Dataset: [flammenai/flame-kindling-v1](https://huggingface.co/datasets/flammenai/flame-kindling-v1)