Model: Pranavz/qwen-4b-2507-rp-mahou Source: Original Platform
| license | base_model | tags | datasets | language | pipeline_tag | library_name |
|---|---|---|---|---|---|---|
| apache-2.0 | Qwen/Qwen3-4B-Instruct-2507 | | | | text-generation | transformers |
qwen-4b-2507-rp-mahou
A full-parameter supervised fine-tune (SFT) of Qwen/Qwen3-4B-Instruct-2507 on flammenai/flame-kindling-v1 for creative roleplay and character interaction.
Highlights
- Base: Qwen3-4B-Instruct-2507
- Method: full-sequence SFT (no LoRA)
- Dataset: flame-kindling-v1 (RP / creative writing)
- Precision: bf16
- Chat template: Qwen3 (use enable_thinking=False for RP)
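To make the chat template bullet concrete, here is a rough sketch of the ChatML-style layout that Qwen chat templates produce. The helper name render_chatml is hypothetical and the output only approximates the real template. In practice, let tokenizer.apply_chat_template build the prompt.

```python
# Illustrative only: approximates the ChatML layout used by Qwen chat templates.
# In real use, call tokenizer.apply_chat_template instead of this helper.

def render_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as ChatML-style text."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


example = render_chatml([
    {"role": "system", "content": "You are a tavern keeper."},
    {"role": "user", "content": "Got a room?"},
])
print(example)
```

Seeing the raw layout can help when debugging prompts, but the tokenizer's own template remains the reference.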
Usage
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Pranavz/qwen-4b-2507-rp-mahou"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a creative roleplay assistant. Stay in character, write vividly, and use asterisks for actions."},
    {"role": "user", "content": "*walks into the tavern, shaking off the rain* Evening, barkeep. Got a room?"},
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.inference_mode():
    out = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.8,
        top_p=0.9,
        top_k=40,
        repetition_penalty=1.1,
        do_sample=True,
    )

print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
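The snippet above handles a single exchange. For a longer roleplay session the message list has to grow turn by turn, and eventually be trimmed to fit the context window. The following is a minimal sketch of that bookkeeping. The helper names add_turn and trim_history and the default max_turns=20 are assumptions for illustration, not part of this model or of transformers.

```python
def add_turn(messages, role, content):
    """Return a new history list with one more turn appended."""
    return messages + [{"role": role, "content": content}]


def trim_history(messages, max_turns=20):
    """Keep all system messages plus the most recent `max_turns` other messages."""
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    if max_turns <= 0:
        # Guard: others[-0:] would return the whole list, not an empty one.
        return system
    return system + others[-max_turns:]


history = [{"role": "system", "content": "You are a tavern keeper."}]
history = add_turn(history, "user", "Got a room?")
history = add_turn(history, "assistant", "*wipes the counter* Aye, one left.")
history = add_turn(history, "user", "How much?")
history = trim_history(history, max_turns=2)
print([m["role"] for m in history])
```

After each generation, the decoded reply would be appended with add_turn(history, "assistant", reply) and the trimmed list passed back to apply_chat_template. Keeping the system message through trimming preserves the character setup.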
Recommended sampler settings
| Parameter | Value | Notes |
|---|---|---|
| temperature | 0.7 – 0.85 | creative without going off the rails |
| top_p | 0.9 | trim the long tail |
| top_k | 40 | hard vocab cap |
| min_p | 0.05 | optional, often nicer than top_p alone |
| repetition_penalty | 1.05 – 1.15 | RP models love loops; kill them |
| max_new_tokens | 512 – 1024 | RP needs room |
Always pass enable_thinking=False to the chat template; roleplay output should not include chain-of-thought (CoT) reasoning.
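One way to keep the sampler settings in the table above consistent across scripts is to build them in one place. The sketch below is a hypothetical convenience helper, not something the model ships with. The name build_sampler_kwargs and the range check are assumptions, and min_p is only accepted by transformers versions that support that parameter.

```python
# Ranges copied from the sampler table; the helper itself is a hypothetical convenience.
RECOMMENDED_RANGES = {
    "temperature": (0.7, 0.85),
    "repetition_penalty": (1.05, 1.15),
    "max_new_tokens": (512, 1024),
}


def build_sampler_kwargs(temperature=0.8, repetition_penalty=1.1,
                         max_new_tokens=512, use_min_p=False):
    """Build generate() keyword arguments, rejecting values outside the table's ranges."""
    chosen = {
        "temperature": temperature,
        "repetition_penalty": repetition_penalty,
        "max_new_tokens": max_new_tokens,
    }
    for name, value in chosen.items():
        low, high = RECOMMENDED_RANGES[name]
        if not low <= value <= high:
            raise ValueError(
                f"{name}={value} is outside the recommended range {low}-{high}"
            )
    # Fixed values from the table, plus sampling switched on.
    kwargs = dict(chosen, do_sample=True, top_p=0.9, top_k=40)
    if use_min_p:
        kwargs["min_p"] = 0.05  # optional per the table
    return kwargs


sampler = build_sampler_kwargs(temperature=0.75, use_min_p=True)
print(sampler)
# Intended use, given a loaded model and tokenized inputs:
# out = model.generate(**inputs, **sampler)
```

Raising on out-of-range values is a deliberate choice for this sketch. A script that wants to experiment outside the table could simply drop the check.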
Limitations
- Trained on a single curated RP dataset; expect a particular tone (vivid, action-asterisk style)
- Not safety-tuned beyond what the base model provides
- English only
Acknowledgements
- Base model: Qwen team
- Dataset: flammenai/flame-kindling-v1