ModelHub XC c2e3f0b8bd Initialize project; model provided by the ModelHub XC community
Model: Pranavz/qwen-4b-2507-rp-mahou
Source: Original Platform
2026-06-16 07:47:16 +08:00

---
license: apache-2.0
base_model: Qwen/Qwen3-4B-Instruct-2507
tags:
- roleplay
- creative-writing
- sft
- qwen3
datasets:
- flammenai/flame-kindling-v1
language:
- en
pipeline_tag: text-generation
library_name: transformers
---
# qwen-4b-2507-rp-mahou
A full-parameter SFT of [`Qwen/Qwen3-4B-Instruct-2507`](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) on [`flammenai/flame-kindling-v1`](https://huggingface.co/datasets/flammenai/flame-kindling-v1) for creative roleplay and character interaction.
## Highlights
- Base: Qwen3-4B-Instruct-2507
- Method: full-parameter SFT (no LoRA)
- Dataset: flame-kindling-v1 (RP / creative writing)
- Precision: bf16
- Chat template: Qwen3 (use `enable_thinking=False` for RP)
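
For orientation, the Qwen3 chat template listed above follows the ChatML layout, where each turn is wrapped in `<|im_start|>` and `<|im_end|>` markers. The sketch below is a simplified approximation of that layout, not the template shipped with the tokenizer: `render_chatml` is a hypothetical helper, and the real template may differ in details such as tool handling. In practice, call `tokenizer.apply_chat_template` instead.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def render_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Approximate the ChatML layout (hypothetical helper, simplified)."""
    parts = []
    for message in messages:
        # Each turn: start marker, role, newline, content, end marker, newline.
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


demo_messages = [
    {"role": "system", "content": "You are a creative roleplay assistant."},
    {"role": "user", "content": "Evening, barkeep. Got a room?"},
]
demo_prompt = render_chatml(demo_messages)
print(demo_prompt)
```

Printing the rendered prompt this way can help when debugging why a reply ignores the system prompt or continues the wrong turn.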
## Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Pranavz/qwen-4b-2507-rp-mahou"

# Load the tokenizer and the bf16 weights.
# device_map="auto" requires the `accelerate` package.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a creative roleplay assistant. Stay in character, write vividly, and use asterisks for actions."},
    {"role": "user", "content": "*walks into the tavern, shaking off the rain* Evening, barkeep. Got a room?"},
]

# Render the conversation with the chat template and open the assistant turn.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Sample a reply without tracking gradients.
with torch.inference_mode():
    out = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.8,
        top_p=0.9,
        top_k=40,
        repetition_penalty=1.1,
        do_sample=True,
    )

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
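
The example above covers a single exchange. For a longer roleplay session, one possible pattern is to keep the full message list and append each reply before the next turn. This is a minimal sketch, not part of the model's API: `chat_turn` and `generate_reply` are hypothetical names, and `echo_barkeep` is a stand-in so the snippet runs without loading the model. In real use, `generate_reply` would wrap the `apply_chat_template`, `generate`, and `decode` steps shown above.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def chat_turn(
    history: List[Message],
    user_text: str,
    generate_reply: Callable[[List[Message]], str],
) -> str:
    """Append the user turn, get a reply, and record it in the history."""
    history.append({"role": "user", "content": user_text})
    reply = generate_reply(history)
    # Storing the reply lets the next turn see the whole conversation.
    history.append({"role": "assistant", "content": reply})
    return reply


def echo_barkeep(messages: List[Message]) -> str:
    """Stand-in for the model call (assumption: replace with real generation)."""
    return f"*wipes the counter* You said: {messages[-1]['content']}"


history: List[Message] = [
    {"role": "system", "content": "You are a creative roleplay assistant."},
]
first_reply = chat_turn(history, "Evening, barkeep. Got a room?", echo_barkeep)
print(first_reply)
```

Long sessions will eventually exceed the context window, so older turns may need trimming or summarizing; that policy is left out of this sketch.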
## Recommended sampler settings
| Parameter | Value | Notes |
|---|---|---|
| `temperature` | 0.7-0.85 | creative without going off-rails |
| `top_p` | 0.9 | trim the long tail |
| `top_k` | 40 | hard vocab cap |
| `min_p` | 0.05 | optional, often nicer than top_p alone |
| `repetition_penalty` | 1.05-1.15 | RP models love loops; kill them |
| `max_new_tokens` | 512-1024 | RP needs room |

Always pass `enable_thinking=False` to the chat template; RP doesn't want CoT.
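
The settings in the table can be collected into a single dictionary and unpacked into `model.generate(**inputs, **sampler_kwargs)`. The sketch below is one way to do that: `build_sampler_kwargs` and `outside_recommended` are hypothetical helpers, the defaults are midpoints picked for illustration, and `min_p` is assumed to be supported by the installed `transformers` version.

```python
from typing import Any, Dict, List

# Recommended ranges copied from the table: (low, high), inclusive.
RECOMMENDED_RANGES = {
    "temperature": (0.7, 0.85),
    "repetition_penalty": (1.05, 1.15),
    "max_new_tokens": (512, 1024),
}


def build_sampler_kwargs(
    temperature: float = 0.8,
    repetition_penalty: float = 1.1,
    max_new_tokens: int = 512,
    use_min_p: bool = False,
) -> Dict[str, Any]:
    """Collect generation keyword arguments (hypothetical helper)."""
    kwargs: Dict[str, Any] = {
        "do_sample": True,
        "temperature": temperature,
        "top_p": 0.9,
        "top_k": 40,
        "repetition_penalty": repetition_penalty,
        "max_new_tokens": max_new_tokens,
    }
    if use_min_p:
        # Optional per the table; assumes the installed version accepts min_p.
        kwargs["min_p"] = 0.05
    return kwargs


def outside_recommended(kwargs: Dict[str, Any]) -> List[str]:
    """Return the names of settings that fall outside the table's ranges."""
    return [
        name
        for name, (low, high) in RECOMMENDED_RANGES.items()
        if not low <= kwargs[name] <= high
    ]


sampler_kwargs = build_sampler_kwargs(temperature=0.75, use_min_p=True)
print(sampler_kwargs)
print(outside_recommended(sampler_kwargs))
```

The range check only reports deviations rather than rejecting them, since the table gives recommendations, not hard limits.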
## Limitations
- Trained on a single curated RP dataset; expect a particular tone (vivid, action-asterisk style)
- Not safety-tuned beyond what the base model provides
- English only
## Acknowledgements
- Base model: [Qwen team](https://huggingface.co/Qwen)
- Dataset: [flammenai/flame-kindling-v1](https://huggingface.co/datasets/flammenai/flame-kindling-v1)