ModelHub XC c2e3f0b8bd Initialize project; model provided by the ModelHub XC community

Model: Pranavz/qwen-4b-2507-rp-mahou
Source: Original Platform

2026-06-16 07:47:16 +08:00

qwen-4b-2507-rp-mahou

A full-parameter supervised fine-tune (SFT) of Qwen/Qwen3-4B-Instruct-2507 on the flammenai/flame-kindling-v1 dataset, aimed at creative roleplay (RP) and character interaction.

Highlights

Base: Qwen3-4B-Instruct-2507
Method: full-parameter SFT (no LoRA)
Dataset: flame-kindling-v1 (RP / creative writing)
Precision: bf16
Chat template: Qwen3 (use enable_thinking=False for RP)
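
As a rough sizing aid for the bf16 precision listed above, the weights-only memory footprint can be estimated from the parameter count. This is a minimal sketch: the figure of about 4 billion parameters is an assumption read off the "4B" in the model name, and the helper `weight_memory_gib` is hypothetical. Activations and the KV cache need additional memory on top of this.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the weights alone, in GiB.

    bf16 stores each parameter in 2 bytes, hence the default.
    """
    return num_params * bytes_per_param / 2**30


# Assumption: roughly 4e9 parameters, inferred from the "4B" in the model name.
approx_gib = weight_memory_gib(4e9)
print(f"about {approx_gib:.2f} GiB for the weights in bf16")
```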

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Pranavz/qwen-4b-2507-rp-mahou"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a creative roleplay assistant. Stay in character, write vividly, and use asterisks for actions."},
    {"role": "user", "content": "*walks into the tavern, shaking off the rain* Evening, barkeep. Got a room?"},
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.inference_mode():
    out = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.8,
        top_p=0.9,
        top_k=40,
        repetition_penalty=1.1,
        do_sample=True,
    )

# Slice off the prompt tokens so only the newly generated reply is decoded.
print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
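
The snippet above handles a single exchange. For an ongoing roleplay session, the whole message history is usually sent again on every turn. The sketch below shows one way to keep that history; `chat_turn` and `echo_reply` are hypothetical names, and `echo_reply` is only a stand-in so the example runs without loading the model.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def chat_turn(
    history: List[Message],
    user_text: str,
    generate_reply: Callable[[List[Message]], str],
) -> str:
    """Append the user's message, generate a reply, and record it in the history."""
    history.append({"role": "user", "content": user_text})
    reply = generate_reply(history)
    history.append({"role": "assistant", "content": reply})
    return reply


def echo_reply(messages: List[Message]) -> str:
    """Stand-in generator for illustration only; it does not call the model."""
    return f"*nods* You said: {messages[-1]['content']}"


history = [{"role": "system", "content": "You are a creative roleplay assistant."}]
chat_turn(history, "Evening, barkeep.", echo_reply)
chat_turn(history, "Got a room?", echo_reply)
print(len(history))  # 5: one system message plus two user/assistant pairs
```

In real use, `generate_reply` would wrap the `apply_chat_template` and `model.generate` calls from the usage snippet and return the decoded reply.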

Recommended sampler settings

Parameter	Value	Notes
`temperature`	0.7 – 0.85	creative without going off the rails
`top_p`	0.9	trims the long tail of unlikely tokens
`top_k`	40	hard cap on the number of candidate tokens
`min_p`	0.05	optional; often works better than `top_p` alone
`repetition_penalty`	1.05 – 1.15	RP models tend to loop; this suppresses it
`max_new_tokens`	512 – 1024	RP replies need room
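
The table can be turned into a reusable set of generation arguments. The following is a minimal sketch, not part of the original card: `sampler_kwargs` and `RANGES` are hypothetical names, and `min_p` is assumed to be accepted by the installed `transformers` version (older releases may reject it).

```python
# Allowed ranges copied from the table above.
RANGES = {
    "temperature": (0.7, 0.85),
    "repetition_penalty": (1.05, 1.15),
    "max_new_tokens": (512, 1024),
}


def sampler_kwargs(temperature=0.8, repetition_penalty=1.1,
                   max_new_tokens=512, use_min_p=False):
    """Build keyword arguments for generate(), rejecting values outside the table's ranges."""
    chosen = {
        "temperature": temperature,
        "repetition_penalty": repetition_penalty,
        "max_new_tokens": max_new_tokens,
    }
    for name, value in chosen.items():
        low, high = RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside the recommended range {low} to {high}")
    kwargs = {"do_sample": True, "top_p": 0.9, "top_k": 40, **chosen}
    if use_min_p:
        # Optional per the table.
        kwargs["min_p"] = 0.05
    return kwargs


print(sorted(sampler_kwargs(use_min_p=True)))
```

With the objects from the usage snippet, this would be passed as `model.generate(**inputs, **sampler_kwargs(use_min_p=True))`.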

Always pass enable_thinking=False to the chat template; roleplay output should not include chain-of-thought (CoT) reasoning.
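
To make the notes on `top_k`, `top_p`, and `min_p` more concrete, here is a simplified, illustrative sketch of how the three filters narrow a toy next-token distribution. It is an assumption-laden teaching aid: real inference libraries operate on logits over the full vocabulary, and their filter order and edge-case handling may differ. The function `filter_probs` and the toy probabilities are hypothetical.

```python
def _renormalize(ranked):
    """Rescale the kept (token, prob) pairs so they sum to 1."""
    total = sum(p for _, p in ranked)
    return [(t, p / total) for t, p in ranked]


def filter_probs(probs, top_k=0, top_p=1.0, min_p=0.0):
    """Apply top-k, then top-p, then min-p to a token -> probability dict."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    if top_k > 0:
        # Hard cap: keep only the k most likely tokens.
        ranked = _renormalize(ranked[:top_k])
    if top_p < 1.0:
        # Keep the smallest prefix whose cumulative probability reaches top_p.
        kept, cumulative = [], 0.0
        for token, p in ranked:
            kept.append((token, p))
            cumulative += p
            if cumulative >= top_p:
                break
        ranked = _renormalize(kept)
    if min_p > 0.0:
        # Drop tokens less likely than min_p times the most likely token.
        threshold = min_p * ranked[0][1]
        ranked = _renormalize([(t, p) for t, p in ranked if p >= threshold])
    return dict(ranked)


toy = {"a": 0.6, "b": 0.25, "c": 0.1, "d": 0.04, "e": 0.01}
print(list(filter_probs(toy, top_k=4, top_p=0.9)))  # ['a', 'b', 'c']
print(list(filter_probs(toy, min_p=0.05)))          # ['a', 'b', 'c', 'd']
```

In the second call the threshold is 0.05 × 0.6 = 0.03, so "d" (0.04) survives and "e" (0.01) is dropped. This shows how `min_p` scales with the model's confidence instead of using a fixed cumulative cutoff.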

Limitations

Trained on a single curated RP dataset; expect a particular tone (vivid, action-asterisk style)
Not safety-tuned beyond what the base model provides
English only

Acknowledgements

Base model: Qwen team
Dataset: flammenai/flame-kindling-v1
