Initialize project; model provided by the ModelHub XC community

Model: Pranavz/qwen-4b-2507-rp-mahou Source: Original Platform
2026-06-16 07:47:16 +08:00
commit c2e3f0b8bd
9 changed files with 313 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,94 @@
+---
+license: apache-2.0
+base_model: Qwen/Qwen3-4B-Instruct-2507
+tags:
+  - roleplay
+  - creative-writing
+  - sft
+  - qwen3
+datasets:
+  - flammenai/flame-kindling-v1
+language:
+  - en
+pipeline_tag: text-generation
+library_name: transformers
+---
+
+# qwen-4b-2507-rp-mahou
+
+A full-parameter SFT of [`Qwen/Qwen3-4B-Instruct-2507`](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) on [`flammenai/flame-kindling-v1`](https://huggingface.co/datasets/flammenai/flame-kindling-v1) for creative roleplay and character interaction.
+
+## Highlights
+
+- Base: Qwen3-4B-Instruct-2507
+- Method: full-parameter SFT (no LoRA)
+- Dataset: flame-kindling-v1 (RP / creative writing)
+- Precision: bf16
+- Chat template: Qwen3 (use `enable_thinking=False` for RP)
+
+## Usage
+
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+MODEL_ID = "Pranavz/qwen-4b-2507-rp-mahou"
+
+tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
+model = AutoModelForCausalLM.from_pretrained(
+    MODEL_ID,
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+)
+
+messages = [
+    {"role": "system", "content": "You are a creative roleplay assistant. Stay in character, write vividly, and use asterisks for actions."},
+    {"role": "user", "content": "*walks into the tavern, shaking off the rain* Evening, barkeep. Got a room?"},
+]
+
+text = tokenizer.apply_chat_template(
+    messages,
+    tokenize=False,
+    add_generation_prompt=True,
+    enable_thinking=False,
+)
+inputs = tokenizer(text, return_tensors="pt").to(model.device)
+
+with torch.inference_mode():
+    out = model.generate(
+        **inputs,
+        max_new_tokens=512,
+        temperature=0.8,
+        top_p=0.9,
+        top_k=40,
+        repetition_penalty=1.1,
+        do_sample=True,
+    )
+
+print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
+```
+
+## Recommended sampler settings
+
+| Parameter | Value | Notes |
+|---|---|---|
+| `temperature` | 0.7 – 0.85 | creative without going off-rails |
+| `top_p` | 0.9 | trim the long tail |
+| `top_k` | 40 | sample only from the 40 most likely tokens |
+| `min_p` | 0.05 | optional, often nicer than top_p alone |
+| `repetition_penalty` | 1.05 – 1.15 | RP models love loops — kill them |
+| `max_new_tokens` | 512 – 1024 | RP needs room |
+
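+Pass `enable_thinking=False` to the chat template, since RP doesn't want CoT. The base model, Qwen3-4B-Instruct-2507, is documented as non-thinking only, so the flag should be a harmless no-op; passing it just keeps the intent explicit.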
+
+
+## Limitations
+
+- Trained on a single curated RP dataset; expect a particular tone (vivid, action-asterisk style)
+- Not safety-tuned beyond what the base model provides
+- English only
+
+## Acknowledgements
+
+- Base model: [Qwen team](https://huggingface.co/Qwen)
+- Dataset: [flammenai/flame-kindling-v1](https://huggingface.co/datasets/flammenai/flame-kindling-v1)
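
The sampler table in the README above names `top_k`, `top_p`, and `min_p` without showing what each one removes. The following is a minimal, standard-library sketch of the usual textbook definitions, not the `transformers` implementation. The helper names and the toy distribution are hypothetical and invented for illustration, and real inference stacks may apply the filters in a different order or on logits rather than probabilities.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; a lower temperature sharpens the peak."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def renormalize(probs):
    """Rescale the surviving probabilities so they sum to 1 again."""
    total = sum(probs.values())
    return {tok: p / total for tok, p in probs.items()}


def apply_top_k(probs, k):
    """Keep only the k most likely tokens."""
    kept = sorted(probs, key=probs.get, reverse=True)[:k]
    return renormalize({tok: probs[tok] for tok in kept})


def apply_top_p(probs, top_p):
    """Keep the smallest set of top tokens whose cumulative mass reaches top_p."""
    kept, cumulative = {}, 0.0
    for tok in sorted(probs, key=probs.get, reverse=True):
        kept[tok] = probs[tok]
        cumulative += probs[tok]
        if cumulative >= top_p:
            break
    return renormalize(kept)


def apply_min_p(probs, min_p):
    """Keep tokens at least min_p times as likely as the single best token."""
    threshold = min_p * max(probs.values())
    return renormalize({tok: p for tok, p in probs.items() if p >= threshold})


# Hypothetical next-token distribution, made up for this example.
toy = {"the": 0.5, "a": 0.3, "rain": 0.15, "zebra": 0.04, "qux": 0.01}

print("top_k=3    ->", sorted(apply_top_k(toy, 3)))
print("top_p=0.9  ->", sorted(apply_top_p(toy, 0.9)))
print("min_p=0.05 ->", sorted(apply_min_p(toy, 0.05)))
```

On this toy distribution, `top_k=3` and `top_p=0.9` both keep `the`, `a`, and `rain`, while `min_p=0.05` also keeps `zebra` because 0.04 is above the threshold of 0.05 × 0.5 = 0.025. The `min_p` cutoff scales with the model's confidence: when the top token is very likely the threshold rises and fewer alternatives survive, which is one plausible reading of the README's note that it is "often nicer than top_p alone".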