Model: zzoceanpie/Qwen3-1.7B-Yukari-SFT-v2
Source: Original Platform
2026-06-16 04:34:18 +08:00

---
base_model: Qwen/Qwen3-1.7B
library_name: transformers
tags:
- qlora
- roleplay
- qwen
- sft
- touhou
- character-roleplay
- synthetic-data
- generated_from_trainer
language:
- zh
license: apache-2.0
---
# Qwen3-1.7B-Yukari-SFT-v2
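Yukari Yakumo (八云紫) role-play model, improved SFT version.
Building on v1 (r=8), this version raises the LoRA rank to 32 and switches to 1,233 SFT samples that use only the original text, in order to fix the aggressive-output problem.
## Model information
| Item | Value |
|------|-----|
| Base model | [Qwen/Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B) |
| Method | QLoRA 4-bit NF4 (double quantization), bf16 |
| LoRA | **r=32, alpha=64**, target: q/k/v/o/gate/up/down proj |
| Data | 1,233 synthetic SFT dialogues (MiMo-V2.5-Pro, original text only, no emotion variants) |
| Training | 1 epoch, seed=478 |
| Format | Merged bf16 safetensors + GGUF F16 |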
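
The quantization and LoRA settings listed in the table above could be expressed roughly as follows with `transformers` and `peft`. This is a minimal sketch, not the author's training script: the table does not state dropout, bias handling, learning rate, or batch size, so those are left at library defaults here, and the variable names `bnb_config` and `lora_config` are illustrative.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization with double quantization and bf16 compute,
# as listed in the table. Requires the bitsandbytes package.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter settings from the table (r=32, alpha=64, all attention and
# MLP projections). Undocumented hyperparameters are left at peft defaults,
# which is an assumption.
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```

`bnb_config` would typically be passed as `quantization_config` to `AutoModelForCausalLM.from_pretrained`, and `lora_config` to `peft.get_peft_model`.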
## Model series
| Model | LoRA | Method | Version |
|------|------|------|------|
| [Qwen3-1.7B-Yukari-SFT](https://hf.co/zzoceanpie/Qwen3-1.7B-Yukari-SFT) | r=8, a=8 | SFT | v1 |
| **Qwen3-1.7B-Yukari-SFT-v2** (this model) | r=32, a=64 | SFT | v2 |
| [Qwen3-1.7B-Yukari-DPO](https://hf.co/zzoceanpie/Qwen3-1.7B-Yukari-DPO) | r=32, a=64 | SFT+DPO | v2+DPO |
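
To make the difference between the `r=8, a=8` and `r=32, a=64` rows concrete, the sketch below computes two quantities under standard LoRA conventions: the scaling factor `alpha / r` applied to the adapter update, and the number of trainable parameters a rank-`r` adapter adds to one weight matrix. It assumes plain LoRA scaling (not a variant such as rsLoRA), and the 2048 x 2048 matrix shape is a hypothetical example, not a value taken from this model card.

```python
def lora_scale(r: int, alpha: int) -> float:
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return alpha / r


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for one adapted matrix: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Hypothetical square projection, used only for illustration.
D = 2048

for name, r, alpha in [("v1", 8, 8), ("v2", 32, 64)]:
    print(name, "scale:", lora_scale(r, alpha), "params per matrix:", lora_params(D, D, r))
```

Because the parameter count is linear in `r`, going from rank 8 to rank 32 gives four times as many adapter parameters per adapted matrix, whatever the matrix shape.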
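## Usage
The model input format is `[emotion tags]\nuser input`; the tags control the tone of Yukari Yakumo's reply. The tags are a discretized 8-dimensional Plutchik emotion vector: 喜悦 (joy) / 愤怒 (anger) / 悲伤 (sadness) / 恐惧 (fear) / 厌恶 (disgust) / 惊讶 (surprise) / 信任 (trust) / 期待 (anticipation). Each dimension takes one of eight levels: 无 (none), 极轻微 (very slight), 轻微 (slight), 中轻度 (mild to moderate), 中度 (moderate), 较强烈 (fairly strong), 强烈 (strong), 极强烈 (very strong).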
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "zzoceanpie/Qwen3-1.7B-Yukari-SFT-v2"
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Emotion tags go on the first line, the user message on the second.
tags = "[<|中度喜悦|><|轻微愤怒|><|轻微悲伤|><|无恐惧|><|无厌恶|><|无惊讶|><|轻微信任|><|中度期待|>]"
prompt = f"{tags}\n你觉得人类最有趣的地方是什么?"  # "What do you find most interesting about humans?"
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Sample explicitly so that temperature is applied regardless of the default generation config.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.8)
# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```
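
The tag string does not have to be written by hand. The helper below is a minimal sketch that assembles it from a mapping of emotion to level. It assumes, based only on the single example above, that all eight tags must appear in the fixed order given in this section and that each tag has the form `<|level+emotion|>`. The function name `build_tags` and the choice to default unspecified emotions to 无 are illustrative, not part of the model's documented interface.

```python
# Emotion names and level names as listed in this section.
EMOTIONS = ["喜悦", "愤怒", "悲伤", "恐惧", "厌恶", "惊讶", "信任", "期待"]
LEVELS = ["无", "极轻微", "轻微", "中轻度", "中度", "较强烈", "强烈", "极强烈"]


def build_tags(levels=None):
    """Build the bracketed tag string; emotions not given default to 无 (assumption)."""
    levels = dict(levels or {})
    unknown = set(levels) - set(EMOTIONS)
    if unknown:
        raise ValueError(f"unknown emotion(s): {sorted(unknown)}")
    parts = []
    for emotion in EMOTIONS:
        level = levels.get(emotion, "无")
        if level not in LEVELS:
            raise ValueError(f"unknown level for {emotion}: {level}")
        parts.append(f"<|{level}{emotion}|>")
    return "[" + "".join(parts) + "]"


# Reproduces the tag string used in the example above.
tags = build_tags({"喜悦": "中度", "愤怒": "轻微", "悲伤": "轻微", "信任": "轻微", "期待": "中度"})
prompt = f"{tags}\n你觉得人类最有趣的地方是什么?"
print(prompt)
```

Validating the level names up front catches typos before they reach the model, where a malformed tag would probably be treated as ordinary text rather than rejected.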
### GGUF
```python
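from llama_cpp import Llama

# Path to the downloaded GGUF file; n_gpu_layers=-1 offloads all layers to the GPU.
model = Llama(model_path="Qwen3-1.7B-Yukari-SFT-f16.gguf", n_ctx=2048, n_gpu_layers=-1)

# Same "[emotion tags]\nuser input" prompt as in the transformers example.
tags = "[<|中度喜悦|><|轻微愤怒|><|轻微悲伤|><|无恐惧|><|无厌恶|><|无惊讶|><|轻微信任|><|中度期待|>]"
prompt = f"{tags}\n你觉得人类最有趣的地方是什么?"
response = model.create_chat_completion(
    messages=[{"role": "user", "content": prompt}],
    max_tokens=200, temperature=0.8
)
print(response["choices"][0]["message"]["content"])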
```
## License
- Model weights: Apache 2.0
- Copyright of the character "Yukari Yakumo" (八云紫): (C) Team Shanghai Alice (上海爱丽丝幻乐团) / ZUN
- This model is a derivative fan work of Touhou Project and follows the [Touhou Project fan work guidelines](https://touhou-project.news/guidelines_en/).
- Training data: [yukari-synthetic](https://hf.co/datasets/zzoceanpie/yukari-synthetic) (Apache 2.0)
> This model is a fan-made derivative work based on Touhou Project.
> Touhou Project is (C) Team Shanghai Alice / ZUN.
> Please refer to the [official guidelines](https://touhou-project.news/guidelines_en/) for Touhou fan content.