ModelHub XC a01acf4f63 Initialize project; model provided by the ModelHub XC community
Model: zzoceanpie/Qwen3-1.7B-Yukari-SFT
Source: Original Platform
2026-06-16 04:53:16 +08:00

---
base_model: Qwen/Qwen3-1.7B
library_name: transformers
tags:
- qlora
- roleplay
- qwen
- sft
- touhou
- character-roleplay
- synthetic-data
- generated_from_trainer
language:
- zh
license: apache-2.0
---
# Qwen3-1.7B-Yukari-SFT
Yukari Yakumo (八云紫) role-play model: the initial SFT release (superseded by SFT-v2).
## Model information
| Item | Value |
|------|-----|
| Base model | [Qwen/Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B) |
| Method | QLoRA, 4-bit NF4 (double quantization), bf16 |
| LoRA | **r=8, alpha=8**, target modules: q/k/v/o/gate/up/down proj |
| Data | 3,691 synthetic SFT dialogues (generated by MiMo-V2.5-Pro) |
| Hardware | RTX 3060 Laptop 6GB |
| Loss | Decreased from 5.27 to 1.55 |
| Format | Merged bf16 safetensors + GGUF F16 |
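
The settings in the table map fairly directly onto `transformers` and `peft` configuration objects. The sketch below is an illustration only, not the training script used for this model: the names `QUANT_KWARGS`, `LORA_KWARGS`, and `build_configs` are hypothetical, the full module names (`q_proj`, etc.) are the usual expansion of the abbreviations in the table, and using bf16 as the 4-bit compute dtype and `CAUSAL_LM` as the task type are assumptions. Values not listed in the table (dropout, learning rate, epochs) are left at library defaults.

```python
# Hyperparameters stated in the table above
QUANT_KWARGS = {
    "load_in_4bit": True,               # QLoRA: base weights stored in 4 bits
    "bnb_4bit_quant_type": "nf4",       # NF4 quantization
    "bnb_4bit_use_double_quant": True,  # double quantization
}
LORA_KWARGS = {
    "r": 8,
    "lora_alpha": 8,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "task_type": "CAUSAL_LM",           # assumption: not stated in the card
}


def build_configs():
    """Build the quantization and LoRA configs (needs torch, transformers, peft)."""
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    # Assumption: "bf16" in the table refers to the compute dtype
    quant_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16, **QUANT_KWARGS
    )
    lora_config = LoraConfig(**LORA_KWARGS)
    return quant_config, lora_config
```

In a typical QLoRA setup, `quant_config` would be passed as `quantization_config=` to `AutoModelForCausalLM.from_pretrained`, and `lora_config` to `peft.get_peft_model`.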
## Model series
| Model | LoRA | Method | Version |
|------|------|------|------|
| **Qwen3-1.7B-Yukari-SFT** (this model) | r=8, a=8 | SFT | v1 |
| [Qwen3-1.7B-Yukari-SFT-v2](https://hf.co/zzoceanpie/Qwen3-1.7B-Yukari-SFT-v2) | r=32, a=64 | SFT | v2 |
| [Qwen3-1.7B-Yukari-DPO](https://hf.co/zzoceanpie/Qwen3-1.7B-Yukari-DPO) | r=32, a=64 | SFT+DPO | v2+DPO |
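
Standard LoRA scales the low-rank update by `alpha / r`, so the rank and alpha values in the table imply different effective scaling between v1 and v2. The arithmetic below assumes that standard scaling was used (rather than, for example, rank-stabilized LoRA), which this card does not state; `lora_scaling` is a hypothetical helper for illustration.

```python
# (r, alpha) pairs copied from the series table above
series = {
    "Qwen3-1.7B-Yukari-SFT": (8, 8),
    "Qwen3-1.7B-Yukari-SFT-v2": (32, 64),
    "Qwen3-1.7B-Yukari-DPO": (32, 64),
}


def lora_scaling(r: int, alpha: int) -> float:
    """Standard LoRA scaling factor applied to the low-rank update."""
    return alpha / r


for name, (r, alpha) in series.items():
    print(f"{name}: r={r}, alpha={alpha}, scaling={lora_scaling(r, alpha):.1f}")
```

Under this assumption, v1 applies its adapter update at scale 1.0 and the v2 models at scale 2.0, in addition to the fourfold increase in rank.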
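## Usage
The model input has the form `[emotion tags]\nuser input`, where the emotion tags control the tone of Yukari's reply. The tags discretize an 8-dimensional Plutchik emotion vector: joy (喜悦), anger (愤怒), sadness (悲伤), fear (恐惧), disgust (厌恶), surprise (惊讶), trust (信任), and anticipation (期待). Each dimension takes one of eight levels: none (无), very slight (极轻微), slight (轻微), mild to moderate (中轻度), moderate (中度), fairly strong (较强烈), strong (强烈), or extremely strong (极强烈). The tags themselves are written in Chinese, for example `<|中度喜悦|>` for moderate joy.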
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "zzoceanpie/Qwen3-1.7B-Yukari-SFT"
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Assemble the emotion tags (one tag per Plutchik dimension)
tags = "[<|中度喜悦|><|轻微愤怒|><|轻微悲伤|><|无恐惧|><|无厌恶|><|无惊讶|><|轻微信任|><|中度期待|>]"
# The user message asks: "What do you find most interesting about humans?"
prompt = f"{tags}\n你觉得人类最有趣的地方是什么?"
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Sample explicitly so that the temperature setting is applied
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.8)
# Decode only the newly generated tokens, not the echoed prompt
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```
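
The tag string in the example above can also be assembled programmatically. The helper below is a minimal sketch: `build_tags`, `EMOTIONS`, and `LEVELS` are hypothetical names, and it assumes that the tag order shown in the example (joy, anger, sadness, fear, disgust, surprise, trust, anticipation) is the expected one and that unspecified dimensions should default to 无 (none). Neither assumption is confirmed by this card.

```python
# Emotion dimensions in the order used by the example above
EMOTIONS = ["喜悦", "愤怒", "悲伤", "恐惧", "厌恶", "惊讶", "信任", "期待"]
# The eight intensity levels, from none to extremely strong
LEVELS = ["无", "极轻微", "轻微", "中轻度", "中度", "较强烈", "强烈", "极强烈"]


def build_tags(levels=None):
    """Return the bracketed tag string for a mapping of emotion -> level.

    Dimensions missing from the mapping default to "无" (assumption).
    """
    levels = dict(levels or {})
    unknown = set(levels) - set(EMOTIONS)
    if unknown:
        raise ValueError(f"unknown emotion(s): {sorted(unknown)}")
    parts = []
    for emotion in EMOTIONS:
        level = levels.get(emotion, "无")
        if level not in LEVELS:
            raise ValueError(f"unknown level for {emotion}: {level}")
        parts.append(f"<|{level}{emotion}|>")
    return "[" + "".join(parts) + "]"


# Reproduces the tag string from the example above
tags = build_tags({"喜悦": "中度", "愤怒": "轻微", "悲伤": "轻微", "信任": "轻微", "期待": "中度"})
print(tags)
```

Validating both the emotion names and the levels up front catches a misspelled tag early, since a malformed tag would otherwise reach the model as ordinary text.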
### GGUF
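```python
from llama_cpp import Llama

# Path to the locally downloaded GGUF file
model = Llama(model_path="Qwen3-1.7B-Yukari-SFT-f16.gguf", n_ctx=2048, n_gpu_layers=-1)

# Same prompt format as above: emotion tags, newline, user input
tags = "[<|中度喜悦|><|轻微愤怒|><|轻微悲伤|><|无恐惧|><|无厌恶|><|无惊讶|><|轻微信任|><|中度期待|>]"
prompt = f"{tags}\n你觉得人类最有趣的地方是什么?"
response = model.create_chat_completion(
    messages=[{"role": "user", "content": prompt}],
    max_tokens=200, temperature=0.8,
)
print(response["choices"][0]["message"]["content"])
```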
## License
- Model weights: Apache 2.0
- The character Yukari Yakumo (八云紫) is copyright (C) Team Shanghai Alice (上海爱丽丝幻乐团) / ZUN
- This model is a derivative fan work of Touhou Project and follows the [Touhou Project derivative work guidelines](https://touhou-project.news/guidelines_en/).
- Training data: [yukari-synthetic](https://hf.co/datasets/zzoceanpie/yukari-synthetic) (Apache 2.0)
> This model is a fan-made derivative work based on Touhou Project.
> Touhou Project is (C) Team Shanghai Alice / ZUN.
> Please refer to the [official guidelines](https://touhou-project.news/guidelines_en/) for Touhou fan content.