Initialize project; model provided by the ModelHub XC community
Model: HeteroCat/qwen3-0.6b-webnovel Source: Original Platform
134
README.md
Normal file
@@ -0,0 +1,134 @@
---
library_name: transformers
base_model: Qwen/Qwen3-0.6B
license: apache-2.0
tags:
- qwen3
- llama-factory
- chinese
- novel
- finetuned
- webnovel
pipeline_tag: text-generation
model-index:
- name: qwen3-0.6b-webnovel
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: webnovel_cn
      type: custom
    metrics:
    - type: loss
      value: 3.4621
      name: Validation Loss
---

# Qwen3-0.6B-WebNovel

## Model Overview

This is a Chinese web-novel generation model fine-tuned from [Qwen/Qwen3-0.6B](https://modelscope.cn/models/Qwen/Qwen3-0.6B). It was trained with LoRA on a Chinese web-novel dataset using the LLaMA-Factory framework.

## Model Features

- **Base model**: Qwen3-0.6B (about 596 million parameters)
- **Training method**: LoRA fine-tuning
- **Training data**: Chinese web-novel dataset
- **Model size**: about 2.3 GB (after merging)
- **Context length**: up to 40960 tokens
- **Language**: primarily Chinese

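As a rough sanity check on the parameter count and model size listed above, the sketch below estimates the weight size from the number of parameters. It assumes the file size is dominated by the weights and that 1 GB means 10^9 bytes; `checkpoint_size_gb` is an illustrative helper, not part of any library. Under those assumptions, about 2.3 GB is roughly consistent with weights stored as 32-bit floats, and a 16-bit copy would be about half that.

```python
def checkpoint_size_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate weight size in GB (10**9 bytes), ignoring file metadata."""
    return num_params * bytes_per_param / 1e9


NUM_PARAMS = 596e6  # "about 596 million parameters" from the feature list

fp32_gb = checkpoint_size_gb(NUM_PARAMS, 4)  # 4 bytes per 32-bit float
fp16_gb = checkpoint_size_gb(NUM_PARAMS, 2)  # 2 bytes per 16-bit float

print(f"fp32: {fp32_gb:.2f} GB, fp16: {fp16_gb:.2f} GB")
```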
## Usage

### Using Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model and tokenizer ("your-username" is a placeholder for the repo owner)
model_path = "your-username/qwen3-0.6b-webnovel"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

# Generate text
prompt = "请写一段小说开头:"  # "Please write the opening of a novel:"
# Move the inputs to the model's device so generation works with device_map="auto"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,  # passes both input_ids and attention_mask
    max_new_tokens=500,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,
)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### Using ModelScope

```python
from modelscope import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model and tokenizer ("your-username" is a placeholder for the repo owner)
model_path = "your-username/qwen3-0.6b-webnovel"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

# Generate text...
```

## Training Details

- **Learning rate**: 5e-05
- **Batch size**: 32 (8 x 4)
- **Epochs**: 3
- **Optimizer**: AdamW
- **Scheduler**: cosine with warmup (500 steps)
- **Mixed precision**: Native AMP
- **Training device**: GPU

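To make the scheduler setting concrete, here is a minimal sketch of a cosine learning-rate schedule with linear warmup, using the learning rate and warmup steps listed above. The total step count is an assumption derived from the figures reported in this card (about 135,000 training samples, batch size 32, 3 epochs); the actual run may differ, for example if sequences were packed or the last batch was dropped. `lr_at_step` is an illustrative helper, not a LLaMA-Factory API.

```python
import math

PEAK_LR = 5e-5
WARMUP_STEPS = 500
BATCH_SIZE = 32
EPOCHS = 3
TRAIN_SAMPLES = 135_000  # approximate, per the reported sample count

# Assumed number of optimizer steps: ceil(samples / batch) per epoch
TOTAL_STEPS = math.ceil(TRAIN_SAMPLES / BATCH_SIZE) * EPOCHS


def lr_at_step(step: int) -> float:
    """Linear warmup from 0 to PEAK_LR, then cosine decay to 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 250, 500, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, f"{lr_at_step(step):.2e}")
```

With these assumptions the warmup covers only about 4% of training, so most steps run on the cosine decay.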
## Performance Metrics

- **Final training loss**: 3.5011
- **Final validation loss**: 3.4621
- **Training samples**: about 135,000
- **Validation samples**: about 17,000

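A loss value is often easier to interpret as perplexity. Assuming the reported losses are mean per-token cross-entropy in nats (the usual convention for causal language model training, though this card does not state it), perplexity is the exponential of the loss:

```python
import math

train_loss = 3.5011  # final training loss reported above
val_loss = 3.4621    # final validation loss reported above

# Perplexity = exp(mean cross-entropy), valid only under the stated assumption
train_ppl = math.exp(train_loss)
val_ppl = math.exp(val_loss)

print(f"train perplexity: {train_ppl:.1f}")
print(f"validation perplexity: {val_ppl:.1f}")
```

Under that assumption the validation perplexity is roughly 32, meaning the model is on average about as uncertain as a uniform choice among some 32 tokens at each step.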
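## Usage Suggestions

1. **Fiction writing**: well suited to writing Chinese fiction and web novels
2. **Style continuation**: can continue stories in a variety of styles
3. **Creative inspiration**: can serve as a source of writing ideas
4. **Parameter tuning**:
   - Lower the temperature (0.5-0.7) for more stable output
   - Raise the temperature (0.7-1.0) for more creative output
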
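The effect of temperature can be illustrated without loading the model. The sketch below applies temperature scaling to a small set of made-up logits (the values are purely illustrative, not taken from this model): logits are divided by the temperature before the softmax, so lower temperatures concentrate probability on the top token and higher ones flatten the distribution.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax over logits / temperature, computed in a numerically stable way."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5, 0.0]  # hypothetical scores for four candidate tokens

for t in (0.5, 0.7, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

The probability of the top-scoring token shrinks as the temperature rises, which is why the higher range tends to produce more varied text.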
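## Limitations

- The model is intended mainly for entertainment and writing assistance; review generated content yourself
- The model may carry biases from its training data; use your own judgment
- Do not use it to generate illegal or non-compliant content
- Commercial use must comply with the Apache 2.0 license
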
## Acknowledgements

- Thanks to [Alibaba Cloud Tongyi Qianwen](https://modelscope.cn/organization/236563) for the base model
- Thanks to [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) for the training framework
- Thanks to all developers who contribute to the open-source community