Files
remnant-qwen3-8b/README-cn.md
ModelHub XC 32469acf9a 初始化项目,由ModelHub XC社区提供模型
Model: allura-org/remnant-qwen3-8b
Source: Original Platform
2026-06-01 14:09:56 +08:00

126 lines
2.8 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen3-8B-Base
tags:
- roleplay
- conversational
- axolotl
- qwen
---
# 残响 Qwen3 8B第一系列
[English](./README.md) | [简体中文](./README-cn.md)
*空气中飘浮着一缕尘埃。它仿佛来自某个逝去的时代,但你无从追溯。它落在你的舌尖,滋味奇妙。*
![image/png](https://cdn-uploads.huggingface.co/production/uploads/634262af8d8089ebaefd410e/_ovgodU331FO4YAqFGCnk.png)
「残响」是一系列专注于SFW与NSFW角色扮演及对话的微调大语言模型。
## 量化版本
GGUF:
- 待补充!
EXL3:
- 待补充!
EXL2:
- 待补充!
其他格式:
- 待补充!
## 推荐参数
对话模板: ChatML
采样器设置:
- 温度值 `0.8`
- 最小概率阈值 `0.1`
- 存在惩罚 `0.5`
## 致谢
特别感谢Allura和ilya <3
衷心感谢以下项目的开发者
- Axolotl训练框架
- 通义千问/Qwen/阿里巴巴基础模型
- Prime Intellect算力支持
- 以及我的银行资金支持
## 其他信息
[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
<details><summary>查看Axolotl配置</summary>
axolotl版本: `0.10.0.dev0`
```yaml
# === 模型配置 ===
base_model: Qwen/Qwen3-8B-Base
load_in_8bit: false
load_in_4bit: false
# === 训练设置 ===
num_epochs: 2
micro_batch_size: 32
gradient_accumulation_steps: 1
sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true
# === 超参数配置 ===
optimizer: apollo_adamw_layerwise
# Apollo-mini配置:
optim_args: "proj=random,rank=1,scale=128.0,scale_type=tensor,update_proj_gap=200"
# 标准Apollo配置:
# optim_args:
optim_target_modules: all_linear
learning_rate: 2e-5
lr_scheduler: rex
weight_decay: 0.01
warmup_ratio: 0
# === 数据配置 ===
datasets:
- path: allura-org/inkmix-v3.0
type: chat_template
split: train
field_messages: conversations
message_field_role: from
message_field_content: value
dataset_prepared_path: last_run_prepared
chat_template: chatml
# === 插件 ===
plugins:
- axolotl.integrations.liger.LigerPlugin
- axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
# === 硬件优化 ===
gradient_checkpointing: unsloth
gradient_checkpointing_kwargs:
use_reentrant: false
liger_rope: true
liger_rms_norm: true
liger_glu_activation: true
cut_cross_entropy: true
# === Wandb追踪 ===
wandb_project: qwen3-8b-inkmix-v3
# === 检查点 ===
saves_per_epoch: 2
save_total_limit: 3
# === 高级设置 ===
output_dir: /ephemeral/ckpts
bf16: auto
flash_attention: true
train_on_inputs: false
group_by_length: false
logging_steps: 1
trust_remote_code: true
```
</details>