Files
qwen3-4b-structured-sft-lor…/README.md
ModelHub XC 74e68693d4 初始化项目,由ModelHub XC社区提供模型
Model: deepkick/qwen3-4b-structured-sft-lora-v07-merged
Source: Original Platform
2026-05-18 02:08:32 +08:00

29 lines
804 B
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
base_model: Qwen/Qwen3-4B-Instruct-2507
datasets:
- u-10bei/structured_data_with_cot_dataset_512_v2
language:
- en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- structured-output
- merged
---
# qwen3-4b-structured-sft-lora-v07-merged
Fully merged model (base + LoRA) fine-tuned from **Qwen/Qwen3-4B-Instruct-2507**.
## v07 変更点
- **SFT LR**: 2e-6 → **2e-5**v03比10倍。ガイド著者実績値
- 他はv03と同一データセット・MASK_COT=1・LoRA r=64・Epoch=2
## Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Method: QLoRA (4-bit) → merged
- LR: 2e-05 / Epochs: 2 / LoRA: r=64, alpha=128
- Dataset: u-10bei/structured_data_with_cot_dataset_512_v23933件
- MASK_COT: 1CoT保持・lossマスク