---
base_model: Qwen/Qwen3-4B-Instruct-2507
datasets:
- u-10bei/structured_data_with_cot_dataset_512_v2
language:
- en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- structured-output
- merged
---
# qwen3-4b-structured-sft-lora-v07-merged
Fully merged model (base + LoRA) fine-tuned from **Qwen/Qwen3-4B-Instruct-2507**.
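## v07 Changes
- **SFT LR**: 2e-6 → **2e-5** (10× the v03 value; this is the value the guide's author reports having used successfully)
- Everything else is identical to v03: same dataset, MASK_COT=1, LoRA r=64, Epochs=2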
## Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Method: QLoRA (4-bit) → merged
- LR: 2e-05 / Epochs: 2 / LoRA: r=64, alpha=128
- Dataset: u-10bei/structured_data_with_cot_dataset_512_v2 (3,933 examples)
- MASK_COT: 1 (CoT tokens are kept in the training text but masked out of the loss)
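
The `MASK_COT: 1` setting above can be illustrated with a minimal sketch of label construction. This is not the actual training script: the function name `build_labels`, the three-way split into prompt, CoT, and answer token IDs, and the choice to mask the prompt as well are all assumptions made for illustration. The only fixed convention used is that a label of `-100` is skipped by PyTorch's cross-entropy loss by default, which is also what `transformers` relies on.

```python
# Hypothetical sketch of CoT loss masking; not the training code used for this model.
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_labels(prompt_ids, cot_ids, answer_ids, mask_cot=True):
    """Return (input_ids, labels) for a single training example.

    The CoT tokens always stay in input_ids, so the model still sees them
    as context. With mask_cot=True their labels are set to IGNORE_INDEX,
    so they contribute nothing to the loss.
    """
    input_ids = list(prompt_ids) + list(cot_ids) + list(answer_ids)

    # Assumption: prompt tokens are never trained on.
    labels = [IGNORE_INDEX] * len(prompt_ids)

    if mask_cot:
        labels += [IGNORE_INDEX] * len(cot_ids)
    else:
        labels += list(cot_ids)

    # Only the final structured answer is always supervised.
    labels += list(answer_ids)
    return input_ids, labels


# Toy token IDs: 2 prompt tokens, 3 CoT tokens, 2 answer tokens.
input_ids, labels = build_labels([1, 2], [3, 4, 5], [6, 7])
print(input_ids)  # [1, 2, 3, 4, 5, 6, 7]
print(labels)     # [-100, -100, -100, -100, -100, 6, 7]
```

With this layout the gradient comes only from the structured answer tokens, while the reasoning text still conditions the prediction of those tokens.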