---
base_model: Qwen/Qwen3-4B-Instruct-2507
datasets:
- u-10bei/structured_data_with_cot_dataset_512_v2
language:
- en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- structured-output
- merged
---

# qwen3-4b-structured-sft-lora-v07-merged

Fully merged model (base + LoRA) fine-tuned from **Qwen/Qwen3-4B-Instruct-2507**.

## v07 Changes

- **SFT LR**: 2e-6 → **2e-5** (10× the v03 value; this is the value the guide author reports having used successfully)
- Everything else is identical to v03 (dataset, MASK_COT=1, LoRA r=64, Epoch=2)

## Training Configuration

- Base model: Qwen/Qwen3-4B-Instruct-2507
- Method: QLoRA (4-bit) → merged
- LR: 2e-05 / Epochs: 2 / LoRA: r=64, alpha=128
- Dataset: u-10bei/structured_data_with_cot_dataset_512_v2 (3,933 examples)
- MASK_COT: 1 (CoT is kept in the training sequence but masked out of the loss)
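
The `MASK_COT: 1` setting keeps the chain-of-thought tokens in the training sequence while excluding them from the loss. The actual training script is not included in this card, so the following is only a minimal sketch of how such masking is commonly done: labels for the masked positions are set to `-100`, the index that PyTorch's cross-entropy loss ignores by default. The function name `mask_labels`, its parameters, the toy token ids, and the choice to also mask the prompt are all assumptions made for illustration.

```python
# Illustrative sketch only; not the training code used for this model.
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def mask_labels(input_ids, prompt_len, cot_span):
    """Build labels in which prompt and CoT tokens do not contribute to the loss.

    input_ids:  token ids for prompt + CoT + final answer (hypothetical layout)
    prompt_len: number of leading prompt tokens (assumed to be masked as well)
    cot_span:   (start, end) half-open index range covering the CoT tokens
    """
    cot_start, cot_end = cot_span
    labels = list(input_ids)  # copy so the model input itself is left unchanged
    for i in range(len(labels)):
        if i < prompt_len or cot_start <= i < cot_end:
            labels[i] = IGNORE_INDEX
    return labels


# Toy example: 3 prompt tokens, 2 CoT tokens, 3 answer tokens.
input_ids = [11, 12, 13, 21, 22, 31, 32, 33]
labels = mask_labels(input_ids, prompt_len=3, cot_span=(3, 5))
print(labels)  # [-100, -100, -100, -100, -100, 31, 32, 33]
```

With this layout the model still sees the CoT tokens as context, but only the final structured answer is supervised.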