Model: deepkick/qwen3-4b-structured-sft-lora-v07-merged Source: Original Platform
| base_model | datasets | language | license | library_name | pipeline_tag | tags |
|---|---|---|---|---|---|---|
| Qwen/Qwen3-4B-Instruct-2507 | | | apache-2.0 | transformers | text-generation | |
qwen3-4b-structured-sft-lora-v07-merged
Fully merged model (base + LoRA) fine-tuned from Qwen/Qwen3-4B-Instruct-2507.
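To make "merged" concrete, here is a minimal pure-Python sketch of folding a LoRA update into a base weight, assuming the standard LoRA scaling of alpha / r (128 / 64 = 2 for this model). The matrices are toy values and `merge_lora` is a hypothetical helper, not the script used to build this checkpoint.

```python
# Toy illustration of merging a LoRA update into a base weight matrix.
# Assumption: standard LoRA scaling, W_merged = W + (alpha / r) * (B @ A).
# All matrix values below are made up for illustration.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(w, lora_b, lora_a, scale):
    """Return a new matrix W + scale * (B @ A); inputs are left unchanged."""
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]

scale = 128 / 64           # alpha / r from the training configuration
w = [[1.0, 0.0],
     [0.0, 1.0]]           # toy base weight (2 x 2)
lora_b = [[1.0], [2.0]]    # toy LoRA B (2 x 1, rank 1 for readability)
lora_a = [[0.5, 0.25]]     # toy LoRA A (1 x 2)

merged = merge_lora(w, lora_b, lora_a, scale)
print(merged)
```

After merging, the adapter matrices are no longer needed at inference time, which is why a merged checkpoint loads like an ordinary model.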
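v07 changes
- SFT LR: 2e-6 → 2e-5 (10x the v03 value; this is the value the guide author reports having used successfully)
- Everything else is identical to v03 (dataset, MASK_COT=1, LoRA r=64, Epoch=2)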
Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Method: QLoRA (4-bit) → merged
- LR: 2e-05 / Epochs: 2 / LoRA: r=64, alpha=128
- Dataset: u-10bei/structured_data_with_cot_dataset_512_v2 (3,933 examples)
- MASK_COT: 1 (CoT is kept in the training text but masked out of the loss)
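The MASK_COT setting can be pictured with the following sketch: the CoT tokens stay in the input sequence, but their labels are set to an ignore value so they contribute nothing to the loss. This is only an illustration under assumptions. The function name `mask_cot_labels`, the token IDs, and the span positions are hypothetical, and the card does not say how the CoT span is located or whether prompt tokens are masked as well.

```python
# Sketch of loss masking for a chain-of-thought (CoT) span.
# -100 is the default ignore_index of PyTorch's CrossEntropyLoss, so
# positions labelled -100 are skipped when the loss is computed.
IGNORE_INDEX = -100

def mask_cot_labels(input_ids, cot_start, cot_end):
    """Copy input_ids into labels and mask positions in [cot_start, cot_end)."""
    labels = list(input_ids)
    for i in range(max(cot_start, 0), min(cot_end, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels

# Hypothetical token IDs: 3 prompt tokens, 2 CoT tokens, 3 answer tokens.
input_ids = [11, 12, 13, 21, 22, 31, 32, 33]
labels = mask_cot_labels(input_ids, cot_start=3, cot_end=5)
print(labels)
```

The model still sees the CoT tokens as context when predicting the answer tokens; only the gradient signal from the CoT positions is removed.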