Model Card for Model ID
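This model was fine-tuned from llama-2-13b on the huangyt/FINETUNE4 dataset, which contains about 38,000 examples in total (the "3.8w" in the model names).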
Fine-Tuning Information
- GPU: RTX4090 (single GPU / 24564 MiB)
- model: meta-llama/Llama-2-13b-hf
- dataset: huangyt/FINETUNE4 (about 38,000 training examples)
- peft_type: LoRA
- lora_rank: 16
- lora_target: q_proj, k_proj, v_proj, o_proj
- per_device_train_batch_size: 8
- gradient_accumulation_steps: 8
- learning_rate: 4e-4
- epoch: 1
- precision: bf16
- quantization: load_in_4bit
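As a rough illustration of what the settings above imply, the sketch below computes the effective batch size and an approximate optimizer step count, and shows one possible mapping onto `peft.LoraConfig` and `transformers.BitsAndBytesConfig`. The mapping is an assumption, not the author's training script: `build_configs`, `NUM_GPUS`, `NUM_EXAMPLES`, the `task_type` value, and the bf16 compute dtype are illustrative choices, and every option not listed in the card is left at the library default.

```python
import math

# Values copied from the Fine-Tuning Information list.
HPARAMS = {
    "model": "meta-llama/Llama-2-13b-hf",
    "lora_rank": 16,
    "lora_target": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 8,
    "learning_rate": 4e-4,
    "epoch": 1,
}
NUM_GPUS = 1           # a single RTX4090, as stated in the card
NUM_EXAMPLES = 38_000  # approximate dataset size from the card


def effective_batch_size(hparams, num_gpus):
    """Examples consumed per optimizer step."""
    return (
        hparams["per_device_train_batch_size"]
        * hparams["gradient_accumulation_steps"]
        * num_gpus
    )


def approx_optimizer_steps(num_examples, batch_size, epochs):
    """Approximate step count; the real trainer may round differently."""
    return math.ceil(num_examples / batch_size) * epochs


def build_configs(hparams):
    """Assumed mapping onto peft/transformers (needs torch, peft, transformers)."""
    # Imported lazily so the arithmetic above runs without these libraries.
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    lora = LoraConfig(
        r=hparams["lora_rank"],
        target_modules=hparams["lora_target"],
        task_type="CAUSAL_LM",  # assumption: causal language modeling
    )
    quant = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumption drawn from "precision: bf16"
    )
    return lora, quant


batch = effective_batch_size(HPARAMS, NUM_GPUS)
steps = approx_optimizer_steps(NUM_EXAMPLES, batch, HPARAMS["epoch"])
print(f"effective batch size: {batch}, approx. optimizer steps: {steps}")
```

With 8 examples per device and 8 accumulation steps on one GPU, each optimizer step sees 64 examples, so one epoch over roughly 38,000 examples takes on the order of 600 steps.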
Fine-Tuning Detail
- train_loss: 0.579
- train_runtime: 4:06:11 (using DeepSpeed)
Evaluation
- Compared with Llama-2-13b on four benchmarks: ARC, HellaSwag, MMLU, and TruthfulQA
- The scores below were measured locally, with the models loaded using load_in_8bit

| Model | Average | ARC | HellaSwag | MMLU | TruthfulQA |
|---|---|---|---|---|---|
| FINETUNE4_3.8w-r4-q_k_v_o | 56.67 | 52.13 | 79.38 | 54.54 | 40.64 |
| FINETUNE4_3.8w-r8-q_k_v_o | 56.84 | 52.30 | 79.58 | 54.50 | 40.98 |
| FINETUNE4_3.8w-r16-q_k_v_o | 57.28 | 53.92 | 79.92 | 55.61 | 39.65 |
| FINETUNE4_3.8w-r4-gate_up_down | 55.93 | 51.71 | 79.13 | 53.24 | 39.63 |
| FINETUNE4_3.8w-r8-gate_up_down | 55.93 | 51.37 | 79.29 | 53.62 | 39.45 |
| FINETUNE4_3.8w-r16-gate_up_down | 56.35 | 52.56 | 79.28 | 55.27 | 38.31 |
| FINETUNE4_3.8w-r4-q_k_v_o_gate_up_down | 56.42 | 53.92 | 79.09 | 53.93 | 38.74 |
| FINETUNE4_3.8w-r8-q_k_v_o_gate_up_down | 56.11 | 51.02 | 79.24 | 53.11 | 41.08 |
| FINETUNE4_3.8w-r16-q_k_v_o_gate_up_down | 56.83 | 53.67 | 79.49 | 54.79 | 39.36 |

- The scores below are taken from HuggingFaceH4/open_llm_leaderboard

| Model | Average | ARC | HellaSwag | MMLU | TruthfulQA |
|---|---|---|---|---|---|
| FINETUNE4_3.8w-r4-q_k_v_o | 57.98 | 54.78 | 81.4 | 54.73 | 41.02 |
| FINETUNE4_3.8w-r8-q_k_v_o | 58.96 | 57.68 | 81.91 | 54.95 | 41.31 |
| FINETUNE4_3.8w-r16-q_k_v_o | 58.46 | 56.23 | 81.98 | 55.87 | 39.76 |
| FINETUNE4_3.8w-r4-gate_up_down | 57.94 | 55.8 | 81.74 | 55.09 | 39.12 |
| FINETUNE4_3.8w-r8-gate_up_down | 57.85 | 54.35 | 82.13 | 55.33 | 39.6 |
| FINETUNE4_3.8w-r16-gate_up_down | 57.93 | 55.03 | 81.97 | 56.64 | 38.07 |
| FINETUNE4_3.8w-r4-q_k_v_o_gate_up_down | 58.04 | 56.31 | 81.43 | 55.3 | 39.11 |
| FINETUNE4_3.8w-r8-q_k_v_o_gate_up_down | 58.16 | 55.97 | 81.53 | 54.42 | 40.72 |
| FINETUNE4_3.8w-r16-q_k_v_o_gate_up_down | 58.61 | 57.25 | 81.49 | 55.9 | 39.79 |

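
The Average column in both evaluation tables appears to be the plain arithmetic mean of the four benchmark scores. The card does not state this explicitly, so the check below is only a sanity test on the first row of each table. The helper name `average_matches` and the 0.01 tolerance are illustrative choices.

```python
from statistics import mean

# (ARC, HellaSwag, MMLU, TruthfulQA) scores and the reported Average,
# copied from the first row of each evaluation table.
rows = {
    "local / FINETUNE4_3.8w-r4-q_k_v_o": ((52.13, 79.38, 54.54, 40.64), 56.67),
    "leaderboard / FINETUNE4_3.8w-r4-q_k_v_o": ((54.78, 81.4, 54.73, 41.02), 57.98),
}


def average_matches(scores, reported, tol=0.01):
    """True if the mean of the scores agrees with the reported average."""
    return abs(mean(scores) - reported) < tol


for name, (scores, reported) in rows.items():
    print(name, round(mean(scores), 2), reported, average_matches(scores, reported))
```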
How to convert the dataset to JSON
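- Pass the dataset name to load_dataset, and pass the number of leading examples you want to take
- Inspect the dataset's column names and fill them into the example fields (for example system_prompt, question, response)
- Finally, specify where to save the JSON file (json_filename)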
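
The steps above can be sketched roughly as follows. This is a minimal sketch rather than the author's exact script: the helper names `convert_to_json` and `load_examples`, the `train` split, and the use of streaming mode are assumptions, and the field names are only the examples mentioned above, so they should be checked against the actual dataset columns.

```python
import json
from itertools import islice


def convert_to_json(examples, fields, json_filename, n):
    """Write the first n examples, restricted to the given fields, to a JSON file."""
    extracted = [
        {field: example[field] for field in fields}
        for example in islice(examples, n)
    ]
    with open(json_filename, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-ASCII text readable in the output file
        json.dump(extracted, f, ensure_ascii=False, indent=4)
    return extracted


def load_examples(dataset_name, n, split="train"):
    """Stream the first n examples of a Hugging Face dataset (split is an assumption)."""
    # Imported lazily so convert_to_json can be used without the datasets library.
    from datasets import load_dataset

    # streaming=True returns an iterable dataset, which provides take(n)
    dataset = load_dataset(dataset_name, split=split, streaming=True)
    return dataset.take(n)
```

For example, `convert_to_json(load_examples("huangyt/FINETUNE4", 38000), ["system_prompt", "question", "response"], "FINETUNE4.json", 38000)` would write the first 38,000 examples to `FINETUNE4.json`, assuming the dataset exposes those three columns.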