Initialize project; model provided by the ModelHub XC community
Model: Raymond-dev-546730/MaterialsAnalyst-AI-7B Source: Original Platform
63
Training/Training_Documentation.txt
Normal file
@@ -0,0 +1,63 @@
MaterialsAnalyst-AI-7B Training Documentation
================================================

Model Training Details
---------------------

Base Model: Qwen 2.5 Instruct 7B
Fine-tuning Method: LoRA (Low-Rank Adaptation)
Training Infrastructure: Single NVIDIA A100 SXM4 GPU
Training Duration: Approximately 5.4 hours
Training Dataset: Custom curated dataset for materials analysis

Dataset Specifications
---------------------

Total Token Count: 6,292,692
Total Sample Count: 6,000 (5,700 training, 300 validation)
Average Tokens/Sample: 1,048.78
Max Token Count: 1,289
Min Token Count: 922
Tokens Counted Using: tiktoken (cl100k_base encoding)
Dataset Creation: Generated using DeepSeekV3 API
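The dataset statistics above could be gathered with a short script. The following is a minimal sketch, not the author's actual tooling: it assumes each line of the JSONL file is an object with a hypothetical `text` field (the real schema is not documented here), and the helper names are illustrative. Token counting uses tiktoken's `cl100k_base` encoding, as stated above.

```python
import json
from statistics import mean


def summarize_token_counts(counts):
    """Return total, sample count, average, max and min for per-sample token counts."""
    return {
        "total_tokens": sum(counts),
        "samples": len(counts),
        "avg_tokens": round(mean(counts), 2),
        "max_tokens": max(counts),
        "min_tokens": min(counts),
    }


def count_tokens_in_jsonl(path, field="text"):
    """Count tokens per sample with tiktoken (cl100k_base).

    Assumption: every non-empty line is a JSON object whose `field` holds the
    sample text. Adjust `field` to the real dataset schema.
    """
    import tiktoken  # third-party; imported lazily so the summary helper works without it

    enc = tiktoken.get_encoding("cl100k_base")
    with open(path, encoding="utf-8") as fh:
        return [len(enc.encode(json.loads(line)[field])) for line in fh if line.strip()]


# Consistency check on the documented totals: 6,292,692 tokens over 6,000 samples.
documented_average = round(6_292_692 / 6_000, 2)
```

Calling `summarize_token_counts(count_tokens_in_jsonl("./Dataset.jsonl"))` would produce the same kind of summary as the list above, provided the field name matches the dataset.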

Training Configuration
---------------------

LoRA Parameters:
- Rank: 32
- Alpha: 64
- Dropout: 0.1
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj, lm_head
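As a sanity check on these LoRA settings, the trainable parameter count can be worked out by hand: a rank-r adapter on a linear layer with d_in inputs and d_out outputs adds r * (d_in + d_out) weights. The sketch below assumes the publicly listed Qwen2.5-7B layer shapes (hidden size 3584, MLP size 18944, 28 layers, 512-wide key/value projections, vocabulary 152064), which are not stated in this document. Under those assumptions the total matches the "Trainable parameters" line in the training log.

```python
RANK = 32

# Assumed Qwen2.5-7B shapes (from the published model config, not from this document).
HIDDEN = 3584
KV_DIM = 512      # 4 key/value heads x 128 head dimension
MLP = 18944
LAYERS = 28
VOCAB = 152064

# (in_features, out_features) of each targeted projection inside one decoder layer.
PER_LAYER_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}


def lora_params(d_in, d_out, rank=RANK):
    """Weights added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


per_layer = sum(lora_params(*shape) for shape in PER_LAYER_SHAPES.values())
lm_head = lora_params(HIDDEN, VOCAB)
trainable = per_layer * LAYERS + lm_head

# Share of the total parameter count reported in the log (7,701,337,600).
share_percent = round(100 * trainable / 7_701_337_600, 2)
```

With these shapes, `trainable` evaluates to 85,721,088 and `share_percent` to 1.11, in line with the log.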

Training Hyperparameters:
- Learning Rate: 5e-5
- Batch Size: 4
- Gradient Accumulation: 5
- Effective Batch Size: 20
- Max Sequence Length: 2048
- Epochs: 3
- Warmup Ratio: 0.01
- Weight Decay: 0.01
- Max Grad Norm: 1.0
- LR Scheduler: Cosine
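The derived values in this list follow from simple arithmetic, and the learning rates printed in the training log can be reproduced from the schedule. The sketch below assumes 5,700 training samples (the split reported in the log), a linear warmup of ceil(0.01 * 855) = 9 steps, and a standard cosine decay to zero. It also assumes that the value logged at optimizer step N corresponds to schedule index N - 1. These are inferences that happen to match the logged numbers, not details confirmed by the author.

```python
import math

PEAK_LR = 5e-5
BATCH_SIZE = 4
GRAD_ACCUM = 5
EPOCHS = 3
WARMUP_RATIO = 0.01
TRAIN_SAMPLES = 5700  # from the log: "Training on 5700 samples"

effective_batch = BATCH_SIZE * GRAD_ACCUM            # 4 * 5
steps_per_epoch = TRAIN_SAMPLES // effective_batch   # optimizer steps per epoch
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def cosine_lr(index):
    """Learning rate at a schedule index: linear warmup, then cosine decay to zero."""
    if index < warmup_steps:
        return PEAK_LR * index / warmup_steps
    progress = (index - warmup_steps) / (total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# Log entries appear every 10 optimizer steps; under the assumption above,
# the entry at step 20 corresponds to schedule index 19.
lr_at_second_log_entry = cosine_lr(19)
```

Under these assumptions `total_steps` is 855, and `cosine_lr(19)` agrees with the second logged learning rate (4.998276468898823e-05) to within floating-point rounding.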

Hardware & Environment
---------------------

GPU: NVIDIA A100 SXM4 (40GB)
Operating System: Ubuntu
CUDA Version: 11.8
PyTorch Version: 2.7.0
Compute Capability: 8.0
Optimization: FP16, Gradient Checkpointing
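The optimization settings listed here, together with the hyperparameters from the Training Configuration section, map onto fields of Hugging Face `transformers.TrainingArguments`. The training script itself is not part of this commit, so the following is only a sketch of how such a configuration might look. The `output_dir` value is a placeholder, and `logging_steps=10` is inferred from the spacing of the log entries rather than documented. The settings are collected in a plain dictionary so they can be inspected without installing anything.

```python
# Keyword arguments intended for transformers.TrainingArguments(**training_kwargs).
# Every key is an existing TrainingArguments field; the values come from this
# document unless a comment says otherwise.
training_kwargs = {
    "output_dir": "./outputs",            # placeholder, not from the source
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 5,
    "num_train_epochs": 3,
    "warmup_ratio": 0.01,
    "weight_decay": 0.01,
    "max_grad_norm": 1.0,
    "lr_scheduler_type": "cosine",
    "fp16": True,                         # "Optimization: FP16"
    "gradient_checkpointing": True,       # "Optimization: Gradient Checkpointing"
    "logging_steps": 10,                  # inferred from the log cadence
}

# The maximum sequence length (2048) is not a TrainingArguments field; it would
# be applied when tokenizing the samples.
MAX_SEQUENCE_LENGTH = 2048

effective_batch = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
```

On a single GPU, as used here, the per-device batch size times the accumulation steps gives the effective batch size of 20.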

Training Performance
---------------------

Training Runtime: 5.37 hours (19,348 seconds)
Train Samples/Second: 0.884
Train Steps/Second: 0.044
Training Loss (Average Over Run): 0.170
Training Loss (Last Logged Step): 0.131
Validation Loss (Final): 0.136
Total Training Steps: 855
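The throughput figures above are consistent with one another, which a few lines of arithmetic can confirm. This sketch assumes 5,700 training samples per epoch (the split reported in the log) and uses the unrounded `train_runtime` value from the final log line.

```python
TRAIN_RUNTIME_S = 19348.838  # 'train_runtime' from the final log line
TOTAL_STEPS = 855
TRAIN_SAMPLES = 5700         # from the log: "Training on 5700 samples"
EPOCHS = 3

runtime_hours = TRAIN_RUNTIME_S / 3600
samples_per_second = TRAIN_SAMPLES * EPOCHS / TRAIN_RUNTIME_S
steps_per_second = TOTAL_STEPS / TRAIN_RUNTIME_S
seconds_per_step = TRAIN_RUNTIME_S / TOTAL_STEPS

print(
    f"{runtime_hours:.2f} h, {samples_per_second:.3f} samples/s, "
    f"{steps_per_second:.3f} steps/s, {seconds_per_step:.2f} s/step"
)
```

The results round to 5.37 hours, 0.884 samples/second and 0.044 steps/second, and the 22.63 seconds per step matches the rate shown on the final progress bar.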
109
Training/Training_Logs.txt
Normal file
@@ -0,0 +1,109 @@
Loading tokenizer...
Loading dataset from ./Dataset.jsonl
Loaded 6000 samples
Training on 5700 samples, validating on 300 samples
Loading model...
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:04<00:00, 1.23s/it]
Trainable parameters: 85,721,088 (1.11% of 7,701,337,600)
No label_names provided for model class `PeftModelForCausalLM`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.
Starting training...
{'loss': 1.0399, 'grad_norm': 0.704595148563385, 'learning_rate': 5e-05, 'epoch': 0.04}
{'loss': 0.5935, 'grad_norm': 0.47508421540260315, 'learning_rate': 4.998276468898823e-05, 'epoch': 0.07}
{'loss': 0.3596, 'grad_norm': 0.313719779253006, 'learning_rate': 4.993108252042854e-05, 'epoch': 0.11}
{'loss': 0.2979, 'grad_norm': 0.31504514813423157, 'learning_rate': 4.9845024754980876e-05, 'epoch': 0.14}
{'loss': 0.2571, 'grad_norm': 0.3241384029388428, 'learning_rate': 4.97247100512334e-05, 'epoch': 0.18}
{'loss': 0.2425, 'grad_norm': 0.31259262561798096, 'learning_rate': 4.9570304302093216e-05, 'epoch': 0.21}
{'loss': 0.2251, 'grad_norm': 0.34634077548980713, 'learning_rate': 4.938202040604898e-05, 'epoch': 0.25}
{'loss': 0.2125, 'grad_norm': 0.33605319261550903, 'learning_rate': 4.916011797362123e-05, 'epoch': 0.28}
{'loss': 0.2091, 'grad_norm': 0.3671645522117615, 'learning_rate': 4.890490296940496e-05, 'epoch': 0.32}
{'loss': 0.1996, 'grad_norm': 0.3266342580318451, 'learning_rate': 4.861672729019797e-05, 'epoch': 0.35}
{'loss': 0.1901, 'grad_norm': 0.37788864970207214, 'learning_rate': 4.829598827979682e-05, 'epoch': 0.39}
{'loss': 0.183, 'grad_norm': 0.3393491208553314, 'learning_rate': 4.794312818112935e-05, 'epoch': 0.42}
{'loss': 0.1823, 'grad_norm': 0.34580835700035095, 'learning_rate': 4.755863352647909e-05, 'epoch': 0.46}
{'loss': 0.183, 'grad_norm': 0.33839499950408936, 'learning_rate': 4.7143034466642464e-05, 'epoch': 0.49}
{'loss': 0.1837, 'grad_norm': 0.3656191825866699, 'learning_rate': 4.669690403994367e-05, 'epoch': 0.53}
{'loss': 0.1711, 'grad_norm': 0.3499873876571655, 'learning_rate': 4.622085738211518e-05, 'epoch': 0.56}
{'loss': 0.1738, 'grad_norm': 0.34751611948013306, 'learning_rate': 4.57155508781333e-05, 'epoch': 0.6}
{'loss': 0.1716, 'grad_norm': 0.3428999185562134, 'learning_rate': 4.518168125717824e-05, 'epoch': 0.63}
{'loss': 0.174, 'grad_norm': 0.37544649839401245, 'learning_rate': 4.4619984631966524e-05, 'epoch': 0.67}
{'loss': 0.1673, 'grad_norm': 0.32618045806884766, 'learning_rate': 4.403123548378055e-05, 'epoch': 0.7}
{'loss': 0.164, 'grad_norm': 0.3500118851661682, 'learning_rate': 4.341624559459447e-05, 'epoch': 0.74}
{'loss': 0.1656, 'grad_norm': 0.35224637389183044, 'learning_rate': 4.2775862927769025e-05, 'epoch': 0.77}
{'loss': 0.1641, 'grad_norm': 0.3303431570529938, 'learning_rate': 4.2110970458858546e-05, 'epoch': 0.81}
{'loss': 0.1628, 'grad_norm': 0.3580450117588043, 'learning_rate': 4.1422484958142326e-05, 'epoch': 0.84}
{'loss': 0.1637, 'grad_norm': 0.3214457035064697, 'learning_rate': 4.071135572655892e-05, 'epoch': 0.88}
{'loss': 0.1626, 'grad_norm': 0.3646449148654938, 'learning_rate': 3.99785632867864e-05, 'epoch': 0.91}
{'loss': 0.1629, 'grad_norm': 0.32008472084999084, 'learning_rate': 3.922511803127329e-05, 'epoch': 0.95}
{'loss': 0.1563, 'grad_norm': 0.33964064717292786, 'learning_rate': 3.845205882908432e-05, 'epoch': 0.98}
{'eval_loss': 0.15612632036209106, 'eval_runtime': 106.8808, 'eval_samples_per_second': 2.807, 'eval_steps_per_second': 0.356, 'epoch': 1.0}
 33%|█████████████████████████████████████████▎ | 285/855 [1:47:02<3:31:58, 22.31s/it]
/venv/main/lib/python3.10/site-packages/peft/utils/save_and_load.py:220: UserWarning: Setting `save_embedding_layers` to `True` as embedding layers found in `target_modules`.
 warnings.warn("Setting `save_embedding_layers` to `True` as embedding layers found in `target_modules`.")
{'loss': 0.1529, 'grad_norm': 0.33749136328697205, 'learning_rate': 3.766045159348191e-05, 'epoch': 1.02}
{'loss': 0.1499, 'grad_norm': 0.36011260747909546, 'learning_rate': 3.685138781221844e-05, 'epoch': 1.05}
{'loss': 0.1473, 'grad_norm': 0.35543763637542725, 'learning_rate': 3.6025983042565795e-05, 'epoch': 1.09}
{'loss': 0.1463, 'grad_norm': 0.34852930903434753, 'learning_rate': 3.51853753731572e-05, 'epoch': 1.12}
{'loss': 0.1479, 'grad_norm': 0.35828718543052673, 'learning_rate': 3.433072385476237e-05, 'epoch': 1.16}
{'loss': 0.1468, 'grad_norm': 0.33792921900749207, 'learning_rate': 3.3463206902159395e-05, 'epoch': 1.19}
{'loss': 0.1489, 'grad_norm': 0.34330445528030396, 'learning_rate': 3.2584020669307146e-05, 'epoch': 1.23}
{'loss': 0.1501, 'grad_norm': 0.33124926686286926, 'learning_rate': 3.169437740005849e-05, 'epoch': 1.26}
{'loss': 0.1434, 'grad_norm': 0.33036795258522034, 'learning_rate': 3.079550375668821e-05, 'epoch': 1.3}
{'loss': 0.1468, 'grad_norm': 0.33003684878349304, 'learning_rate': 2.9888639128540615e-05, 'epoch': 1.33}
{'loss': 0.1455, 'grad_norm': 0.3746543824672699, 'learning_rate': 2.8975033923128642e-05, 'epoch': 1.37}
{'loss': 0.1489, 'grad_norm': 0.3552297353744507, 'learning_rate': 2.8055947842040862e-05, 'epoch': 1.4}
{'loss': 0.1436, 'grad_norm': 0.35433629155158997, 'learning_rate': 2.713264814403362e-05, 'epoch': 1.44}
{'loss': 0.1477, 'grad_norm': 0.36622723937034607, 'learning_rate': 2.6206407897703095e-05, 'epoch': 1.47}
{'loss': 0.1428, 'grad_norm': 0.32289159297943115, 'learning_rate': 2.5278504226146636e-05, 'epoch': 1.51}
{'loss': 0.143, 'grad_norm': 0.3453490734100342, 'learning_rate': 2.4350216546033738e-05, 'epoch': 1.54}
{'loss': 0.1398, 'grad_norm': 0.35188964009284973, 'learning_rate': 2.3422824803514384e-05, 'epoch': 1.58}
{'loss': 0.1415, 'grad_norm': 0.36733925342559814, 'learning_rate': 2.2497607709397543e-05, 'epoch': 1.61}
{'loss': 0.1424, 'grad_norm': 0.34543827176094055, 'learning_rate': 2.1575840976032867e-05, 'epoch': 1.65}
{'loss': 0.1401, 'grad_norm': 0.34187060594558716, 'learning_rate': 2.0658795558326743e-05, 'epoch': 1.68}
{'loss': 0.1366, 'grad_norm': 0.386068731546402, 'learning_rate': 1.974773590131805e-05, 'epoch': 1.72}
{'loss': 0.1404, 'grad_norm': 0.34624332189559937, 'learning_rate': 1.884391819672991e-05, 'epoch': 1.75}
{'loss': 0.1405, 'grad_norm': 0.32291245460510254, 'learning_rate': 1.794858865090123e-05, 'epoch': 1.79}
{'loss': 0.1422, 'grad_norm': 0.3514968454837799, 'learning_rate': 1.7062981766486437e-05, 'epoch': 1.82}
{'loss': 0.1389, 'grad_norm': 0.3431772291660309, 'learning_rate': 1.618831864029251e-05, 'epoch': 1.86}
{'loss': 0.1363, 'grad_norm': 0.3377828896045685, 'learning_rate': 1.5325805279600286e-05, 'epoch': 1.89}
{'loss': 0.1373, 'grad_norm': 0.3381596803665161, 'learning_rate': 1.447663093929163e-05, 'epoch': 1.93}
{'loss': 0.1387, 'grad_norm': 0.3349764049053192, 'learning_rate': 1.3641966482075208e-05, 'epoch': 1.96}
{'loss': 0.1376, 'grad_norm': 0.3536580801010132, 'learning_rate': 1.282296276407189e-05, 'epoch': 2.0}
{'eval_loss': 0.13975860178470612, 'eval_runtime': 106.9279, 'eval_samples_per_second': 2.806, 'eval_steps_per_second': 0.355, 'epoch': 2.0}
 67%|██████████████████████████████████████████████████████████████████████████████████▋ | 570/855 [3:35:09<1:46:05, 22.33s/it]
/venv/main/lib/python3.10/site-packages/peft/utils/save_and_load.py:220: UserWarning: Setting `save_embedding_layers` to `True` as embedding layers found in `target_modules`.
 warnings.warn("Setting `save_embedding_layers` to `True` as embedding layers found in `target_modules`.")
{'loss': 0.1307, 'grad_norm': 0.34761759638786316, 'learning_rate': 1.2020749047985627e-05, 'epoch': 2.04}
{'loss': 0.1275, 'grad_norm': 0.3375226557254791, 'learning_rate': 1.1236431446047985e-05, 'epoch': 2.07}
{'loss': 0.1263, 'grad_norm': 0.382931113243103, 'learning_rate': 1.0471091394883086e-05, 'epoch': 2.11}
{'loss': 0.1299, 'grad_norm': 0.3302248418331146, 'learning_rate': 9.72578416439587e-06, 'epoch': 2.14}
{'loss': 0.1316, 'grad_norm': 0.3511696457862854, 'learning_rate': 9.001537402739656e-06, 'epoch': 2.18}
{'loss': 0.1333, 'grad_norm': 0.3526867926120758, 'learning_rate': 8.29934971936938e-06, 'epoch': 2.21}
{'loss': 0.1284, 'grad_norm': 0.3544050455093384, 'learning_rate': 7.620189308133943e-06, 'epoch': 2.25}
{'loss': 0.1278, 'grad_norm': 0.34632644057273865, 'learning_rate': 6.964992612306526e-06, 'epoch': 2.28}
{'loss': 0.13, 'grad_norm': 0.3530580997467041, 'learning_rate': 6.334663033393229e-06, 'epoch': 2.32}
{'loss': 0.1307, 'grad_norm': 0.4017309546470642, 'learning_rate': 5.730069685500669e-06, 'epoch': 2.35}
{'loss': 0.1356, 'grad_norm': 0.3461137115955353, 'learning_rate': 5.1520461969797565e-06, 'epoch': 2.39}
{'loss': 0.1276, 'grad_norm': 0.34974604845046997, 'learning_rate': 4.60138956099824e-06, 'epoch': 2.42}
{'loss': 0.1284, 'grad_norm': 0.3364820182323456, 'learning_rate': 4.078859036626676e-06, 'epoch': 2.46}
{'loss': 0.1252, 'grad_norm': 0.33579540252685547, 'learning_rate': 3.5851751019531088e-06, 'epoch': 2.49}
{'loss': 0.1257, 'grad_norm': 0.3575705587863922, 'learning_rate': 3.121018460669986e-06, 'epoch': 2.53}
{'loss': 0.1283, 'grad_norm': 0.34064194560050964, 'learning_rate': 2.687029103502972e-06, 'epoch': 2.56}
{'loss': 0.1279, 'grad_norm': 0.337855726480484, 'learning_rate': 2.283805425775784e-06, 'epoch': 2.6}
{'loss': 0.1285, 'grad_norm': 0.35349616408348083, 'learning_rate': 1.9119034023278637e-06, 'epoch': 2.63}
{'loss': 0.1319, 'grad_norm': 0.34553083777427673, 'learning_rate': 1.5718358209224153e-06, 'epoch': 2.67}
{'loss': 0.1283, 'grad_norm': 0.3534978926181793, 'learning_rate': 1.2640715752018778e-06, 'epoch': 2.7}
{'loss': 0.1294, 'grad_norm': 0.3369212746620178, 'learning_rate': 9.890350181657126e-07, 'epoch': 2.74}
{'loss': 0.1242, 'grad_norm': 0.3364659249782562, 'learning_rate': 7.471053770619352e-07, 'epoch': 2.77}
{'loss': 0.128, 'grad_norm': 0.33932891488075256, 'learning_rate': 5.386162304991394e-07, 'epoch': 2.81}
{'loss': 0.1295, 'grad_norm': 0.36040279269218445, 'learning_rate': 3.638550485000031e-07, 'epoch': 2.84}
{'loss': 0.128, 'grad_norm': 0.3814302086830139, 'learning_rate': 2.230627961304993e-07, 'epoch': 2.88}
{'loss': 0.1271, 'grad_norm': 0.3205372095108032, 'learning_rate': 1.1643360125123126e-07, 'epoch': 2.91}
{'loss': 0.131, 'grad_norm': 0.37936437129974365, 'learning_rate': 4.411448684913666e-08, 'epoch': 2.95}
{'loss': 0.1306, 'grad_norm': 0.3545021116733551, 'learning_rate': 6.205168318523802e-09, 'epoch': 2.98}
{'eval_loss': 0.1355518102645874, 'eval_runtime': 105.9093, 'eval_samples_per_second': 2.833, 'eval_steps_per_second': 0.359, 'epoch': 3.0}
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 855/855 [5:22:24<00:00, 22.17s/it]
/venv/main/lib/python3.10/site-packages/peft/utils/save_and_load.py:220: UserWarning: Setting `save_embedding_layers` to `True` as embedding layers found in `target_modules`.
 warnings.warn("Setting `save_embedding_layers` to `True` as embedding layers found in `target_modules`.")
{'train_runtime': 19348.838, 'train_samples_per_second': 0.884, 'train_steps_per_second': 0.044, 'train_loss': 0.17026110399536223, 'epoch': 3.0}
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 855/855 [5:22:29<00:00, 22.63s/it]
Saving model...
/venv/main/lib/python3.10/site-packages/peft/utils/save_and_load.py:220: UserWarning: Setting `save_embedding_layers` to `True` as embedding layers found in `target_modules`.
warnings.warn("Setting `save_embedding_layers` to `True` as embedding layers found in `target_modules`.")
Model saved to ./MaterialsAnalyst-AI-7B_LoRA_adapter
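
The metric lines in this log are printed as Python dictionary literals, so they can be read back without a custom parser. The following is a minimal sketch under that assumption; the function name is illustrative, and the two sample lines are copied from the log above.

```python
import ast


def parse_trainer_log(lines):
    """Split log lines into training-step records and evaluation records.

    Lines that are not dictionary literals (progress bars, warnings, status
    messages) are skipped, as is the final summary line, which carries
    'train_loss' rather than 'loss' or 'eval_loss'.
    """
    train, evals = [], []
    for line in lines:
        line = line.strip()
        if not (line.startswith("{") and line.endswith("}")):
            continue
        record = ast.literal_eval(line)
        if "eval_loss" in record:
            evals.append(record)
        elif "loss" in record:
            train.append(record)
    return train, evals


sample = [
    "Starting training...",
    "{'loss': 1.0399, 'grad_norm': 0.704595148563385, 'learning_rate': 5e-05, 'epoch': 0.04}",
    "{'eval_loss': 0.15612632036209106, 'eval_runtime': 106.8808, "
    "'eval_samples_per_second': 2.807, 'eval_steps_per_second': 0.356, 'epoch': 1.0}",
]
train_records, eval_records = parse_trainer_log(sample)
```

Applied to the full log, the evaluation records would show the validation loss falling from about 0.156 after the first epoch to 0.140 and then 0.136.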