Initialize project; model provided by the ModelHub XC community
Model: Raymond-dev-546730/MaterialsAnalyst-AI-7B Source: Original Platform
Training/Training_Documentation.txt
@@ -0,0 +1,63 @@
MaterialsAnalyst-AI-7B Training Documentation
================================================

Model Training Details
---------------------

Base Model: Qwen 2.5 Instruct 7B
Fine-tuning Method: LoRA (Low-Rank Adaptation)
Training Infrastructure: Single NVIDIA A100 SXM4 GPU
Training Duration: Approximately 5.4 hours
Training Dataset: Custom curated dataset for materials analysis

Dataset Specifications
---------------------

Total Token Count: 6,292,692
Total Sample Count: 6,000
Average Tokens/Sample: 1048.78
Max Token Count: 1,289
Min Token Count: 922
Tokens Counted Using: tiktoken (cl100k_base encoding)
Dataset Creation: Generated using DeepSeekV3 API
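
The statistics above could be reproduced with a short script along these lines. This is a minimal sketch, not the author's actual tooling: the helper names `count_tokens` and `summarize` are hypothetical, and it assumes each sample is available as a plain string. Only the `tiktoken` encoding name comes from the documentation.

```python
from statistics import mean


def count_tokens(texts):
    """Return one token count per text using tiktoken's cl100k_base encoding."""
    # Third-party dependency, imported lazily so summarize() works without it.
    import tiktoken

    encoding = tiktoken.get_encoding("cl100k_base")
    return [len(encoding.encode(text)) for text in texts]


def summarize(counts):
    """Compute the same kind of summary reported in Dataset Specifications."""
    return {
        "total": sum(counts),
        "samples": len(counts),
        "average": round(mean(counts), 2),
        "max": max(counts),
        "min": min(counts),
    }


# Cross-check of the reported average from the reported totals.
reported_average = round(6_292_692 / 6_000, 2)
print(reported_average)  # 1048.78
```

Note that cl100k_base is not the tokenizer of the base model, so the token counts seen during training may differ from the figures listed here.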

Training Configuration
---------------------

LoRA Parameters:
- Rank: 32
- Alpha: 64
- Dropout: 0.1
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj, lm_head
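
To make the LoRA parameters above concrete, the sketch below records them in a small dataclass and derives two quantities from them. It assumes the standard LoRA formulation, in which the low-rank update is scaled by alpha / rank and each adapted linear layer gains rank * (d_in + d_out) trainable weights. The class and function names are hypothetical, and the 4096 x 4096 layer shape is purely illustrative, not the shape of any layer in the base model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """LoRA parameters as listed in the documentation (names are hypothetical)."""
    rank: int = 32
    alpha: int = 64
    dropout: float = 0.1
    target_modules: tuple = (
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj", "lm_head",
    )

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update B @ A by alpha / rank.
        return self.alpha / self.rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights added to one linear layer: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


settings = LoraSettings()
print(settings.scaling)  # 2.0
# Illustrative layer shape only; not taken from the base model.
print(lora_param_count(4096, 4096, settings.rank))  # 262144
```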

Training Hyperparameters:
- Learning Rate: 5e-5
- Batch Size: 4
- Gradient Accumulation: 5
- Effective Batch Size: 20
- Max Sequence Length: 2048
- Epochs: 3
- Warmup Ratio: 0.01
- Weight Decay: 0.01
- Max Grad Norm: 1.0
- LR Scheduler: Cosine
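
The hyperparameters above imply an effective batch size and a learning-rate curve that can be sketched as follows. This is an illustration under stated assumptions rather than the training script itself: it assumes linear warmup followed by a single half-cosine decay to zero, rounds the warmup step count up, and takes the total of 855 optimizer steps from the Training Performance section. The function name `lr_at` is hypothetical.

```python
import math

LEARNING_RATE = 5e-5
BATCH_SIZE = 4
GRAD_ACCUM = 5
WARMUP_RATIO = 0.01
TOTAL_STEPS = 855  # reported under Training Performance

# On a single GPU, one optimizer step consumes batch size x accumulation steps samples.
effective_batch = BATCH_SIZE * GRAD_ACCUM

# Assumption: the warmup step count is rounded up.
warmup_steps = math.ceil(WARMUP_RATIO * TOTAL_STEPS)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, TOTAL_STEPS - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch)  # 20
print(warmup_steps)     # 9
```

Under these assumptions the rate peaks at 5e-5 once warmup ends and falls to zero at the final step.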

Hardware & Environment
---------------------

GPU: NVIDIA A100 SXM4 (40GB)
Operating System: Ubuntu
CUDA Version: 11.8
PyTorch Version: 2.7.0
Compute Capability: 8.0
Optimization: FP16, Gradient Checkpointing

Training Performance
---------------------

Training Runtime: 5.37 hours (19,348 seconds)
Train Samples/Second: 0.884
Train Steps/Second: 0.044
Training Loss (Final): 0.170
Validation Loss (Final): 0.136
Total Training Steps: 855
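
The throughput figures above can be cross-checked against the step count and the effective batch size of 20. The sketch below is a consistency check, not output from the training run. The 5,700 samples per epoch it derives is an inference: it assumes every optimizer step used a full effective batch. That figure would be consistent with roughly 300 of the 6,000 samples being held out for validation, but the documentation does not state the split.

```python
RUNTIME_SECONDS = 19_348
TOTAL_STEPS = 855
EPOCHS = 3
EFFECTIVE_BATCH = 20

hours = RUNTIME_SECONDS / 3600
steps_per_second = TOTAL_STEPS / RUNTIME_SECONDS

# Assumption: every optimizer step consumed a full effective batch.
steps_per_epoch = TOTAL_STEPS // EPOCHS
samples_per_epoch = steps_per_epoch * EFFECTIVE_BATCH
samples_per_second = samples_per_epoch * EPOCHS / RUNTIME_SECONDS

print(round(hours, 2))               # 5.37
print(round(steps_per_second, 3))    # 0.044
print(samples_per_epoch)             # 5700
print(round(samples_per_second, 3))  # 0.884
```

All three derived rates match the reported values to the precision given.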