Qwen1.5-MOE-sft-gsm/train_results.json

{
    "total_flos": 3.6134539308407194e+17,
    "train_loss": 0.15647654232178998,
    "train_runtime": 1906.4028,
    "train_samples": 7473,
    "train_samples_per_second": 7.84,
    "train_steps_per_second": 0.245
}
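
The throughput fields above can be cross-checked against each other. The following is a minimal sketch, not part of the original file: it assumes `train_runtime` is in seconds and that the two rate fields are averages over that whole runtime, which the file itself does not state. The helper name `implied_totals` is hypothetical, and because the rates are rounded, the derived totals are only approximate.

```python
import json

# The metrics exactly as they appear in train_results.json above.
RESULTS_JSON = """
{
    "total_flos": 3.6134539308407194e+17,
    "train_loss": 0.15647654232178998,
    "train_runtime": 1906.4028,
    "train_samples": 7473,
    "train_samples_per_second": 7.84,
    "train_steps_per_second": 0.245
}
"""


def implied_totals(results):
    """Derive approximate totals from the reported rates (hypothetical helper).

    Assumes train_runtime is in seconds and both rates are averaged
    over the full runtime.
    """
    runtime = results["train_runtime"]
    samples_processed = results["train_samples_per_second"] * runtime
    steps = results["train_steps_per_second"] * runtime
    return {
        # Total samples seen during training, counting repeats.
        "samples_processed": samples_processed,
        # How many times the dataset was traversed, if every sample was used.
        "passes_over_data": samples_processed / results["train_samples"],
        # Approximate number of training steps.
        "steps": steps,
        # Samples consumed per step.
        "samples_per_step": samples_processed / steps,
    }


if __name__ == "__main__":
    totals = implied_totals(json.loads(RESULTS_JSON))
    for name, value in totals.items():
        print(f"{name}: {value:.2f}")
```

Under those assumptions, the reported numbers are consistent with roughly 2 passes over the 7473 samples, about 467 steps, and 32 samples per step. Reading these as epoch count and effective batch size is an inference from the arithmetic, not something the file records.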