QWEN3-4B-Base-stage2/train_results.json

{
    "epoch": 1.0,
    "total_flos": 1.6013083311596544e+16,
    "train_loss": 1.9567182964748806,
    "train_runtime": 294.7814,
    "train_samples_per_second": 12.087,
    "train_steps_per_second": 0.305
}