QWEN3-4B-CPT-stage2/train_results.json

{
  "epoch": 1.0,
  "total_flos": 1.6013083311596544e+16,
  "train_loss": 1.9445130242241753,
  "train_runtime": 226.5501,
  "train_samples_per_second": 15.727,
  "train_steps_per_second": 0.397
}