ModelHub XC aa0222e4a1: Initialize project; model provided by the ModelHub XC community
Model: rbelanec/train_cola_42_1776331560
Source: Original Platform
2026-05-03 10:17:08 +08:00

library_name: transformers
license: llama3.2
base_model: meta-llama/Llama-3.2-1B-Instruct
tags: peft-factory, full, llama-factory, generated_from_trainer
model-index name: train_cola_42_1776331560

train_cola_42_1776331560

This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct on the cola dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1763
  • Num Input Tokens Seen: 1932608
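
Assuming the reported loss is a mean per-token cross-entropy in nats (the usual convention in transformers evaluation output), it maps directly to an evaluation perplexity via exp(loss):

```python
import math

eval_loss = 0.1763                # validation loss reported above
perplexity = math.exp(eval_loss)  # perplexity = e^loss for a loss in nats
print(f"eval perplexity ~= {perplexity:.4f}")  # ~1.1928
```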

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
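
The learning-rate schedule above (cosine decay with a 0.1 warmup ratio) can be sketched in plain Python. Note the total step count is not stated in the card; it is inferred from the training table (~962 steps per epoch over 5 epochs), and this sketch mirrors, but is not, the transformers `get_cosine_schedule_with_warmup` implementation:

```python
import math

# Hyperparameters from the card; step counts inferred from the training table.
LEARNING_RATE = 5e-06
TOTAL_STEPS = 962 * 5                  # ~962 steps/epoch (step 4579 ~= epoch 4.7599)
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # lr_scheduler_warmup_ratio: 0.1

def lr_at(step: int) -> float:
    """Linear warmup to the peak LR, then cosine decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))
```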

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| 0.2021        | 0.2505 | 241  | 0.2780          | 97664             |
| 0.2402        | 0.5010 | 482  | 0.2002          | 194560            |
| 0.1906        | 0.7516 | 723  | 0.2094          | 291712            |
| 0.2397        | 1.0021 | 964  | 0.1763          | 387464            |
| 0.0622        | 1.2526 | 1205 | 0.2676          | 485192            |
| 0.0911        | 1.5031 | 1446 | 0.3146          | 581704            |
| 0.1042        | 1.7536 | 1687 | 0.2114          | 677576            |
| 0.096         | 2.0042 | 1928 | 0.3562          | 775312            |
| 0.0094        | 2.2547 | 2169 | 0.3035          | 873104            |
| 0.0894        | 2.5052 | 2410 | 0.3649          | 969360            |
| 0.0705        | 2.7557 | 2651 | 0.3061          | 1065232           |
| 0.0016        | 3.0062 | 2892 | 0.2698          | 1162016           |
| 0.0469        | 3.2568 | 3133 | 0.3603          | 1259168           |
| 0.0682        | 3.5073 | 3374 | 0.4128          | 1355552           |
| 0.0128        | 3.7578 | 3615 | 0.3697          | 1453088           |
| 0.0238        | 4.0083 | 3856 | 0.3716          | 1549360           |
| 0.0           | 4.2588 | 4097 | 0.4492          | 1645808           |
| 0.0202        | 4.5094 | 4338 | 0.4368          | 1742960           |
| 0.0001        | 4.7599 | 4579 | 0.4381          | 1839344           |
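
As a sanity check, the evaluation loss reported at the top of the card is the minimum validation loss in the table above, reached at step 964 (just past epoch 1); validation loss trends upward afterwards, consistent with the reported number coming from an early checkpoint:

```python
# (validation_loss, step) pairs transcribed from the training results table.
val_loss = [
    (0.2780, 241), (0.2002, 482), (0.2094, 723), (0.1763, 964),
    (0.2676, 1205), (0.3146, 1446), (0.2114, 1687), (0.3562, 1928),
    (0.3035, 2169), (0.3649, 2410), (0.3061, 2651), (0.2698, 2892),
    (0.3603, 3133), (0.4128, 3374), (0.3697, 3615), (0.3716, 3856),
    (0.4492, 4097), (0.4368, 4338), (0.4381, 4579),
]
best_loss, best_step = min(val_loss)
print(best_loss, best_step)  # prints: 0.1763 964
```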

Framework versions

  • Transformers 4.51.3
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4