ModelHub XC 89779fca34 Initialize project; model provided by the ModelHub XC community
Model: rbelanec/train_qnli_42_1779207272
Source: Original Platform
2026-06-11 01:10:36 +08:00

  • library_name: transformers
  • license: llama3.2
  • base_model: meta-llama/Llama-3.2-1B-Instruct
  • tags: peft-factory, full, llama-factory, generated_from_trainer
  • model-index: name train_qnli_42_1779207272 (no results listed)

train_qnli_42_1779207272

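This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct on the qnli dataset. It achieves the following results on the evaluation set (the loss matches the epoch 1.0 row of the training results table, which is the lowest validation loss logged; the token count is the total at epoch 5.0):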

  • Loss: 0.0535
  • Num Input Tokens Seen: 56574368

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-06
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
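
To make the scheduler settings above concrete, here is a minimal sketch of a cosine schedule with linear warmup, using the listed learning rate and warmup ratio. The total step count (58920) is taken from the last row of the training results table. The function name `cosine_lr` and the rounding of the warmup length with `math.ceil` are assumptions for illustration; the exact warmup step count used in the actual run is not stated in this card.

```python
import math

# Values from the hyperparameter list; TOTAL_STEPS is the last logged step (epoch 5.0).
PEAK_LR = 2e-06
WARMUP_RATIO = 0.1
TOTAL_STEPS = 58920


def cosine_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS, warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step`: linear warmup to peak_lr, then cosine decay to zero.

    Hypothetical helper; the warmup length is assumed to be ceil(total_steps * warmup_ratio).
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from peak_lr down to 0.
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    # Print the learning rate at each epoch boundary (11784 steps per epoch in the table).
    for epoch in range(6):
        step = epoch * 11784
        print(f"epoch {epoch}: step {step}, lr {cosine_lr(step):.3e}")
```

Under these assumptions the warmup covers roughly the first half epoch, so the learning rate is at or near its peak around the point where the validation loss in the results table is lowest.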

Training results

Training Loss  Epoch  Step   Validation Loss  Input Tokens Seen
0.0827         0.25   2946   0.0783           2823680
0.0452         0.5    5892   0.0612           5652800
0.0473         0.75   8838   0.0545           8482944
0.0346         1.0    11784  0.0535           11312256
0.0299         1.25   14730  0.0764           14142784
0.0363         1.5    17676  0.0833           16969472
0.0084         1.75   20622  0.0826           19782400
0.0205         2.0    23568  0.0846           22629440
0.0001         2.25   26514  0.1074           25460032
0.0001         2.5    29460  0.1115           28284608
0.0001         2.75   32406  0.1112           31130432
0.023          3.0    35352  0.1149           33947392
0.0258         3.25   38298  0.1494           36783040
0.0            3.5    41244  0.1557           39604544
0.0            3.75   44190  0.1553           42421440
0.0            4.0    47136  0.1462           45265344
0.0            4.25   50082  0.1777           48098944
0.0            4.5    53028  0.1760           50906176
0.0            4.75   55974  0.1753           53746240
0.0            5.0    58920  0.1754           56574368
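
The rows above can be checked programmatically. The following sketch copies the table into a list and picks out the row with the lowest validation loss, along with the average number of input tokens per optimizer step. The names `ROWS`, `best_checkpoint`, and `tokens_per_step` are hypothetical helpers, not part of any library or of the original training code.

```python
# Each tuple: (training_loss, epoch, step, validation_loss, input_tokens_seen),
# copied from the training results table.
ROWS = [
    (0.0827, 0.25, 2946, 0.0783, 2823680),
    (0.0452, 0.5, 5892, 0.0612, 5652800),
    (0.0473, 0.75, 8838, 0.0545, 8482944),
    (0.0346, 1.0, 11784, 0.0535, 11312256),
    (0.0299, 1.25, 14730, 0.0764, 14142784),
    (0.0363, 1.5, 17676, 0.0833, 16969472),
    (0.0084, 1.75, 20622, 0.0826, 19782400),
    (0.0205, 2.0, 23568, 0.0846, 22629440),
    (0.0001, 2.25, 26514, 0.1074, 25460032),
    (0.0001, 2.5, 29460, 0.1115, 28284608),
    (0.0001, 2.75, 32406, 0.1112, 31130432),
    (0.023, 3.0, 35352, 0.1149, 33947392),
    (0.0258, 3.25, 38298, 0.1494, 36783040),
    (0.0, 3.5, 41244, 0.1557, 39604544),
    (0.0, 3.75, 44190, 0.1553, 42421440),
    (0.0, 4.0, 47136, 0.1462, 45265344),
    (0.0, 4.25, 50082, 0.1777, 48098944),
    (0.0, 4.5, 53028, 0.1760, 50906176),
    (0.0, 4.75, 55974, 0.1753, 53746240),
    (0.0, 5.0, 58920, 0.1754, 56574368),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def tokens_per_step(rows):
    """Average input tokens consumed per optimizer step over the whole run."""
    _, _, last_step, _, last_tokens = rows[-1]
    return last_tokens / last_step


if __name__ == "__main__":
    train_loss, epoch, step, val_loss, tokens = best_checkpoint(ROWS)
    print(f"lowest validation loss {val_loss} at epoch {epoch} (step {step})")
    print(f"average tokens per step: {tokens_per_step(ROWS):.1f}")
```

In this table the validation loss is lowest at epoch 1.0 and rises afterwards while the training loss falls towards zero, a pattern that usually indicates overfitting in the later epochs.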

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4