ModelHub XC e1256e52a7 初始化项目,由ModelHub XC社区提供模型
Model: rbelanec/train_qnli_42_1779286681
Source: Original Platform
2026-06-04 16:19:25 +08:00

library_name, license, base_model, tags, model-index
library_name license base_model tags model-index
transformers llama3.2 meta-llama/Llama-3.2-1B-Instruct
peft-factory
full
llama-factory
generated_from_trainer
name results
train_qnli_42_1779286681

train_qnli_42_1779286681

This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct on the qnli dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0522
  • Num Input Tokens Seen: 11312256

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-06
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1

Training results

Training Loss Epoch Step Validation Loss Input Tokens Seen
0.093 0.0501 590 0.0805 571072
0.0911 0.1001 1180 0.0656 1136384
0.1636 0.1502 1770 0.0790 1703808
0.138 0.2003 2360 0.0773 2266496
0.0654 0.2503 2950 0.0821 2827328
0.0218 0.3004 3540 0.0740 3399808
0.0142 0.3505 4130 0.0708 3963584
0.0847 0.4005 4720 0.0573 4530304
0.0674 0.4506 5310 0.0652 5095424
0.0262 0.5007 5900 0.0528 5660352
0.0577 0.5507 6490 0.0573 6232896
0.042 0.6008 7080 0.0625 6801984
0.0449 0.6509 7670 0.0570 7363968
0.0589 0.7010 8260 0.0540 7924800
0.0597 0.7510 8850 0.0540 8494720
0.1148 0.8011 9440 0.0544 9066048
0.0761 0.8512 10030 0.0528 9634624
0.0919 0.9012 10620 0.0522 10199424
0.0039 0.9513 11210 0.0523 10764096

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
Description
Model synced from source: rbelanec/train_qnli_42_1779286681
Readme 17 MiB