library_name: transformers
base_model: CausalNLP/gpt2-hf_multilingual-20
tags: generated_from_trainer
model-index name: gpt2-multilingual-20-arabic-repair_3epochs_lr1e-4_bf16

gpt2-multilingual-20-arabic-repair_3epochs_lr1e-4_bf16

This model is a fine-tuned version of CausalNLP/gpt2-hf_multilingual-20; the fine-tuning dataset is not specified. It achieves the following result on the evaluation set:

  • Loss: 3.2206
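
For a rough sense of scale, the loss can be converted to perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is how causal language modeling loss is commonly reported; the card itself does not state this.

```python
import math

# Final evaluation loss reported above.
eval_loss = 3.2206

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the evaluation perplexity is roughly 25. The value depends on the tokenizer, so it is not directly comparable across models with different vocabularies.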

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: adamw_torch_fused with betas=(0.9, 0.95) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
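
The list above maps fairly directly onto keyword arguments of `transformers.TrainingArguments`. The following is a minimal sketch of that mapping, not the author's actual training script. The `output_dir` value is hypothetical, `bf16` is inferred only from the `_bf16` suffix in the model name, and a single training device is assumed because 32 × 4 already equals the reported total batch size of 128.

```python
# Keyword arguments mirroring the hyperparameters listed above, using the
# parameter names that transformers.TrainingArguments accepts.
training_kwargs = {
    "output_dir": "out",  # hypothetical path, not taken from the card
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.95,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "num_train_epochs": 3,
    "bf16": True,  # assumption: inferred from the "_bf16" suffix in the model name
}

# Assumption: one training device, which reproduces total_train_batch_size.
num_devices = 1
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * num_devices
)
print(effective_batch_size)
```

With transformers installed, the dictionary could be passed as `TrainingArguments(**training_kwargs)`.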

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|---------------|--------|-------|-----------------|
| 3.295         | 0.0799 | 500   | 3.3488          |
| 3.2752        | 0.1597 | 1000  | 3.3363          |
| 3.2844        | 0.2396 | 1500  | 3.3416          |
| 3.3059        | 0.3194 | 2000  | 3.3491          |
| 3.335         | 0.3993 | 2500  | 3.3387          |
| 3.2845        | 0.4791 | 3000  | 3.3301          |
| 3.2458        | 0.5590 | 3500  | 3.3216          |
| 3.2689        | 0.6388 | 4000  | 3.3129          |
| 3.2737        | 0.7187 | 4500  | 3.3069          |
| 3.2604        | 0.7985 | 5000  | 3.3000          |
| 3.2255        | 0.8784 | 5500  | 3.2936          |
| 3.239         | 0.9582 | 6000  | 3.2882          |
| 3.2594        | 1.0380 | 6500  | 3.2832          |
| 3.2123        | 1.1179 | 7000  | 3.2780          |
| 3.2347        | 1.1977 | 7500  | 3.2726          |
| 3.2197        | 1.2776 | 8000  | 3.2671          |
| 3.2028        | 1.3574 | 8500  | 3.2615          |
| 3.195         | 1.4373 | 9000  | 3.2557          |
| 3.2049        | 1.5171 | 9500  | 3.2510          |
| 3.1649        | 1.5970 | 10000 | 3.2458          |
| 3.1984        | 1.6768 | 10500 | 3.2409          |
| 3.2787        | 1.7567 | 11000 | 3.2374          |
| 3.1966        | 1.8365 | 11500 | 3.2334          |
| 3.166         | 1.9164 | 12000 | 3.2305          |
| 3.2029        | 1.9962 | 12500 | 3.2272          |
| 3.1629        | 2.0760 | 13000 | 3.2268          |
| 3.1115        | 2.1559 | 13500 | 3.2253          |
| 3.184         | 2.2357 | 14000 | 3.2239          |
| 3.1391        | 2.3156 | 14500 | 3.2229          |
| 3.1341        | 2.3954 | 15000 | 3.2220          |
| 3.1623        | 2.4753 | 15500 | 3.2214          |
| 3.1565        | 2.5551 | 16000 | 3.2211          |
| 3.1129        | 2.6350 | 16500 | 3.2208          |
| 3.2043        | 2.7148 | 17000 | 3.2207          |
| 3.1044        | 2.7947 | 17500 | 3.2206          |
| 3.1922        | 2.8746 | 18000 | 3.2206          |
| 3.1229        | 2.9544 | 18500 | 3.2206          |
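
The Epoch and Step columns make it possible to estimate the length of the run and to sketch the learning-rate schedule implied by the cosine scheduler and the 0.1 warmup ratio. The code below is an approximation under stated assumptions: steps per epoch are estimated from the last logged row, and `lr_at` is a hypothetical helper that implements linear warmup followed by cosine decay to zero. It is not the Trainer's own code, and the exact step counts of the original run are not given in the card.

```python
import math

# Last (epoch, step) pair from the table above.
last_epoch, last_step = 2.9544, 18500

# Epoch is logged to four decimals, so the last row gives the tightest estimate.
steps_per_epoch = last_step / last_epoch
total_steps = round(steps_per_epoch * 3)      # num_epochs: 3
warmup_steps = math.ceil(total_steps * 0.1)   # lr_scheduler_warmup_ratio: 0.1


def lr_at(step, peak_lr=1e-4):
    """Linear warmup to peak_lr, then cosine decay to zero (illustrative sketch)."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(round(steps_per_epoch), total_steps, warmup_steps)
for step in (500, warmup_steps, 12500, 18500):
    print(step, f"{lr_at(step):.2e}")
```

Under this schedule the learning rate is close to zero by the final rows, which is consistent with the validation loss flattening at 3.2206.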

Framework versions

  • Transformers 4.57.4
  • PyTorch 2.9.1+cu128
  • Datasets 4.4.2
  • Tokenizers 0.22.2
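
When trying to reproduce the run, it can help to compare the local environment against the versions listed above. The following is a small sketch using only the standard library. `check_versions` is a hypothetical helper, and the distribution names (for example `torch` for PyTorch) are assumed to match what is installed from PyPI.

```python
from importlib import metadata

# Versions listed above, keyed by the assumed PyPI distribution name.
expected = {
    "transformers": "4.57.4",
    "torch": "2.9.1+cu128",
    "datasets": "4.4.2",
    "tokenizers": "0.22.2",
}


def check_versions(expected_versions):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, wanted in expected_versions.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in check_versions(expected).items():
    print(f"{name}: expected {wanted}, found {installed}")
```

A mismatch does not necessarily prevent the model from loading; it only flags differences from the environment recorded in this card.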
Description
Model synced from source: CausalNLP/gpt2-multilingual-20-arabic-repair_3epochs_lr1e-4_bf16