ModelHub XC 08116b76da Initialize project; model provided by the ModelHub XC community
Model: deepdml/whisper-large-v3-turbo-ar-mix-norm
Source: Original Platform
2026-05-12 12:23:34 +08:00

library_name: transformers
language: ar
license: mit
base_model: openai/whisper-large-v3-turbo
tags: generated_from_trainer
datasets:
  • fixie-ai/common_voice_17_0
  • google/fleurs
  • UBC-NLP/Casablanca
  • deepdml/Tunisian_MSA
  • ymoslem/MediaSpeech
metrics: wer
model-index:
  name: Whisper Turbo ar
  results:
    task: name: Automatic Speech Recognition, type: automatic-speech-recognition
    dataset: name: Common Voice 17.0, type: fixie-ai/common_voice_17_0
    metrics: name: Wer, type: wer, value: 18.89976313325132

Whisper Turbo ar

This model is a fine-tuned version of openai/whisper-large-v3-turbo on the Common Voice 17.0 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1973
  • Wer: 18.8998
  • Cer: 5.0561
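The card does not include a usage snippet. The following is a minimal sketch of one way to transcribe Arabic audio with this checkpoint, assuming the `transformers` library is installed and that `ffmpeg` is available for decoding audio files. The helper names `build_transcriber` and `transcribe` are illustrative and not part of the original card; the 30-second chunk length and the forced language setting are assumptions rather than settings documented by the author.

```python
MODEL_ID = "deepdml/whisper-large-v3-turbo-ar-mix-norm"


def build_transcriber(model_id=MODEL_ID, device=-1):
    """Create a speech recognition pipeline; device=-1 is CPU, 0 the first GPU."""
    # Imported lazily so that defining this helper does not require the library.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id, device=device)


def transcribe(asr, audio_path):
    """Transcribe one audio file and return the recognized text."""
    result = asr(
        audio_path,
        chunk_length_s=30,  # assumption: split long recordings into 30 s windows
        generate_kwargs={"language": "arabic", "task": "transcribe"},
    )
    return result["text"]
```

A typical call would be `transcribe(build_transcriber(), "sample.wav")`, where `sample.wav` is a hypothetical local file. The first call downloads the model weights.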

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.04
  • training_steps: 18000
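To make the scheduler settings concrete, here is a rough sketch of the learning rate implied by a linear schedule with a warmup ratio of 0.04 over 18000 steps. It assumes the common shape of linear warmup from zero to the peak rate followed by linear decay to zero, with the warmup length rounded up. The function names are illustrative, and this is an approximation of the schedule rather than the trainer's own code.

```python
import math

PEAK_LR = 1e-05         # learning_rate from the list above
TRAINING_STEPS = 18000  # training_steps from the list above
WARMUP_RATIO = 0.04     # lr_scheduler_warmup_ratio from the list above


def warmup_steps(total_steps=TRAINING_STEPS, ratio=WARMUP_RATIO):
    """Number of warmup steps implied by the ratio (rounded up)."""
    return math.ceil(total_steps * ratio)


def learning_rate(step, peak_lr=PEAK_LR, total_steps=TRAINING_STEPS, ratio=WARMUP_RATIO):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    warmup = warmup_steps(total_steps, ratio)
    if step < warmup:
        # Linear ramp from 0 up to the peak rate.
        return peak_lr * step / max(1, warmup)
    # Linear decay from the peak rate down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup)
```

Under these assumptions the warmup lasts 720 steps, so the peak rate of 1e-05 is reached before the first evaluation at step 1000 and the rate decays for the remaining 17280 steps.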

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Wer     | Cer    |
|---------------|--------|-------|-----------------|---------|--------|
| 0.5358        | 0.0556 | 1000  | 0.3047          | 26.8192 | 8.1187 |
| 0.3875        | 0.1111 | 2000  | 0.2829          | 27.2654 | 7.5340 |
| 0.2546        | 0.1667 | 3000  | 0.2629          | 24.2008 | 6.7543 |
| 0.1702        | 0.2222 | 4000  | 0.2628          | 23.4884 | 6.5769 |
| 0.1075        | 0.2778 | 5000  | 0.2584          | 23.9566 | 6.6370 |
| 0.0859        | 0.3333 | 6000  | 0.2569          | 24.5221 | 6.6761 |
| 0.06          | 0.3889 | 7000  | 0.2479          | 22.1828 | 6.1018 |
| 0.0539        | 0.4444 | 8000  | 0.2461          | 22.6143 | 6.2866 |
| 0.0427        | 0.5    | 9000  | 0.2402          | 23.1083 | 6.3401 |
| 0.0341        | 0.5556 | 10000 | 0.2356          | 22.2012 | 6.0513 |
| 0.0275        | 0.6111 | 11000 | 0.2338          | 20.7378 | 5.6669 |
| 0.0204        | 0.6667 | 12000 | 0.2296          | 21.1381 | 5.7997 |
| 0.0156        | 0.7222 | 13000 | 0.2324          | 21.9037 | 5.8359 |
| 0.0162        | 0.7778 | 14000 | 0.2214          | 20.4825 | 5.5345 |
| 0.0163        | 0.8333 | 15000 | 0.2131          | 21.0426 | 5.6430 |
| 0.0127        | 0.8889 | 16000 | 0.2093          | 19.5791 | 5.2782 |
| 0.006         | 0.9444 | 17000 | 0.2083          | 19.8197 | 5.2719 |
| 0.0072        | 1.0    | 18000 | 0.1973          | 18.8998 | 5.0561 |
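The Wer and Cer columns are word and character error rates, expressed as percentages. As a minimal sketch of how such scores are commonly computed, the code below takes the edit distance between reference and hypothesis and divides it by the reference length. The card does not state which text normalization or evaluation library was used, so the whitespace tokenization and the function names here are assumptions. Evaluation tools usually aggregate errors over the whole test set rather than averaging per-utterance scores.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (strings or token lists)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(
                previous[j] + 1,         # deletion of a reference item
                current[j - 1] + 1,      # insertion of a hypothesis item
                previous[j - 1] + cost,  # substitution or exact match
            ))
        previous = current
    return previous[-1]


def wer(reference, hypothesis):
    """Word error rate in percent, using whitespace tokenization."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent, counting spaces as characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)
```

For instance, `wer("السلام عليكم ورحمة الله", "السلام عليكم رحمة الله")` gives 25.0, because one of the four reference words is substituted.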

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.5.1+cu121
  • Datasets 3.6.0
  • Tokenizers 0.21.0
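When reproducing the results, it can help to compare a local environment against the versions listed above. The sketch below does this with the standard library only. The PyPI distribution names (for example `torch` for Pytorch) and the helper names are assumptions made for illustration.

```python
from importlib import metadata

# Versions listed in the card, keyed by assumed PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.48.0.dev0",
    "torch": "2.5.1+cu121",
    "datasets": "3.6.0",
    "tokenizers": "0.21.0",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs to a (expected, found) pair."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches
```

Calling `version_mismatches(CARD_VERSIONS)` returns an empty dictionary when the environment matches the card exactly. A mismatch does not necessarily mean the model will fail to load, only that the setup differs from the one used for training.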

Citation

Please cite the model using the following BibTeX entry:

@misc{deepdml/whisper-large-v3-turbo-ar-mix-norm,
      title={Fine-tuned Whisper turbo ASR model for speech recognition in Arabic},
      author={Jimenez, David},
      howpublished={\url{https://huggingface.co/deepdml/whisper-large-v3-turbo-ar-mix-norm}},
      year={2026}
}