---
language:
- "tr"
thumbnail: "url_to_thumbnail"
tags:
- speech-recognition
- Turkish
- ASR
license: "apache-2.0"
datasets:
- common_voice
metrics:
- wer
- cer
base_model: "openai/whisper-large-v3"
---

# distil-whisper-large-v3-tr

## Model Description

`distil-whisper-large-v3-tr` is a distilled Whisper model for Turkish automatic speech recognition (ASR), based on `openai/whisper-large-v3`. It was trained and evaluated on Common Voice data. On the pseudo-labelled Common Voice 17.0 Turkish test split it reaches a word error rate (WER) of 14.34%.

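
As a rough illustration of how a checkpoint like this is typically loaded, here is a minimal sketch using the `transformers` ASR pipeline. The card does not give the full Hub repository id, so `MODEL_ID` is a placeholder to be replaced. The helper names `build_pipeline_kwargs` and `transcribe` are hypothetical. The sketch also assumes the checkpoint keeps Whisper's multilingual generation config, so that forcing Turkish at generation time works.

```python
from typing import Any, Dict

# Placeholder: the card does not state the full Hub repository id.
MODEL_ID = "<hub-username>/distil-whisper-large-v3-tr"


def build_pipeline_kwargs(model_id: str = MODEL_ID) -> Dict[str, Any]:
    """Collect the arguments passed to transformers.pipeline (hypothetical helper)."""
    return {
        "task": "automatic-speech-recognition",
        "model": model_id,
    }


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file; downloads the model on first use."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import pipeline

    asr = pipeline(**build_pipeline_kwargs(model_id))
    # Assumption: the checkpoint accepts Whisper-style language/task arguments.
    result = asr(
        audio_path,
        generate_kwargs={"language": "turkish", "task": "transcribe"},
    )
    return result["text"]
```

Calling `transcribe("sample.wav", model_id="<your-repo-id>")` would return the recognized text as a string, provided the repository id points at a real checkpoint.
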
## Training and Evaluation Metrics

Training and evaluation were logged with Weights & Biases (`wandb`). The final logged values are listed below.

### Evaluation Metrics

- **Cross-Entropy Loss (eval/ce_loss):** 0.53218
- **Epoch (eval/epoch):** 28
- **KL Loss (eval/kl_loss):** 0.34883
- **Total Loss (eval/loss):** 0.77457
- **Evaluation Time (eval/time):** 397.1784 seconds
- **Word Error Rate (eval/wer):** 14.43288%
- **Orthographic Word Error Rate (eval/wer_ortho):** 21.55298%

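
The gap between the two word error rates above is easier to read with a small worked example. The sketch below assumes that `eval/wer` is computed on normalized text and `eval/wer_ortho` on the raw text, as the metric names suggest. The `simple_normalize` function is a simplified stand-in (lowercasing and punctuation removal), not the normalizer used to produce the numbers in this card, and the example sentences are made up.

```python
import re
from typing import List


def word_edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance over words (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, relative to the reference length."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * word_edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def simple_normalize(text: str) -> str:
    """Simplified stand-in normalizer: lowercase and drop punctuation."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


reference = "Bugün hava çok güzel."
hypothesis = "bugün hava çok güzeldi"

wer_ortho = wer(reference, hypothesis)  # raw text: casing and punctuation count
wer_norm = wer(simple_normalize(reference), simple_normalize(hypothesis))
print(f"orthographic WER: {wer_ortho:.1f}%, normalized WER: {wer_norm:.1f}%")
# prints "orthographic WER: 50.0%, normalized WER: 25.0%"
```

Here the capitalized first word and the trailing period count as errors only in the orthographic score, which is why that score is the higher of the two.
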
### Training Metrics

- **Cross-Entropy Loss (train/ce_loss):** 0.04695
- **Epoch (train/epoch):** 28
- **KL Loss (train/kl_loss):** 0.24143
- **Learning Rate (train/learning_rate):** 0.0001
- **Total Loss (train/loss):** 0.27899
- **Training Time (train/time):** 12426.92106 seconds

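
The total loss values reported for both splits are consistent with a weighted sum of the two component losses, `0.8 * ce_loss + 1.0 * kl_loss`. The card does not state these weights. They are inferred here from the logged numbers, so treat them as an assumption. A short check:

```python
# Assumed weights, inferred from the logged values (not stated in the card).
CE_WEIGHT = 0.8
KL_WEIGHT = 1.0


def combined_loss(ce_loss: float, kl_loss: float,
                  ce_weight: float = CE_WEIGHT,
                  kl_weight: float = KL_WEIGHT) -> float:
    """Weighted sum of cross-entropy and KL-divergence losses."""
    return ce_weight * ce_loss + kl_weight * kl_loss


# Values copied from the metric lists in this card.
logged = {
    "eval": {"ce_loss": 0.53218, "kl_loss": 0.34883, "loss": 0.77457},
    "train": {"ce_loss": 0.04695, "kl_loss": 0.24143, "loss": 0.27899},
}

for split, values in logged.items():
    predicted = combined_loss(values["ce_loss"], values["kl_loss"])
    print(f"{split}: predicted {predicted:.5f}, logged {values['loss']:.5f}")
```

Both predictions agree with the logged totals to within rounding of the last reported digit.
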
## Run History

### Overall Metrics

- **Real-Time Factor (all/rtf):** 392.23396
- **Word Error Rate (all/wer):** 14.33829%

### Common Voice 17.0 Turkish Pseudo-Labelled Dataset

- **Real-Time Factor (common_voice_17_0_tr_pseudo_labelled/test/rtf):** 392.23396
- **Word Error Rate (common_voice_17_0_tr_pseudo_labelled/test/wer):** 14.33829%

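
The card does not define how the real-time factor is computed. A value in the hundreds suggests it is seconds of audio transcribed per second of processing time, rather than the more common processing-time-over-audio-time ratio, but that reading is an assumption. Under that assumption, the calculation would look like the sketch below. The function name and the example durations are hypothetical and are not taken from this run.

```python
def audio_seconds_per_compute_second(audio_seconds: float,
                                     processing_seconds: float) -> float:
    """Throughput-style real-time factor: audio duration / processing time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# Hypothetical example: one hour of audio transcribed in ten seconds.
example_rtf = audio_seconds_per_compute_second(3600.0, 10.0)
print(example_rtf)  # prints 360.0
```

With this definition, higher values mean faster transcription.
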
## Author

**Sercan Çepni**
Email: turkelf@gmail.com

---

For any questions or further information, please feel free to contact the author.