---
base_model:
- openai/whisper-small
datasets:
- FaisaI/tadabur
language:
- ar
license: cc-by-nc-4.0
metrics:
- wer
pipeline_tag: automatic-speech-recognition
library_name: transformers
tags:
- quran
- asr
- arabic
- speech-recognition
---
<div align="center">
<img src="https://huggingface.co/datasets/FaisaI/tadabur/resolve/main/tadabur_logo.png" width="100"><br><br>
<h1>Tadabur-Whisper-Small</h1>

A Whisper Small model fine-tuned on [Tadabur](https://huggingface.co/datasets/FaisaI/tadabur) for Qur'anic speech recognition.

[![Paper](https://img.shields.io/badge/Paper-Read-a27b5c?style=flat-square)](https://huggingface.co/papers/2604.18932)
[![Dataset](https://img.shields.io/badge/🤗_Dataset-FaisaI%2Ftadabur-c8a97a?style=flat-square)](https://huggingface.co/datasets/FaisaI/tadabur)
[![Base Model](https://img.shields.io/badge/Base-Whisper_Small-1c1f1e?style=flat-square)](https://huggingface.co/openai/whisper-small)
[![License](https://img.shields.io/badge/License-CC_BY--NC_4.0-e6ddd0?style=flat-square)](https://creativecommons.org/licenses/by-nc/4.0/)
[![Page](https://img.shields.io/badge/🌐_Project_Page-tadabur-a27b5c?style=flat-square)](https://fherran.github.io/tadabur)

</div>

---
## Overview
**Tadabur-Whisper-Small** is a fine-tuned version of [Whisper Small](https://huggingface.co/openai/whisper-small) on the [Tadabur dataset](https://huggingface.co/datasets/FaisaI/tadabur), as presented in the paper [Tadabur: A Large-Scale Quran Audio Dataset](https://huggingface.co/papers/2604.18932).
- **GitHub Repository:** [fherran/tadabur](https://github.com/fherran/tadabur)
- **Project Page:** [fherran.github.io/tadabur](https://fherran.github.io/tadabur)
---
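## Training Progress

Word error rate (WER, lower is better) at selected training checkpoints. The lowest WER, 7.89%, was reached at step 25,000 (epoch 1.48); by step 32,500 it had risen to 14.75%.

| Step | Epoch | WER ↓ |
|:---:|:---:|:---:|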
| 2,500 | 0.15 | 13.78% |
| 5,000 | 0.30 | 11.20% |
| 7,500 | 0.44 | 11.15% |
| 25,000 | 1.48 | **7.89%** ⭐ |
| 32,500 | 1.93 | 14.75% |
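
WER is the word-level edit distance between the reference transcript and the predicted transcript, divided by the number of reference words. The following is a minimal sketch of that calculation, assuming whitespace tokenization and no text normalization. It is not the evaluation script behind the table above, and `word_error_rate` is an illustrative name. Any normalization applied in the actual evaluation (for example, of diacritics) is not specified here and is not reproduced.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One of four reference words is missing from the hypothesis.
wer = word_error_rate("بسم الله الرحمن الرحيم", "بسم الله الرحيم")
print(f"{wer:.2%}")  # prints 25.00%
```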
---
## Usage
```python
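from transformers import pipeline

# Load the fine-tuned checkpoint and set Arabic as the decoding language.
asr = pipeline(
    "automatic-speech-recognition",
    model="FaisaI/tadabur-whisper-small",
    generate_kwargs={"language": "arabic"},
)

# The transcription is returned under the "text" key.
result = asr("path/to/audiofile")
print(result["text"])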
```
Or with the Whisper processor and model classes directly:
```python
import librosa
from transformers import WhisperForConditionalGeneration, WhisperProcessor

processor = WhisperProcessor.from_pretrained("FaisaI/tadabur-whisper-small")
model = WhisperForConditionalGeneration.from_pretrained("FaisaI/tadabur-whisper-small")

# Audio must be 16 kHz mono; librosa resamples and downmixes on load.
audio_array, sampling_rate = librosa.load("path/to/audiofile", sr=16000, mono=True)

# Convert the waveform into model input features.
inputs = processor(audio_array, sampling_rate=sampling_rate, return_tensors="pt")

# Generate token IDs with Arabic as the decoding language, then decode to text.
predicted_ids = model.generate(**inputs, language="arabic")
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
print(transcription[0])
```
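
The 16 kHz mono requirement noted in the example above can be checked before transcription when audio is loaded by other means. The following is a minimal sketch using only the standard-library `wave` module. It assumes uncompressed PCM WAV input, and `describe_wav` and `TARGET_SAMPLE_RATE` are hypothetical names that are not part of this model or of `transformers`. The helper only inspects the file header; if `ready` is false, the file still needs resampling or downmixing, for example with `librosa.load(..., sr=16000, mono=True)` as used above.

```python
import wave

TARGET_SAMPLE_RATE = 16_000  # Hz, taken from the "16 kHz mono" requirement


def describe_wav(path: str) -> dict:
    """Read a PCM WAV header and report whether it is already 16 kHz mono."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        channels = wav_file.getnchannels()
        frames = wav_file.getnframes()
    return {
        "sample_rate": sample_rate,
        "channels": channels,
        "duration_s": frames / sample_rate,
        # True only when no resampling or downmixing is needed.
        "ready": sample_rate == TARGET_SAMPLE_RATE and channels == 1,
    }
```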
---
## Limitations
- Not suitable for speaker identification or diarization.
- May underperform on noisy or low-quality recordings.
- Trained only on Qur'anic recitation and not fully generalized; transcription errors are expected.
---
## Ethical Considerations
This model is trained exclusively on Qur'anic recitation data. Users must engage with outputs respectfully and must not use this model for mockery, distortion, or any disrespectful application involving Qur'anic content.

**For research and educational use only.**

---
## Citation
```bibtex
@misc{alherran2026tadabur,
  author = {Alherran, Faisal},
  title = {Tadabur: A Large-Scale Quran Audio Dataset},
  year = {2026},
  eprint = {2604.18932},
  archivePrefix = {arXiv},
  primaryClass = {cs.SD},
  doi = {10.48550/arXiv.2604.18932},
  url = {https://arxiv.org/abs/2604.18932}
}
```