ModelHub XC 28deb980b6 Initialize project; model provided by the ModelHub XC community
Model: NUTN-KWS/Whisper-Taiwanese-model-v0.5
Source: Original Platform
2026-05-13 02:11:35 +08:00

---
library_name: transformers
license: cc-by-nc-4.0
language:
- en
- zh
metrics:
- cer
pipeline_tag: automatic-speech-recognition
base_model:
- openai/whisper-large-v3-turbo
---
[ [README.md in Traditional Chinese (繁體中文)](https://huggingface.co/NUTN-KWS/Whisper-Taiwanese-model-v0.5) ]
# 👳 Whisper-Taiwanese model V0.5 (Tv0.5)
This model is a fine-tuned version of OpenAI's [openai/whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo). It was developed by the National University of Tainan (NUTN), Taiwan, as part of an industry-academia collaboration project funded by the National Science and Technology Council (NSTC). The Taiwanese-English Co-Learning Pilot Project ran from September 2024 to June 2025 in collaboration with JEN-PIN ENTERPRISE CO., LTD. The model is trained for Taiwanese speech recognition using JEN-PIN educational materials generated through Student-Machine Co-Learning during the Fall 2024 semester. In addition, NUTN is collaborating with the National Center for High-performance Computing (NCHC) of the National Applied Research Laboratories (NARLabs) in Taiwan to provide computational and storage resources and to co-develop an AI learning model for elementary and high school students.
Demo: [https://kws.oaselab.org/taigitong/](https://kws.oaselab.org/taigitong/)
## 📝 Model Details
- **Base Model**: `openai/whisper-large-v3-turbo`
- **Fine-tuned for**: Taiwanese Hokkien Automatic Speech Recognition (ASR)
- **Fine-tuning Framework**: Hugging Face Transformers
- **Training Duration**: Approximately 180 hours using two V100 GPUs
- **Dataset**: Custom dataset, including the Dictionary of Frequently-Used Taiwanese Taigi released by the Ministry of Education, Taiwan, totaling approximately 90 hours of audio data.
- **Input Format**: 16 kHz mono WAV
- **License**: CC BY-NC 4.0
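
The model details above list 16 kHz mono WAV as the input format. A minimal sketch for checking a file against that format before transcription, using only the Python standard library, might look like this. The helper names `is_16k_mono` and `write_silence` are hypothetical and not part of this repository, and the `wave` module only reads PCM-encoded WAV files.

```python
import os
import tempfile
import wave


def is_16k_mono(path: str) -> bool:
    """Return True if the WAV file at `path` is 16 kHz and single-channel."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate() == 16000 and wav.getnchannels() == 1


def write_silence(path: str, rate: int, channels: int, seconds: float = 0.1) -> None:
    """Write a short silent 16-bit PCM WAV file (used only for the demo below)."""
    n_frames = int(rate * seconds)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * n_frames * channels)


if __name__ == "__main__":
    # Demo on two generated files; replace with your own recordings.
    with tempfile.TemporaryDirectory() as tmp:
        good = os.path.join(tmp, "good.wav")
        bad = os.path.join(tmp, "bad.wav")
        write_silence(good, rate=16000, channels=1)
        write_silence(bad, rate=44100, channels=2)
        print("good.wav matches:", is_16k_mono(good))
        print("bad.wav matches:", is_16k_mono(bad))
```

Files that fail the check can be converted with any audio tool of your choice before being passed to the model.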
## 🚀 Usage
### Installing Packages:
```bash
pip install torch torchvision torchaudio transformers
```
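
The example that follows passes `device=0`, which assumes a CUDA GPU is available. A small sketch for choosing the device at runtime is shown here; `pick_device` is a hypothetical helper, and it relies on the Transformers convention that `device=-1` selects the CPU.

```python
import importlib.util


def pick_device() -> int:
    """Return 0 (first CUDA GPU) if PyTorch can see one, otherwise -1 (CPU)."""
    if importlib.util.find_spec("torch") is None:
        return -1  # PyTorch is not installed, so there is no GPU to query
    import torch

    return 0 if torch.cuda.is_available() else -1


if __name__ == "__main__":
    print("pipeline device:", pick_device())
```

The returned value can be passed as the `device` argument of `pipeline(...)`.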
### Example:
```python
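from transformers import pipeline

# Load the fine-tuned checkpoint from a local directory; device=0 selects the first CUDA GPU.
pipe = pipeline("automatic-speech-recognition", model="./model/whisper-taiwanese", device=0)

# Transcribe the audio file; generate_kwargs is forwarded to Whisper's generate() call.
result = pipe("audio.wav", generate_kwargs={"language": "zh", "task": "transcribe"})
print(result["text"])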
```
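
The card's metadata lists character error rate (CER) as the evaluation metric, but it does not describe how CER was computed. As a rough sketch, CER can be taken as the character-level edit distance between a reference transcript and the model output, divided by the reference length. The function names below are hypothetical, and no text normalization (whitespace, punctuation, script variants) is applied, which a real evaluation would likely need.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters yields an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Placeholder strings; substitute a ground-truth transcript and result["text"].
    print(cer("abcd", "abxd"))
```

Because insertions count as errors, CER can exceed 1.0 when the hypothesis is much longer than the reference.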
## 👨‍🎓 Citation
### BibTeX:
```bibtex
@misc{taiwanesewhisperasr2025,
title={Taiwanese Whisper ASR},
author={KWS Center, National University of Tainan, Taiwan},
year={2025},
url={https://huggingface.co/NUTN-KWS/Whisper-Taiwanese-model-v0.5}
}
```
### APA:
- C. S. Lee, M. H. Wang, C. C. Yue, G. Y. Teseng, and Y. Nojima, "Fuzzy Estimation Agent with Knowledge Graph and Quantum Fuzzy Inference Engine for Taiwanese-English Co-Learning," 2025 IFSA World Congress and NAFIPS Annual Meeting (IFSA/NAFIPS 2025), Banff, Alberta, Canada, Aug. 16-19, 2025.
- C. S. Lee, M. H. Wang, C. Y. Chen, S. C. Yang, M. Reformat, N. Kubota, and A. Pourabdollah, "Integrating quantum CI and generative AI for Taiwanese/English co-learning," Quantum Machine Intelligence, vol. 6, 64, pp. 1-19, 2024.
- C. S. Lee, M. H. Wang, C. Y. Chen, S. C. Yang, M. Reformat, N. Kubota, and A. Pourabdollah, "Quantum fuzzy inference engine with generative AI and TAIDE KG for Taiwanese/English co-learning," 2025 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2025), Reims, France, Jul. 6-9, 2025.