Initialize project; model provided by the ModelHub XC community

Model: SiangLao/xlsr-53-lao-asr Source: Original Platform
2026-05-08 11:40:38 +08:00
commit e076a20f8f
10 changed files with 417 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,80 @@
+---
+language: lo
+license: apache-2.0
+tags:
+- automatic-speech-recognition
+- speech
+- audio
+- lao
+- wav2vec2
+- xlsr
+datasets:
+- SiangLao/lao-asr-thesis-dataset
+metrics:
+- cer
+base_model:
+- facebook/wav2vec2-large-xlsr-53
+library_name: transformers
+---
+
+# XLSR-53 Lao ASR
+
+Fine-tuned XLSR-53 model for Lao automatic speech recognition, achieving a character error rate (CER) of 16.22% on the test split.
+
+## Model Details
+
+This model is fine-tuned from [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on the SiangLao/lao-asr-thesis-dataset dataset.
+
+### Training Configuration
+- **Epochs**: 15
+- **Batch Size**: 16
+- **Learning Rate**: 1e-4
+- **Training Date**: June 3, 2025
+- **Vocabulary Size**: 55 Lao characters + special tokens
+
+### Performance
+
+| Split | CER | Loss |
+|-------|-----|------|
+| Test | 16.22% | 0.419 |
+| Validation | 16.52% | 0.487 |
+
+## Usage
+
+```python
+from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
+import torch
+import librosa
+
+# Load model and processor
+model = Wav2Vec2ForCTC.from_pretrained("SiangLao/xlsr-53-lao-asr")
+processor = Wav2Vec2Processor.from_pretrained("SiangLao/xlsr-53-lao-asr")
+
+# Load audio as mono; librosa resamples it to the 16 kHz rate the model expects
+audio, sr = librosa.load("audio.wav", sr=16000)
+
+# Process audio
+inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
+
+# Greedy CTC decoding: pick the most likely token at each frame, then decode
+with torch.no_grad():
+    logits = model(**inputs).logits
+    predicted_ids = torch.argmax(logits, dim=-1)
+    transcription = processor.batch_decode(predicted_ids)[0]
+
+# Clean transcription
+transcription = transcription.replace("<unk>", " ").strip()
+
+print(transcription)
+```
+
+## Citation
+```bibtex
+@thesis{naovalath2025lao,
+  title={Lao Automatic Speech Recognition using Transfer Learning},
+  author={Souphaxay Naovalath and Sounmy Chanthavong},
+  advisor={Dr. Somsack Inthasone},
+  school={National University of Laos, Faculty of Natural Sciences, Computer Science Department},
+  year={2025}
+}
+```
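
The README above reports CER for the test and validation splits but does not show how the metric is computed. As a rough illustration only, the sketch below computes character error rate as total character-level edit distance divided by total reference length. The helper names `edit_distance` and `cer` are hypothetical and are not part of the committed files, and counting Unicode code points (so Lao vowel signs and tone marks each count as one character) is an assumption; the README does not say how the authors tokenized characters or which tool they used.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted per code point."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from ref
                curr[j - 1] + 1,     # insertion into ref
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER: summed edits divided by summed reference length."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


if __name__ == "__main__":
    # Hypothetical example strings, not taken from the dataset.
    print(f"CER: {cer(['ສະບາຍດີ'], ['ສະບາຍ']):.2%}")
```

Summing edits over the whole corpus before dividing weights long utterances more heavily than averaging per-utterance CER would; which of the two the reported figures use is not stated in the README.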