Initialize project; model provided by the ModelHub XC community
Model: namphungdn134/whisper-small-vi Source: Original Platform
147
README.md
Normal file
@@ -0,0 +1,147 @@
---
library_name: transformers
language:
- vi
license: mit
base_model: openai/whisper-small
tags:
- generated_from_trainer
- Speech_to_text
- audio2text
- S2T
- STT

metrics:
- wer
model-index:
- name: Whisper Small Vi - Nam Phung
  results: []
pipeline_tag: automatic-speech-recognition
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->


# Whisper Small Vi V1.1: Whisper Small for Vietnamese Fine-Tuned by Nam Phung 🚀

## 📝 Introduction

This is a fine-tuned version of the [openai/whisper-small](https://huggingface.co/openai/whisper-small) model, trained on Vietnamese speech data. It aims to improve transcription accuracy and robustness for Vietnamese automatic speech recognition (ASR), especially in real-world scenarios.

## 📊 Fine-tuning Results

- **Word Error Rate (WER)**: 9.3485
<!-- training_steps: 50000 -->

> Evaluation was performed on a held-out test set with diverse regional accents and speaking styles.
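
For readers unfamiliar with the metric, WER is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal pure-Python illustration. The function name `word_error_rate` and the sample sentences are hypothetical, and this is not necessarily the scoring script that produced the number above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word out of four.
reference = "xin chào các bạn"
hypothesis = "xin chào cả bạn"
print(f"WER = {word_error_rate(reference, hypothesis):.2%}")
```

Scores reported by training tools are often this ratio multiplied by 100, so it is worth checking which scale a given number uses before comparing models.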

## 📝 Model Description

The Whisper small model is a Transformer-based encoder-decoder (sequence-to-sequence) model designed for automatic speech recognition and speech translation. It was pre-trained on 680,000 hours of labeled audio in multiple languages. This fine-tuned version focuses on Vietnamese, aiming to improve transcription accuracy and the handling of local dialects.

This model works with the `WhisperProcessor`, which converts audio inputs into log-Mel spectrograms and decodes the predicted token IDs into text.
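
As a rough illustration of what the processor produces, the sketch below derives the expected shape of the log-Mel input from the values in this repository's `preprocessor_config.json` (`sampling_rate`, `chunk_length`, `hop_length`, `feature_size`). The function name `expected_feature_shape` is hypothetical, and the code only does the arithmetic; it does not compute a spectrogram.

```python
# Values copied from preprocessor_config.json in this repository.
SAMPLING_RATE = 16000  # Hz
CHUNK_LENGTH = 30      # seconds per window
HOP_LENGTH = 160       # samples between consecutive frames
FEATURE_SIZE = 80      # mel bins


def expected_feature_shape(batch_size: int = 1) -> tuple:
    """Return (batch, mel_bins, frames) for one padded 30-second window."""
    n_samples = CHUNK_LENGTH * SAMPLING_RATE  # 480000, matches "n_samples"
    n_frames = n_samples // HOP_LENGTH        # 3000, matches "nb_max_frames"
    return (batch_size, FEATURE_SIZE, n_frames)


shape = expected_feature_shape()
print(shape)

# The encoder halves the time axis with a stride-2 convolution, which is
# consistent with "max_source_positions": 1500 in config.json.
encoder_positions = shape[2] // 2
print(encoder_positions)
```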

## 📁 Dataset

- Total Duration: More than 250 hours of high-quality Vietnamese speech data
- Sources: Public Vietnamese datasets
- Format: 16 kHz WAV files with corresponding text transcripts
- Preprocessing: Audio was normalized and segmented. Transcripts were cleaned and tokenized.

## 🚀 How to Use

To use the fine-tuned model, follow these steps:

1. Install the required dependencies:
```python
# Install required libraries (the leading "!" is notebook syntax; drop it in a shell)
!pip install transformers torch librosa soundfile --quiet

# Import necessary libraries
import torch
import librosa
import soundfile as sf
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline

print("Environment setup completed!")
```

2. Use the model for inference:
```python
import torch
import librosa
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq

device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

# Load processor and model
model_id = "namphungdn134/whisper-small-vi"
print(f"Loading model from: {model_id}")
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForSpeechSeq2Seq.from_pretrained(model_id).to(device)

# Configure language and task
forced_decoder_ids = processor.get_decoder_prompt_ids(language="vi", task="transcribe")
model.config.forced_decoder_ids = forced_decoder_ids
print(f"Forced decoder IDs for Vietnamese: {forced_decoder_ids}")

# Preprocess: load the audio, resample it to 16 kHz, and extract log-Mel features
audio_path = "example.wav"
print(f"Loading audio from: {audio_path}")
audio, sr = librosa.load(audio_path, sr=16000)
input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features.to(device)
print(f"Input features shape: {input_features.shape}")

# Generate
print("Generating transcription...")
with torch.no_grad():
    predicted_ids = model.generate(input_features, forced_decoder_ids=forced_decoder_ids)

transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print("📝 Transcription:", transcription)

# Debug: print the predicted token IDs
print("Predicted IDs:", predicted_ids[0].tolist())
```
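
The inference snippet processes a single window. Because the feature extractor works on 30-second windows (`chunk_length` in `preprocessor_config.json`), a longer recording would need to be split first. The sketch below shows one simple way to do that with plain Python lists. The helper name `split_into_windows` is hypothetical, and non-overlapping windows are an assumption made for brevity; they can cut a word at a boundary.

```python
from typing import Iterator, Sequence

SAMPLING_RATE = 16000  # Hz, as used in the inference snippet
WINDOW_SECONDS = 30    # chunk_length from preprocessor_config.json


def split_into_windows(
    samples: Sequence[float],
    window_seconds: int = WINDOW_SECONDS,
    sampling_rate: int = SAMPLING_RATE,
) -> Iterator[Sequence[float]]:
    """Yield consecutive, non-overlapping windows of at most window_seconds."""
    window = window_seconds * sampling_rate
    for start in range(0, len(samples), window):
        yield samples[start:start + window]


# 70 seconds of silence stands in for a real recording.
audio = [0.0] * (70 * SAMPLING_RATE)
lengths = [len(chunk) / SAMPLING_RATE for chunk in split_into_windows(audio)]
print(lengths)
```

Each window could then be passed through the processor and `model.generate` in the same way as the single file, with the partial transcriptions joined afterwards.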

## ⚠️ Limitations

- This model is fine-tuned specifically for Vietnamese. It may not perform well on other languages.
- It struggles with overlapping speech and noisy backgrounds.
- Performance may drop on strong dialectal variations that are not well represented in the training data.

## 📄 License

This model is licensed under the [MIT License](LICENSE).

## 📚 Citation

If you use this model in your research or application, please cite the original Whisper model and this fine-tuning work as follows:

```
@article{radford2022whisper,
  title={Robust Speech Recognition via Large-Scale Weak Supervision},
  author={Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
  year={2022},
  journal={arXiv preprint arXiv:2212.04356},
  url={https://arxiv.org/abs/2212.04356}
}
```

```
@misc{namphung2025whispersmallvi,
  title={Whisper small Vi V1.1 - Nam Phung},
  author={Nam Phùng},
  organization={DUT},
  year={2025},
  url={https://huggingface.co/namphungdn134/whisper-small-vi},
  note={Code: https://github.com/namphung134/ASR-Vietnamese}
}
```

---

## 📬 Contact

For questions, collaborations, or suggestions, feel free to reach out via [namphungdn134@gmail.com](mailto:namphungdn134@gmail.com).
1609
added_tokens.json
Normal file
File diff suppressed because it is too large
60
config.json
Normal file
@@ -0,0 +1,60 @@
{
  "activation_dropout": 0.0,
  "activation_function": "gelu",
  "apply_spec_augment": false,
  "architectures": [
    "WhisperForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "begin_suppress_tokens": null,
  "bos_token_id": 50257,
  "classifier_proj_size": 256,
  "d_model": 768,
  "decoder_attention_heads": 12,
  "decoder_ffn_dim": 3072,
  "decoder_layerdrop": 0.0,
  "decoder_layers": 12,
  "decoder_start_token_id": 50258,
  "dropout": 0.0,
  "encoder_attention_heads": 12,
  "encoder_ffn_dim": 3072,
  "encoder_layerdrop": 0.0,
  "encoder_layers": 12,
  "eos_token_id": 50257,
  "forced_decoder_ids": [
    [
      1,
      50259
    ],
    [
      2,
      50359
    ],
    [
      3,
      50363
    ]
  ],
  "init_std": 0.02,
  "is_encoder_decoder": true,
  "mask_feature_length": 10,
  "mask_feature_min_masks": 0,
  "mask_feature_prob": 0.0,
  "mask_time_length": 10,
  "mask_time_min_masks": 2,
  "mask_time_prob": 0.05,
  "max_length": null,
  "max_source_positions": 1500,
  "max_target_positions": 448,
  "median_filter_width": 7,
  "model_type": "whisper",
  "num_hidden_layers": 12,
  "num_mel_bins": 80,
  "pad_token_id": 50257,
  "scale_embedding": false,
  "torch_dtype": "float32",
  "transformers_version": "4.51.3",
  "use_cache": true,
  "use_weighted_layer_sum": false,
  "vocab_size": 51865
}
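Each entry of `forced_decoder_ids` in config.json is a `[position, token_id]` pair. The sketch below looks the IDs up in a small, hand-copied subset of the `lang_to_id` and `task_to_id` tables from generation_config.json in this commit. The helper name `describe` is hypothetical. Read this way, the stored value appears to select `<|en|>` rather than `<|vi|>`, which may be why the README sets Vietnamese explicitly at inference time.

```python
# [position, token_id] pairs copied from config.json in this commit.
forced_decoder_ids = [[1, 50259], [2, 50359], [3, 50363]]

# Subset of the ID tables in generation_config.json (same commit).
id_to_token = {
    50259: "<|en|>",
    50278: "<|vi|>",
    50358: "<|translate|>",
    50359: "<|transcribe|>",
    50363: "<|notimestamps|>",
}


def describe(pairs):
    """Map each (position, token_id) pair to (position, token string)."""
    return [
        (position, id_to_token.get(token_id, f"<unknown {token_id}>"))
        for position, token_id in pairs
    ]


decoded = describe(forced_decoder_ids)
print(decoded)
```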
254
generation_config.json
Normal file
@@ -0,0 +1,254 @@
{
  "alignment_heads": [
    [
      5,
      3
    ],
    [
      5,
      9
    ],
    [
      8,
      0
    ],
    [
      8,
      4
    ],
    [
      8,
      7
    ],
    [
      8,
      8
    ],
    [
      9,
      0
    ],
    [
      9,
      7
    ],
    [
      9,
      9
    ],
    [
      10,
      5
    ]
  ],
  "begin_suppress_tokens": [
    220,
    50257
  ],
  "bos_token_id": 50257,
  "decoder_start_token_id": 50258,
  "eos_token_id": 50257,
  "is_multilingual": true,
  "lang_to_id": {
    "<|af|>": 50327,
    "<|am|>": 50334,
    "<|ar|>": 50272,
    "<|as|>": 50350,
    "<|az|>": 50304,
    "<|ba|>": 50355,
    "<|be|>": 50330,
    "<|bg|>": 50292,
    "<|bn|>": 50302,
    "<|bo|>": 50347,
    "<|br|>": 50309,
    "<|bs|>": 50315,
    "<|ca|>": 50270,
    "<|cs|>": 50283,
    "<|cy|>": 50297,
    "<|da|>": 50285,
    "<|de|>": 50261,
    "<|el|>": 50281,
    "<|en|>": 50259,
    "<|es|>": 50262,
    "<|et|>": 50307,
    "<|eu|>": 50310,
    "<|fa|>": 50300,
    "<|fi|>": 50277,
    "<|fo|>": 50338,
    "<|fr|>": 50265,
    "<|gl|>": 50319,
    "<|gu|>": 50333,
    "<|haw|>": 50352,
    "<|ha|>": 50354,
    "<|he|>": 50279,
    "<|hi|>": 50276,
    "<|hr|>": 50291,
    "<|ht|>": 50339,
    "<|hu|>": 50286,
    "<|hy|>": 50312,
    "<|id|>": 50275,
    "<|is|>": 50311,
    "<|it|>": 50274,
    "<|ja|>": 50266,
    "<|jw|>": 50356,
    "<|ka|>": 50329,
    "<|kk|>": 50316,
    "<|km|>": 50323,
    "<|kn|>": 50306,
    "<|ko|>": 50264,
    "<|la|>": 50294,
    "<|lb|>": 50345,
    "<|ln|>": 50353,
    "<|lo|>": 50336,
    "<|lt|>": 50293,
    "<|lv|>": 50301,
    "<|mg|>": 50349,
    "<|mi|>": 50295,
    "<|mk|>": 50308,
    "<|ml|>": 50296,
    "<|mn|>": 50314,
    "<|mr|>": 50320,
    "<|ms|>": 50282,
    "<|mt|>": 50343,
    "<|my|>": 50346,
    "<|ne|>": 50313,
    "<|nl|>": 50271,
    "<|nn|>": 50342,
    "<|no|>": 50288,
    "<|oc|>": 50328,
    "<|pa|>": 50321,
    "<|pl|>": 50269,
    "<|ps|>": 50340,
    "<|pt|>": 50267,
    "<|ro|>": 50284,
    "<|ru|>": 50263,
    "<|sa|>": 50344,
    "<|sd|>": 50332,
    "<|si|>": 50322,
    "<|sk|>": 50298,
    "<|sl|>": 50305,
    "<|sn|>": 50324,
    "<|so|>": 50326,
    "<|sq|>": 50317,
    "<|sr|>": 50303,
    "<|su|>": 50357,
    "<|sv|>": 50273,
    "<|sw|>": 50318,
    "<|ta|>": 50287,
    "<|te|>": 50299,
    "<|tg|>": 50331,
    "<|th|>": 50289,
    "<|tk|>": 50341,
    "<|tl|>": 50348,
    "<|tr|>": 50268,
    "<|tt|>": 50351,
    "<|uk|>": 50280,
    "<|ur|>": 50290,
    "<|uz|>": 50337,
    "<|vi|>": 50278,
    "<|yi|>": 50335,
    "<|yo|>": 50325,
    "<|zh|>": 50260
  },
  "language": "vietnamese",
  "max_initial_timestamp_index": 50,
  "max_length": 448,
  "no_timestamps_token_id": 50363,
  "pad_token_id": 50257,
  "prev_sot_token_id": 50361,
  "return_timestamps": false,
  "suppress_tokens": [
    1,
    2,
    7,
    8,
    9,
    10,
    14,
    25,
    26,
    27,
    28,
    29,
    31,
    58,
    59,
    60,
    61,
    62,
    63,
    90,
    91,
    92,
    93,
    359,
    503,
    522,
    542,
    873,
    893,
    902,
    918,
    922,
    931,
    1350,
    1853,
    1982,
    2460,
    2627,
    3246,
    3253,
    3268,
    3536,
    3846,
    3961,
    4183,
    4667,
    6585,
    6647,
    7273,
    9061,
    9383,
    10428,
    10929,
    11938,
    12033,
    12331,
    12562,
    13793,
    14157,
    14635,
    15265,
    15618,
    16553,
    16604,
    18362,
    18956,
    20075,
    21675,
    22520,
    26130,
    26161,
    26435,
    28279,
    29464,
    31650,
    32302,
    32470,
    36865,
    42863,
    47425,
    49870,
    50254,
    50258,
    50360,
    50361,
    50362
  ],
  "task": "transcribe",
  "task_to_id": {
    "transcribe": 50359,
    "translate": 50358
  },
  "transformers_version": "4.51.3"
}
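The `suppress_tokens` list in generation_config.json names token IDs that should never be produced. Conceptually, their scores are set to negative infinity before the next token is chosen. The sketch below illustrates that idea on a toy score list in plain Python. The function name `suppress` and the scores are hypothetical, and the actual library applies the same idea to tensors.

```python
import math


def suppress(scores, suppress_ids):
    """Return a copy of scores with the suppressed token IDs set to -inf."""
    masked = list(scores)
    for token_id in suppress_ids:
        if 0 <= token_id < len(masked):
            masked[token_id] = -math.inf
    return masked


# Toy vocabulary of 12 tokens. IDs 1, 2, 7, 8, 9 and 10 are the first six
# entries of "suppress_tokens" in generation_config.json.
scores = [0.1, 2.5, 0.3, 0.2, 1.0, 0.0, 0.4, 3.0, 0.9, 0.8, 0.7, 0.6]
masked = suppress(scores, [1, 2, 7, 8, 9, 10])

# Greedy choice after masking: the best remaining token.
best = max(range(len(masked)), key=masked.__getitem__)
print(best)
```

`begin_suppress_tokens` follows the same principle but, as the name suggests, applies only at the start of generation.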
35
gitattributes
Normal file
@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
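To see which files in this commit the attribute patterns above would route through Git LFS, the sketch below checks a few file names against a handful of the patterns with Python's `fnmatch`. This is only an approximation: `fnmatch` does not implement the full gitattributes matching rules (for example, directory patterns such as `saved_model/**/*`), and the helper name `tracked_by_lfs` is hypothetical.

```python
from fnmatch import fnmatch

# A few of the patterns listed in the attributes file above.
LFS_PATTERNS = ["*.bin", "*.safetensors", "*.onnx", "*.pt", "*tfevents*"]


def tracked_by_lfs(filename: str) -> bool:
    """Approximate check: does any LFS pattern match this file name?"""
    return any(fnmatch(filename, pattern) for pattern in LFS_PATTERNS)


for name in ["model.safetensors", "training_args.bin", "config.json", "vocab.json"]:
    print(name, tracked_by_lfs(name))
```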
50001
merges.txt
Normal file
File diff suppressed because it is too large
3
model.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f4989a040091477cc37eb99658d5f7e104da5fcf31752262671fbd748e66bad2
size 966995080
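The three lines above are a Git LFS pointer, not the weights themselves: they record the spec version, the SHA-256 digest of the real file, and its size in bytes. The sketch below parses that text into a dictionary. The function name `parse_lfs_pointer` and the returned key names are hypothetical.

```python
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f4989a040091477cc37eb99658d5f7e104da5fcf31752262671fbd748e66bad2
size 966995080
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line and unpack the oid into algorithm and digest."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
print(f"{pointer['size_bytes'] / 1e6:.1f} MB")
```

At 4 bytes per value (`"torch_dtype": "float32"` in config.json), a file of this size would hold at most roughly 242 million values, ignoring the file header.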
1742
normalizer.json
Normal file
File diff suppressed because it is too large
15
preprocessor_config.json
Normal file
@@ -0,0 +1,15 @@
{
  "chunk_length": 30,
  "dither": 0.0,
  "feature_extractor_type": "WhisperFeatureExtractor",
  "feature_size": 80,
  "hop_length": 160,
  "n_fft": 400,
  "n_samples": 480000,
  "nb_max_frames": 3000,
  "padding_side": "right",
  "padding_value": 0.0,
  "processor_class": "WhisperProcessor",
  "return_attention_mask": false,
  "sampling_rate": 16000
}
139
special_tokens_map.json
Normal file
@@ -0,0 +1,139 @@
{
  "additional_special_tokens": [
    "<|endoftext|>",
    "<|startoftranscript|>",
    "<|en|>",
    "<|zh|>",
    "<|de|>",
    "<|es|>",
    "<|ru|>",
    "<|ko|>",
    "<|fr|>",
    "<|ja|>",
    "<|pt|>",
    "<|tr|>",
    "<|pl|>",
    "<|ca|>",
    "<|nl|>",
    "<|ar|>",
    "<|sv|>",
    "<|it|>",
    "<|id|>",
    "<|hi|>",
    "<|fi|>",
    "<|vi|>",
    "<|he|>",
    "<|uk|>",
    "<|el|>",
    "<|ms|>",
    "<|cs|>",
    "<|ro|>",
    "<|da|>",
    "<|hu|>",
    "<|ta|>",
    "<|no|>",
    "<|th|>",
    "<|ur|>",
    "<|hr|>",
    "<|bg|>",
    "<|lt|>",
    "<|la|>",
    "<|mi|>",
    "<|ml|>",
    "<|cy|>",
    "<|sk|>",
    "<|te|>",
    "<|fa|>",
    "<|lv|>",
    "<|bn|>",
    "<|sr|>",
    "<|az|>",
    "<|sl|>",
    "<|kn|>",
    "<|et|>",
    "<|mk|>",
    "<|br|>",
    "<|eu|>",
    "<|is|>",
    "<|hy|>",
    "<|ne|>",
    "<|mn|>",
    "<|bs|>",
    "<|kk|>",
    "<|sq|>",
    "<|sw|>",
    "<|gl|>",
    "<|mr|>",
    "<|pa|>",
    "<|si|>",
    "<|km|>",
    "<|sn|>",
    "<|yo|>",
    "<|so|>",
    "<|af|>",
    "<|oc|>",
    "<|ka|>",
    "<|be|>",
    "<|tg|>",
    "<|sd|>",
    "<|gu|>",
    "<|am|>",
    "<|yi|>",
    "<|lo|>",
    "<|uz|>",
    "<|fo|>",
    "<|ht|>",
    "<|ps|>",
    "<|tk|>",
    "<|nn|>",
    "<|mt|>",
    "<|sa|>",
    "<|lb|>",
    "<|my|>",
    "<|bo|>",
    "<|tl|>",
    "<|mg|>",
    "<|as|>",
    "<|tt|>",
    "<|haw|>",
    "<|ln|>",
    "<|ha|>",
    "<|ba|>",
    "<|jw|>",
    "<|su|>",
    "<|translate|>",
    "<|transcribe|>",
    "<|startoflm|>",
    "<|startofprev|>",
    "<|nocaptions|>",
    "<|notimestamps|>"
  ],
  "bos_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
12990
tokenizer_config.json
Normal file
File diff suppressed because it is too large
BIN
training_args.bin
Normal file
Binary file not shown.
50260
vocab.json
Normal file
File diff suppressed because it is too large