Initialize project; model provided by the ModelHub XC community

Model: KasuleTrevor/cdli-whisper-ml-eng-lug-full-a40-5e-5
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-09 01:04:12 +08:00
commit 6076bb9561
35 changed files with 238670 additions and 0 deletions

35
.gitattributes vendored Normal file

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

106
README.md Normal file

@@ -0,0 +1,106 @@
---
library_name: transformers
license: mit
base_model: Sunbird/asr-whisper-large-v3-salt
tags:
- generated_from_trainer
metrics:
- wer
model-index:
- name: cdli-whisper-ml-eng-lug-full-a40-5e-5
results: []
datasets:
- cdli/ugandan_luganda_nonstandard_speech_v1.0
- cdli/ugandan_english_nonstandard_speech_v1.0
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# cdli-whisper-ml-eng-lug-full-a40-5e-5
This multilingual model is a fine-tuned version of [Sunbird/asr-whisper-large-v3-salt](https://huggingface.co/Sunbird/asr-whisper-large-v3-salt), trained on the Ugandan CDLI atypical speech datasets for English and Luganda.
It achieves the following results on the evaluation set:
- Loss: 1.2283
- Wer: 0.4137
- Cer: 0.2271
On the test set, decoding with `repetition_penalty=1.3` and `no_repeat_ngram_size=2`, it obtains:
- test_cer = 0.1268
- test_loss = 0.8137
- test_runtime = 0:22:24.23
- test_samples_per_second = 1.518
- test_steps_per_second = 0.379
- test_wer = 0.2851
## English
- Overall WER (normalized): 0.224
- Overall CER (normalized): 0.135
- Avg WER (normalized): 0.214
- Avg CER (normalized): 0.133
## Luganda
- Overall WER (normalized): 0.414
- Overall CER (normalized): 0.146
- Avg WER (normalized): 0.354
- Avg CER (normalized): 0.120
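
The per-language lists report both an "Overall" and an "Avg" error rate. A minimal sketch of the difference, assuming "Overall" means corpus-level (total errors over total reference words) and "Avg" means the mean of per-utterance rates, which is how `eval_metrics.json` separates `wer` from `utterance_avg_wer`. The text normalization step is not documented in this card and is omitted here; the sentence pairs are hypothetical and are not taken from `predictions.tsv`.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def corpus_wer(pairs):
    """Total word errors divided by total reference words."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in pairs)
    words = sum(len(r.split()) for r, _ in pairs)
    return errors / words


def utterance_avg_wer(pairs):
    """Mean of per-utterance WER, so every utterance counts equally."""
    rates = [edit_distance(r.split(), h.split()) / len(r.split()) for r, h in pairs]
    return sum(rates) / len(rates)


# Hypothetical (reference, hypothesis) pairs for illustration only.
pairs = [
    ("good morning", "good morning"),
    ("the weather is hot today", "the weather hot today"),
]

print(f"corpus WER:        {corpus_wer(pairs):.4f}")
print(f"utterance-avg WER: {utterance_avg_wer(pairs):.4f}")
```

The two figures diverge whenever errors are concentrated in long or short utterances, which may explain why the Luganda "Overall" and "Avg" WER differ more than the English ones. CER follows the same pattern with `list(ref)` and `list(hyp)` in place of `split()`.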
## Model description
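Training was resumed from the checkpoint at step 2500 (epoch 7.2255). In the raw Trainer logs, evaluation metrics recorded after the resume appear in a different column order (validation loss, WER, CER), so post-resume values should be matched to the final evaluation figures: loss 1.2283, WER 0.4137, CER 0.2271.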
## Intended uses & limitations
More information needed
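
A minimal inference sketch, assuming the checkpoint loads with the standard `transformers` Whisper classes named in `config.json` and `preprocessor_config.json`. The decoding settings are the ones recorded under `decode_kwargs` in `eval_metrics.json`. The `transcribe` helper is hypothetical, and language selection is left to the model defaults because this card does not document it.

```python
MODEL_ID = "KasuleTrevor/cdli-whisper-ml-eng-lug-full-a40-5e-5"

# Decoding settings reported with the test results (greedy decoding).
DECODE_KWARGS = {
    "max_length": 448,
    "num_beams": 1,
    "do_sample": False,
    "repetition_penalty": 1.3,
    "no_repeat_ngram_size": 2,
}


def transcribe(audio, sampling_rate=16000, model_id=MODEL_ID):
    """Transcribe one mono waveform (1-D float array) of at most 30 seconds.

    Hypothetical helper: it reloads the model on every call for brevity.
    """
    # Imported lazily so the settings above can be inspected without
    # installing the heavy dependencies.
    import torch
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(model_id)
    model = WhisperForConditionalGeneration.from_pretrained(model_id)
    model.eval()

    # The feature extractor expects 16 kHz audio and pads or trims to 30 s.
    inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        ids = model.generate(inputs.input_features, **DECODE_KWARGS)
    return processor.batch_decode(ids, skip_special_tokens=True)[0]
```

In practice the processor and model would be loaded once and reused across utterances; audio recorded at another rate needs resampling to 16 kHz first.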
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- optimizer: AdamW (`OptimizerNames.ADAMW_TORCH_FUSED`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 150
- training_steps: 4000
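
The batch-size and schedule figures above can be checked with a little arithmetic. This is a sketch under two assumptions: a single device (implied by 4 × 8 = 32), and a linear schedule that ramps from zero over the warmup steps and then decays linearly to zero at the last step. The function names are illustrative and do not come from the training script.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Samples contributing to one optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_lr(step, base_lr=5e-5, warmup_steps=150, total_steps=4000):
    """Learning rate at an optimizer step for linear warmup then linear decay."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


print(effective_batch_size(4, 8))  # matches total_train_batch_size: 32
for step in (0, 75, 150, 2500, 4000):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions the rate peaks at 5e-05 at step 150 and has decayed to about 1.95e-05 by step 2500, the step from which training was resumed.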
### Training results
| Training Loss | Epoch | Step | Cer | Validation Loss | Wer |
|:-------------:|:-------:|:----:|:------:|:---------------:|:------:|
| 0.6253 | 0.7228 | 250 | 0.2722 | 0.8156 | 0.4660 |
| 0.4188 | 1.4452 | 500 | 0.2362 | 0.8119 | 0.4247 |
| 0.2709 | 2.1677 | 750 | 0.2352 | 0.8229 | 0.4206 |
| 0.2571 | 2.8905 | 1000 | 0.2261 | 0.8141 | 0.4153 |
| 0.1581 | 3.6129 | 1250 | 0.2292 | 0.9097 | 0.4167 |
| 0.083 | 4.3354 | 1500 | 0.2271 | 0.9749 | 0.4177 |
| 0.0593 | 5.0578 | 1750 | 0.2266 | 1.0613 | 0.4107 |
| 0.0518 | 5.7806 | 2000 | 0.2235 | 1.0547 | 0.4108 |
| 0.0382 | 6.5031 | 2250 | 0.2249 | 1.1098 | 0.4095 |
| 0.0356 | 7.2255 | 2500 | 0.2238 | 1.1149 | 0.4087 |
| 0.0408 | 7.9483 | 2750 | 0.2261 | 1.1168 | 0.4139 |
| 0.0368 | 8.6737 | 3000 | 0.2279 | 1.1499 | 0.4172 |
| 0.0271 | 9.3961 | 3250 | 0.2271 | 1.2052 | 0.4132 |
| 0.0237 | 10.1185 | 3500 | 0.2263 | 1.2107 | 0.4114 |
| 0.0212 | 10.8413 | 3750 | 0.2250 | 1.2275 | 0.4111 |
| 0.0221 | 11.5638 | 4000 | 0.2271 | 1.2283 | 0.4137 |
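
To read the trend in the table, the sketch below picks the evaluation steps with the lowest validation loss and the lowest WER. The values are transcribed from the table; for steps 2750 to 4000 the validation loss and WER are matched to the final evaluation figures (loss 1.2283, WER 0.4137), which is an assumption about the column order after the resume.

```python
# (step, validation loss, WER), transcribed from the training results table.
ROWS = [
    (250, 0.8156, 0.4660), (500, 0.8119, 0.4247), (750, 0.8229, 0.4206),
    (1000, 0.8141, 0.4153), (1250, 0.9097, 0.4167), (1500, 0.9749, 0.4177),
    (1750, 1.0613, 0.4107), (2000, 1.0547, 0.4108), (2250, 1.1098, 0.4095),
    (2500, 1.1149, 0.4087), (2750, 1.1168, 0.4139), (3000, 1.1499, 0.4172),
    (3250, 1.2052, 0.4132), (3500, 1.2107, 0.4114), (3750, 1.2275, 0.4111),
    (4000, 1.2283, 0.4137),
]

best_loss = min(ROWS, key=lambda row: row[1])
best_wer = min(ROWS, key=lambda row: row[2])

print(f"lowest validation loss: {best_loss[1]} at step {best_loss[0]}")
print(f"lowest WER:             {best_wer[2]} at step {best_wer[0]}")
```

Read this way, validation loss bottoms out early (step 500) while WER keeps improving slightly until step 2500 and then plateaus, a pattern often associated with overfitting on the loss while greedy decoding quality stays roughly flat.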
### Framework versions
- Transformers 4.52.0
- Pytorch 2.7.1+cu118
- Datasets 3.6.0
- Tokenizers 0.21.4

1611
added_tokens.json Normal file

File diff suppressed because it is too large

14
all_results.json Normal file

@@ -0,0 +1,14 @@
{
"epoch": 11.563787495482472,
"test_cer": 0.12683188844935386,
"test_loss": 0.8137470483779907,
"test_runtime": 1344.2304,
"test_samples_per_second": 1.518,
"test_steps_per_second": 0.379,
"test_wer": 0.28512065990166696,
"total_flos": 4.347268958158848e+20,
"train_loss": 0.010958212658762932,
"train_runtime": 14886.6676,
"train_samples_per_second": 8.598,
"train_steps_per_second": 0.269
}

47
config.json Normal file

@@ -0,0 +1,47 @@
{
"activation_dropout": 0.0,
"activation_function": "gelu",
"apply_spec_augment": true,
"architectures": [
"WhisperForConditionalGeneration"
],
"attention_dropout": 0.0,
"begin_suppress_tokens": null,
"bos_token_id": 50257,
"classifier_proj_size": 256,
"d_model": 1280,
"decoder_attention_heads": 20,
"decoder_ffn_dim": 5120,
"decoder_layerdrop": 0.0,
"decoder_layers": 32,
"decoder_start_token_id": 50258,
"dropout": 0.0,
"encoder_attention_heads": 20,
"encoder_ffn_dim": 5120,
"encoder_layerdrop": 0.0,
"encoder_layers": 32,
"eos_token_id": 50257,
"forced_decoder_ids": null,
"init_std": 0.02,
"is_encoder_decoder": true,
"mask_feature_length": 10,
"mask_feature_min_masks": 2,
"mask_feature_prob": 0.05,
"mask_time_length": 10,
"mask_time_min_masks": 2,
"mask_time_prob": 0.05,
"max_length": null,
"max_source_positions": 1500,
"max_target_positions": 448,
"median_filter_width": 7,
"model_type": "whisper",
"num_hidden_layers": 32,
"num_mel_bins": 128,
"pad_token_id": 50256,
"scale_embedding": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.52.0",
"use_cache": false,
"use_weighted_layer_sum": false,
"vocab_size": 51866
}

89
eval_metrics.json Normal file

@@ -0,0 +1,89 @@
{
"trainer_test": {
"test_loss": 0.8137470483779907,
"test_wer": 0.28512065990166696,
"test_cer": 0.12683188844935386,
"test_runtime": 1344.2304,
"test_samples_per_second": 1.518,
"test_steps_per_second": 0.379,
"epoch": 11.563787495482472
},
"trainer_test_utterance_avg": {
"wer": 0.28512065990166696,
"cer": 0.12683188844935386,
"loss": 0.8137470483779907
},
"reference_style_overall": {
"wer": 0.3410804796637409,
"cer": 0.16023962034179612,
"utterance_avg_wer": 0.3189475550725116,
"utterance_avg_cer": 0.13851965646586686
},
"reference_style_overall_corpus": {
"wer": 0.3410804796637409,
"cer": 0.16023962034179612
},
"reference_style_overall_utterance_avg": {
"wer": 0.3189475550725116,
"cer": 0.13851965646586686
},
"reference_style_per_language": [
{
"language": "eng",
"n": 1013,
"wer": 0.24436512392716772,
"cer": 0.14721321419406014,
"utterance_avg_wer": 0.22532790099687414,
"utterance_avg_cer": 0.13886085242721155
},
{
"language": "lug",
"n": 1027,
"wer": 0.4965508348913674,
"cer": 0.17653755217607053,
"utterance_avg_wer": 0.4112909918579261,
"utterance_avg_cer": 0.13818311166660482
}
],
"reference_style_per_severity_lug": [
{
"severity_speech_impairment": "Mild (easily understood with minimal effort)",
"n": 365,
"wer": 0.4485981308411215,
"cer": 0.15233985165205663,
"utterance_avg_wer": 0.3566426067511471,
"utterance_avg_cer": 0.11296899355913631
},
{
"severity_speech_impairment": "Moderate (requires effort to understand)",
"n": 347,
"wer": 0.45241608262633715,
"cer": 0.1289565287784344,
"utterance_avg_wer": 0.4098227847004097,
"utterance_avg_cer": 0.11154220353941666
},
{
"severity_speech_impairment": "Severe (frequent breakdowns)",
"n": 315,
"wer": 0.6083977900552486,
"cer": 0.2694201026428796,
"utterance_avg_wer": 0.47623108216787063,
"utterance_avg_cer": 0.19674675683917717
}
],
"old_luganda_test": {
"n": 1028,
"wer": 0.6278914293354727,
"cer": 0.21834629171755687,
"utterance_avg_wer": 0.5785237813292773,
"utterance_avg_cer": 0.18847034300324522,
"note": "Uncleaned HF Luganda test split with the same <=30s and high-confidence filters."
},
"decode_kwargs": {
"max_length": 448,
"num_beams": 1,
"do_sample": false,
"repetition_penalty": 1.3,
"no_repeat_ngram_size": 2
}
}

46
experiment_summary.json Normal file

@@ -0,0 +1,46 @@
{
"experiment_id": "ML-ENG-LUG-FULL-A40-5e-5",
"base_model": "Sunbird/asr-whisper-large-v3-salt",
"languages": [
"lug",
"eng"
],
"training_method": "full",
"lora_r": 16,
"lora_alpha": 32,
"lora_dropout": 0.15,
"learning_rate": 5e-05,
"optimizer": "adamw_torch_fused",
"effective_batch_size": 32,
"train_metrics": {
"train_runtime": 14886.6676,
"train_samples_per_second": 8.598,
"train_steps_per_second": 0.269,
"total_flos": 4.347268958158848e+20,
"train_loss": 0.010958212658762932,
"epoch": 11.563787495482472
},
"test_metrics": {
"test_loss": 0.8137470483779907,
"test_wer": 0.28512065990166696,
"test_cer": 0.12683188844935386,
"test_runtime": 1344.2304,
"test_samples_per_second": 1.518,
"test_steps_per_second": 0.379,
"epoch": 11.563787495482472
},
"trainer_test_utterance_avg_metrics": {
"wer": 0.28512065990166696,
"cer": 0.12683188844935386
},
"final_corpus_metrics": {
"wer": 0.3410804796637409,
"cer": 0.16023962034179612
},
"final_utterance_avg_metrics": {
"wer": 0.3189475550725116,
"cer": 0.13851965646586686
},
"resume_from_checkpoint": "results/finetuned/ML-ENG-LUG-FULL-A40-5e-5/checkpoint-2500",
"hub_repo_id": "KasuleTrevor/cdli-whisper-ml-eng-lug-full-a40-5e-5"
}

168
generation_config.json Normal file

@@ -0,0 +1,168 @@
{
"alignment_heads": [
[
7,
0
],
[
10,
17
],
[
12,
18
],
[
13,
12
],
[
16,
1
],
[
17,
14
],
[
19,
11
],
[
21,
4
],
[
24,
1
],
[
25,
6
]
],
"begin_suppress_tokens": [
220,
50257
],
"bos_token_id": 50257,
"decoder_start_token_id": 50258,
"eos_token_id": 50257,
"is_multilingual": true,
"lang_to_id": {
"<|af|>": 50327,
"<|am|>": 50334,
"<|ar|>": 50272,
"<|as|>": 50350,
"<|az|>": 50304,
"<|ba|>": 50355,
"<|be|>": 50330,
"<|bg|>": 50292,
"<|bn|>": 50302,
"<|bo|>": 50347,
"<|br|>": 50309,
"<|bs|>": 50315,
"<|ca|>": 50270,
"<|cs|>": 50283,
"<|cy|>": 50297,
"<|da|>": 50285,
"<|de|>": 50261,
"<|el|>": 50281,
"<|en|>": 50259,
"<|es|>": 50262,
"<|et|>": 50307,
"<|eu|>": 50310,
"<|fa|>": 50300,
"<|fi|>": 50277,
"<|fo|>": 50338,
"<|fr|>": 50265,
"<|gl|>": 50319,
"<|gu|>": 50333,
"<|haw|>": 50352,
"<|ha|>": 50354,
"<|he|>": 50279,
"<|hi|>": 50276,
"<|hr|>": 50291,
"<|ht|>": 50339,
"<|hu|>": 50286,
"<|hy|>": 50312,
"<|id|>": 50275,
"<|is|>": 50311,
"<|it|>": 50274,
"<|ja|>": 50266,
"<|jw|>": 50356,
"<|ka|>": 50329,
"<|kk|>": 50316,
"<|km|>": 50323,
"<|kn|>": 50306,
"<|ko|>": 50264,
"<|la|>": 50294,
"<|lb|>": 50345,
"<|ln|>": 50353,
"<|lo|>": 50336,
"<|lt|>": 50293,
"<|lv|>": 50301,
"<|mg|>": 50349,
"<|mi|>": 50295,
"<|mk|>": 50308,
"<|ml|>": 50296,
"<|mn|>": 50314,
"<|mr|>": 50320,
"<|ms|>": 50282,
"<|mt|>": 50343,
"<|my|>": 50346,
"<|ne|>": 50313,
"<|nl|>": 50271,
"<|nn|>": 50342,
"<|no|>": 50288,
"<|oc|>": 50328,
"<|pa|>": 50321,
"<|pl|>": 50269,
"<|ps|>": 50340,
"<|pt|>": 50267,
"<|ro|>": 50284,
"<|ru|>": 50263,
"<|sa|>": 50344,
"<|sd|>": 50332,
"<|si|>": 50322,
"<|sk|>": 50298,
"<|sl|>": 50305,
"<|sn|>": 50324,
"<|so|>": 50326,
"<|sq|>": 50317,
"<|sr|>": 50303,
"<|su|>": 50357,
"<|sv|>": 50273,
"<|sw|>": 50318,
"<|ta|>": 50287,
"<|te|>": 50299,
"<|tg|>": 50331,
"<|th|>": 50289,
"<|tk|>": 50341,
"<|tl|>": 50348,
"<|tr|>": 50268,
"<|tt|>": 50351,
"<|uk|>": 50280,
"<|ur|>": 50290,
"<|uz|>": 50337,
"<|vi|>": 50278,
"<|yi|>": 50335,
"<|yo|>": 50325,
"<|yue|>": 50358,
"<|zh|>": 50260
},
"language": null,
"max_initial_timestamp_index": 50,
"max_length": 448,
"no_timestamps_token_id": 50364,
"pad_token_id": 50257,
"prev_sot_token_id": 50362,
"return_timestamps": false,
"suppress_tokens": [],
"task": null,
"task_to_id": {
"transcribe": 50360,
"translate": 50359
},
"transformers_version": "4.52.0"
}

File diff suppressed because it is too large


@@ -0,0 +1,47 @@
{
"activation_dropout": 0.0,
"activation_function": "gelu",
"apply_spec_augment": true,
"architectures": [
"WhisperForConditionalGeneration"
],
"attention_dropout": 0.0,
"begin_suppress_tokens": null,
"bos_token_id": 50257,
"classifier_proj_size": 256,
"d_model": 1280,
"decoder_attention_heads": 20,
"decoder_ffn_dim": 5120,
"decoder_layerdrop": 0.0,
"decoder_layers": 32,
"decoder_start_token_id": 50258,
"dropout": 0.0,
"encoder_attention_heads": 20,
"encoder_ffn_dim": 5120,
"encoder_layerdrop": 0.0,
"encoder_layers": 32,
"eos_token_id": 50257,
"forced_decoder_ids": null,
"init_std": 0.02,
"is_encoder_decoder": true,
"mask_feature_length": 10,
"mask_feature_min_masks": 2,
"mask_feature_prob": 0.05,
"mask_time_length": 10,
"mask_time_min_masks": 2,
"mask_time_prob": 0.05,
"max_length": null,
"max_source_positions": 1500,
"max_target_positions": 448,
"median_filter_width": 7,
"model_type": "whisper",
"num_hidden_layers": 32,
"num_mel_bins": 128,
"pad_token_id": 50256,
"scale_embedding": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.52.0",
"use_cache": false,
"use_weighted_layer_sum": false,
"vocab_size": 51866
}


@@ -0,0 +1,168 @@
{
"alignment_heads": [
[
7,
0
],
[
10,
17
],
[
12,
18
],
[
13,
12
],
[
16,
1
],
[
17,
14
],
[
19,
11
],
[
21,
4
],
[
24,
1
],
[
25,
6
]
],
"begin_suppress_tokens": [
220,
50257
],
"bos_token_id": 50257,
"decoder_start_token_id": 50258,
"eos_token_id": 50257,
"is_multilingual": true,
"lang_to_id": {
"<|af|>": 50327,
"<|am|>": 50334,
"<|ar|>": 50272,
"<|as|>": 50350,
"<|az|>": 50304,
"<|ba|>": 50355,
"<|be|>": 50330,
"<|bg|>": 50292,
"<|bn|>": 50302,
"<|bo|>": 50347,
"<|br|>": 50309,
"<|bs|>": 50315,
"<|ca|>": 50270,
"<|cs|>": 50283,
"<|cy|>": 50297,
"<|da|>": 50285,
"<|de|>": 50261,
"<|el|>": 50281,
"<|en|>": 50259,
"<|es|>": 50262,
"<|et|>": 50307,
"<|eu|>": 50310,
"<|fa|>": 50300,
"<|fi|>": 50277,
"<|fo|>": 50338,
"<|fr|>": 50265,
"<|gl|>": 50319,
"<|gu|>": 50333,
"<|haw|>": 50352,
"<|ha|>": 50354,
"<|he|>": 50279,
"<|hi|>": 50276,
"<|hr|>": 50291,
"<|ht|>": 50339,
"<|hu|>": 50286,
"<|hy|>": 50312,
"<|id|>": 50275,
"<|is|>": 50311,
"<|it|>": 50274,
"<|ja|>": 50266,
"<|jw|>": 50356,
"<|ka|>": 50329,
"<|kk|>": 50316,
"<|km|>": 50323,
"<|kn|>": 50306,
"<|ko|>": 50264,
"<|la|>": 50294,
"<|lb|>": 50345,
"<|ln|>": 50353,
"<|lo|>": 50336,
"<|lt|>": 50293,
"<|lv|>": 50301,
"<|mg|>": 50349,
"<|mi|>": 50295,
"<|mk|>": 50308,
"<|ml|>": 50296,
"<|mn|>": 50314,
"<|mr|>": 50320,
"<|ms|>": 50282,
"<|mt|>": 50343,
"<|my|>": 50346,
"<|ne|>": 50313,
"<|nl|>": 50271,
"<|nn|>": 50342,
"<|no|>": 50288,
"<|oc|>": 50328,
"<|pa|>": 50321,
"<|pl|>": 50269,
"<|ps|>": 50340,
"<|pt|>": 50267,
"<|ro|>": 50284,
"<|ru|>": 50263,
"<|sa|>": 50344,
"<|sd|>": 50332,
"<|si|>": 50322,
"<|sk|>": 50298,
"<|sl|>": 50305,
"<|sn|>": 50324,
"<|so|>": 50326,
"<|sq|>": 50317,
"<|sr|>": 50303,
"<|su|>": 50357,
"<|sv|>": 50273,
"<|sw|>": 50318,
"<|ta|>": 50287,
"<|te|>": 50299,
"<|tg|>": 50331,
"<|th|>": 50289,
"<|tk|>": 50341,
"<|tl|>": 50348,
"<|tr|>": 50268,
"<|tt|>": 50351,
"<|uk|>": 50280,
"<|ur|>": 50290,
"<|uz|>": 50337,
"<|vi|>": 50278,
"<|yi|>": 50335,
"<|yo|>": 50325,
"<|yue|>": 50358,
"<|zh|>": 50260
},
"language": null,
"max_initial_timestamp_index": 50,
"max_length": 448,
"no_timestamps_token_id": 50364,
"pad_token_id": 50257,
"prev_sot_token_id": 50362,
"return_timestamps": false,
"suppress_tokens": [],
"task": "transcribe",
"task_to_id": {
"transcribe": 50360,
"translate": 50359
},
"transformers_version": "4.52.0"
}

50001
last-checkpoint/merges.txt Normal file

File diff suppressed because it is too large


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d86b51a28ec1c55503d1ccc559b47d4c73f9028dc82540618c1f7fa947097120
size 3087132232

File diff suppressed because it is too large


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:24630d6bdf5f12f6f5567dff03a1da5ae29b38fb50e030f0aa9e8476cba53b98
size 6175072292


@@ -0,0 +1,15 @@
{
"chunk_length": 30,
"dither": 0.0,
"feature_extractor_type": "WhisperFeatureExtractor",
"feature_size": 128,
"hop_length": 160,
"n_fft": 400,
"n_samples": 480000,
"nb_max_frames": 3000,
"padding_side": "right",
"padding_value": 0.0,
"processor_class": "WhisperProcessor",
"return_attention_mask": false,
"sampling_rate": 16000
}


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f020f2c25dd3371d3f47f373cf66bcee3219d79c64bdfa46d46cf8f4fa92d182
size 14709


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a055373b6616927821990112c9bcc54deefdc796454a5ecd2e7b2953f9568926
size 1465


@@ -0,0 +1,139 @@
{
"additional_special_tokens": [
"<|startoftranscript|>",
"<|en|>",
"<|zh|>",
"<|de|>",
"<|es|>",
"<|ru|>",
"<|ko|>",
"<|fr|>",
"<|ja|>",
"<|pt|>",
"<|tr|>",
"<|pl|>",
"<|ca|>",
"<|nl|>",
"<|ar|>",
"<|sv|>",
"<|it|>",
"<|id|>",
"<|hi|>",
"<|fi|>",
"<|vi|>",
"<|he|>",
"<|uk|>",
"<|el|>",
"<|ms|>",
"<|cs|>",
"<|ro|>",
"<|da|>",
"<|hu|>",
"<|ta|>",
"<|no|>",
"<|th|>",
"<|ur|>",
"<|hr|>",
"<|bg|>",
"<|lt|>",
"<|la|>",
"<|mi|>",
"<|ml|>",
"<|cy|>",
"<|sk|>",
"<|te|>",
"<|fa|>",
"<|lv|>",
"<|bn|>",
"<|sr|>",
"<|az|>",
"<|sl|>",
"<|kn|>",
"<|et|>",
"<|mk|>",
"<|br|>",
"<|eu|>",
"<|is|>",
"<|hy|>",
"<|ne|>",
"<|mn|>",
"<|bs|>",
"<|kk|>",
"<|sq|>",
"<|sw|>",
"<|gl|>",
"<|mr|>",
"<|pa|>",
"<|si|>",
"<|km|>",
"<|sn|>",
"<|yo|>",
"<|so|>",
"<|af|>",
"<|oc|>",
"<|ka|>",
"<|be|>",
"<|tg|>",
"<|sd|>",
"<|gu|>",
"<|am|>",
"<|yi|>",
"<|lo|>",
"<|uz|>",
"<|fo|>",
"<|ht|>",
"<|ps|>",
"<|tk|>",
"<|nn|>",
"<|mt|>",
"<|sa|>",
"<|lb|>",
"<|my|>",
"<|bo|>",
"<|tl|>",
"<|mg|>",
"<|as|>",
"<|tt|>",
"<|haw|>",
"<|ln|>",
"<|ha|>",
"<|ba|>",
"<|jw|>",
"<|su|>",
"<|yue|>",
"<|translate|>",
"<|transcribe|>",
"<|startoflm|>",
"<|startofprev|>",
"<|nospeech|>",
"<|notimestamps|>"
],
"bos_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

File diff suppressed because it is too large

File diff suppressed because it is too large


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e9ae1e14cb5254830c9ba92acab84551412f3fc17a4b6d27f67573489c81cef9
size 6033

50259
last-checkpoint/vocab.json Normal file

File diff suppressed because it is too large

50001
merges.txt Normal file

File diff suppressed because it is too large

3
model.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:81439ab334a61b8ea201d0fab1cdaebaac35dc3fdbf6e915d6a4061a5dbcbbaf
size 3087132232

1742
normalizer.json Normal file

File diff suppressed because it is too large

File diff suppressed because it is too large

2041
predictions.tsv Normal file

File diff suppressed because it is too large

15
preprocessor_config.json Normal file

@@ -0,0 +1,15 @@
{
"chunk_length": 30,
"dither": 0.0,
"feature_extractor_type": "WhisperFeatureExtractor",
"feature_size": 128,
"hop_length": 160,
"n_fft": 400,
"n_samples": 480000,
"nb_max_frames": 3000,
"padding_side": "right",
"padding_value": 0.0,
"processor_class": "WhisperProcessor",
"return_attention_mask": false,
"sampling_rate": 16000
}

139
special_tokens_map.json Normal file

@@ -0,0 +1,139 @@
{
"additional_special_tokens": [
"<|startoftranscript|>",
"<|en|>",
"<|zh|>",
"<|de|>",
"<|es|>",
"<|ru|>",
"<|ko|>",
"<|fr|>",
"<|ja|>",
"<|pt|>",
"<|tr|>",
"<|pl|>",
"<|ca|>",
"<|nl|>",
"<|ar|>",
"<|sv|>",
"<|it|>",
"<|id|>",
"<|hi|>",
"<|fi|>",
"<|vi|>",
"<|he|>",
"<|uk|>",
"<|el|>",
"<|ms|>",
"<|cs|>",
"<|ro|>",
"<|da|>",
"<|hu|>",
"<|ta|>",
"<|no|>",
"<|th|>",
"<|ur|>",
"<|hr|>",
"<|bg|>",
"<|lt|>",
"<|la|>",
"<|mi|>",
"<|ml|>",
"<|cy|>",
"<|sk|>",
"<|te|>",
"<|fa|>",
"<|lv|>",
"<|bn|>",
"<|sr|>",
"<|az|>",
"<|sl|>",
"<|kn|>",
"<|et|>",
"<|mk|>",
"<|br|>",
"<|eu|>",
"<|is|>",
"<|hy|>",
"<|ne|>",
"<|mn|>",
"<|bs|>",
"<|kk|>",
"<|sq|>",
"<|sw|>",
"<|gl|>",
"<|mr|>",
"<|pa|>",
"<|si|>",
"<|km|>",
"<|sn|>",
"<|yo|>",
"<|so|>",
"<|af|>",
"<|oc|>",
"<|ka|>",
"<|be|>",
"<|tg|>",
"<|sd|>",
"<|gu|>",
"<|am|>",
"<|yi|>",
"<|lo|>",
"<|uz|>",
"<|fo|>",
"<|ht|>",
"<|ps|>",
"<|tk|>",
"<|nn|>",
"<|mt|>",
"<|sa|>",
"<|lb|>",
"<|my|>",
"<|bo|>",
"<|tl|>",
"<|mg|>",
"<|as|>",
"<|tt|>",
"<|haw|>",
"<|ln|>",
"<|ha|>",
"<|ba|>",
"<|jw|>",
"<|su|>",
"<|yue|>",
"<|translate|>",
"<|transcribe|>",
"<|startoflm|>",
"<|startofprev|>",
"<|nospeech|>",
"<|notimestamps|>"
],
"bos_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

9
test_results.json Normal file

@@ -0,0 +1,9 @@
{
"epoch": 11.563787495482472,
"test_cer": 0.12683188844935386,
"test_loss": 0.8137470483779907,
"test_runtime": 1344.2304,
"test_samples_per_second": 1.518,
"test_steps_per_second": 0.379,
"test_wer": 0.28512065990166696
}

12997
tokenizer_config.json Normal file

File diff suppressed because it is too large

8
train_results.json Normal file

@@ -0,0 +1,8 @@
{
"epoch": 11.563787495482472,
"total_flos": 4.347268958158848e+20,
"train_loss": 0.010958212658762932,
"train_runtime": 14886.6676,
"train_samples_per_second": 8.598,
"train_steps_per_second": 0.269
}

3
training_args.bin Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e9ae1e14cb5254830c9ba92acab84551412f3fc17a4b6d27f67573489c81cef9
size 6033

50259
vocab.json Normal file

File diff suppressed because it is too large