Initialize project; model provided by the ModelHub XC community

Model: KasuleTrevor/cdli-whisper-ml-eng-lug-full-a40
Source: Original Platform
ModelHub XC
2026-05-12 11:00:36 +08:00
commit 48f2c22c27
35 changed files with 238141 additions and 0 deletions

35
.gitattributes vendored Normal file

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
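The patterns above route matching files through Git LFS instead of storing them directly in Git. As a rough sketch of how they apply to files in this commit, the snippet below checks basenames against a subset of the patterns. It uses Python's `fnmatch`, which only approximates gitattributes matching (for example, `saved_model/**/*` is not handled), and `is_lfs_tracked` is a hypothetical helper, not part of Git or Git LFS.

```python
from fnmatch import fnmatch

# A subset of the patterns listed in .gitattributes above.
LFS_PATTERNS = ["*.bin", "*.pt", "*.pth", "*.safetensors", "*.tar.*", "*tfevents*"]


def is_lfs_tracked(path):
    """Approximate check: match the file's basename against the LFS patterns."""
    name = path.rsplit("/", 1)[-1]
    return any(fnmatch(name, pattern) for pattern in LFS_PATTERNS)


for path in ["model.safetensors", "config.json", "last-checkpoint/merges.txt"]:
    print(path, is_lfs_tracked(path))
```

Under this approximation only `model.safetensors` matches, which is consistent with it appearing in the diff as a three-line LFS pointer rather than as file content.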

83
README.md Normal file

@@ -0,0 +1,83 @@
---
library_name: transformers
license: mit
base_model: Sunbird/asr-whisper-large-v3-salt
tags:
- generated_from_trainer
metrics:
- wer
model-index:
- name: cdli-whisper-ml-eng-lug-full-a40
results: []
datasets:
- cdli/ugandan_luganda_nonstandard_speech_v1.0
- cdli/ugandan_english_nonstandard_speech_v1.0
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# cdli-whisper-ml-eng-lug-full-a40
This model is a fine-tuned version of [Sunbird/asr-whisper-large-v3-salt](https://huggingface.co/Sunbird/asr-whisper-large-v3-salt) on the cdli/ugandan_luganda_nonstandard_speech_v1.0 and cdli/ugandan_english_nonstandard_speech_v1.0 datasets.
It achieves the following results on the evaluation set (final step, 2500):
- Loss: 0.7960
- Wer: 0.4932
- Cer: 0.3124

It achieves the following results on the test set:
- test_loss: 0.5511
- test_wer: 0.3478
- test_cer: 0.1888
- test_runtime: 0:30:54.57
- test_samples_per_second: 1.1
- test_steps_per_second: 0.275
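The test runtime and throughput figures above can be cross-checked against the raw values in `all_results.json`. The sketch below converts the raw runtime (1854.5701 s) to the `h:mm:ss` form and estimates the test set size. It assumes the throughput figures are rounded and that the test pass used the `eval_batch_size` of 4 listed under the training hyperparameters.

```python
test_runtime_s = 1854.5701       # "test_runtime" in all_results.json
samples_per_second = 1.1         # rounded value reported above
steps_per_second = 0.275         # rounded value reported above

# Convert seconds to h:mm:ss.ss.
minutes, seconds = divmod(test_runtime_s, 60)
hours, minutes = divmod(int(minutes), 60)
formatted = f"{hours}:{minutes:02d}:{seconds:05.2f}"

# Estimate sample and step counts from the rounded throughput figures.
approx_samples = round(samples_per_second * test_runtime_s)
approx_steps = round(steps_per_second * test_runtime_s)

print(formatted, approx_samples, approx_steps)
```

The estimate of about 2040 samples agrees with the per-language counts in `eval_metrics.json` (1013 English + 1027 Luganda), and 2040 / 4 gives the 510 steps.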
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 150
- training_steps: 2500
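The learning-rate settings above (peak 1e-05, linear scheduler, 150 warmup steps, 2500 training steps) can be sketched as a small function. This is a minimal reimplementation for illustration, not the Trainer's own code. The function name is hypothetical, and the one-step offset used below is an assumption inferred from the `learning_rate` values logged in the trainer state's `log_history`.

```python
def linear_warmup_decay_lr(updates_done, peak_lr=1e-5, warmup_steps=150, total_steps=2500):
    """Linear warmup from 0 to peak_lr, then linear decay to 0 at total_steps."""
    if updates_done < warmup_steps:
        return peak_lr * updates_done / warmup_steps
    remaining = max(0, total_steps - updates_done)
    return peak_lr * remaining / (total_steps - warmup_steps)


# Assumption: the value logged at step s reflects s - 1 completed scheduler updates.
for logged_step in (25, 150, 175, 2500):
    print(logged_step, linear_warmup_decay_lr(logged_step - 1))
```

With that offset the sketch reproduces the logged values, for example about 1.6e-06 at step 25 and about 9.898e-06 at step 175. The effective batch size follows the same arithmetic: `train_batch_size` 2 times `gradient_accumulation_steps` 8 gives the `total_train_batch_size` of 16.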
### Training results
| Training Loss | Epoch | Step | Validation Loss | Wer | Cer |
|:-------------:|:------:|:----:|:---------------:|:------:|:------:|
| 0.7765 | 0.3615 | 250 | 0.8623 | 0.5111 | 0.3216 |
| 0.6632 | 0.7229 | 500 | 0.8226 | 0.5051 | 0.3236 |
| 0.5672 | 1.0839 | 750 | 0.8067 | 0.4862 | 0.3044 |
| 0.6058 | 1.4453 | 1000 | 0.7991 | 0.4949 | 0.3142 |
| 0.6589 | 1.8068 | 1250 | 0.7972 | 0.4901 | 0.3106 |
| 0.5959 | 2.1677 | 1500 | 0.7977 | 0.4926 | 0.3118 |
| 0.5402 | 2.5292 | 1750 | 0.7964 | 0.4926 | 0.3114 |
| 0.5934 | 2.8907 | 2000 | 0.7964 | 0.4921 | 0.3118 |
| 0.5464 | 3.2516 | 2250 | 0.7960 | 0.4931 | 0.3113 |
| 0.5497 | 3.6130 | 2500 | 0.7960 | 0.4932 | 0.3124 |
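The table shows that the lowest validation WER does not coincide with the lowest validation loss. A small sketch, using the rounded values copied from the rows above, picks the best step under each criterion:

```python
# (step, validation_loss, wer, cer) copied from the table above.
rows = [
    (250, 0.8623, 0.5111, 0.3216),
    (500, 0.8226, 0.5051, 0.3236),
    (750, 0.8067, 0.4862, 0.3044),
    (1000, 0.7991, 0.4949, 0.3142),
    (1250, 0.7972, 0.4901, 0.3106),
    (1500, 0.7977, 0.4926, 0.3118),
    (1750, 0.7964, 0.4926, 0.3114),
    (2000, 0.7964, 0.4921, 0.3118),
    (2250, 0.7960, 0.4931, 0.3113),
    (2500, 0.7960, 0.4932, 0.3124),
]

best_by_wer = min(rows, key=lambda row: row[2])   # lowest word error rate
best_by_loss = min(rows, key=lambda row: row[1])  # lowest validation loss (first on ties)

print("best by WER:", best_by_wer)
print("best by loss:", best_by_loss)
```

Selecting by WER gives step 750, which matches the `best_global_step` and `best_metric` recorded in the trainer state. The headline evaluation numbers in this card are those of the final step, 2500.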
### Framework versions
- Transformers 4.52.0
- Pytorch 2.7.1+cu118
- Datasets 3.6.0
- Tokenizers 0.21.4

1611
added_tokens.json Normal file

File diff suppressed because it is too large

14
all_results.json Normal file

@@ -0,0 +1,14 @@
{
"epoch": 3.613048978854148,
"test_cer": 0.18876529976505252,
"test_loss": 0.5511467456817627,
"test_runtime": 1854.5701,
"test_samples_per_second": 1.1,
"test_steps_per_second": 0.275,
"test_wer": 0.34776840202900167,
"total_flos": 1.3582858479796224e+20,
"train_loss": 0.6222768672943115,
"train_runtime": 23999.707,
"train_samples_per_second": 1.667,
"train_steps_per_second": 0.104
}
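The training throughput figures in this file are consistent with the step count and effective batch size stated in the README. A short sketch of that arithmetic, with the 2500 steps and batch size of 16 taken from the README hyperparameters:

```python
train_runtime_s = 23999.707      # "train_runtime" above
max_steps = 2500                 # training_steps in the README
total_train_batch_size = 16      # 2 per device x 8 accumulation steps

steps_per_second = max_steps / train_runtime_s
samples_per_second = steps_per_second * total_train_batch_size
train_hours = train_runtime_s / 3600

print(round(steps_per_second, 3), round(samples_per_second, 3), round(train_hours, 2))
```

Rounded to three decimals this gives 0.104 steps per second and 1.667 samples per second, matching the reported values, for roughly 6.67 hours of training.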

47
config.json Normal file

@@ -0,0 +1,47 @@
{
"activation_dropout": 0.0,
"activation_function": "gelu",
"apply_spec_augment": true,
"architectures": [
"WhisperForConditionalGeneration"
],
"attention_dropout": 0.0,
"begin_suppress_tokens": null,
"bos_token_id": 50257,
"classifier_proj_size": 256,
"d_model": 1280,
"decoder_attention_heads": 20,
"decoder_ffn_dim": 5120,
"decoder_layerdrop": 0.0,
"decoder_layers": 32,
"decoder_start_token_id": 50258,
"dropout": 0.0,
"encoder_attention_heads": 20,
"encoder_ffn_dim": 5120,
"encoder_layerdrop": 0.0,
"encoder_layers": 32,
"eos_token_id": 50257,
"forced_decoder_ids": null,
"init_std": 0.02,
"is_encoder_decoder": true,
"mask_feature_length": 10,
"mask_feature_min_masks": 2,
"mask_feature_prob": 0.05,
"mask_time_length": 10,
"mask_time_min_masks": 2,
"mask_time_prob": 0.05,
"max_length": null,
"max_source_positions": 1500,
"max_target_positions": 448,
"median_filter_width": 7,
"model_type": "whisper",
"num_hidden_layers": 32,
"num_mel_bins": 128,
"pad_token_id": 50256,
"scale_embedding": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.52.0",
"use_cache": false,
"use_weighted_layer_sum": false,
"vocab_size": 51866
}

76
eval_metrics.json Normal file

@@ -0,0 +1,76 @@
{
"trainer_test": {
"test_loss": 0.5511467456817627,
"test_wer": 0.34776840202900167,
"test_cer": 0.18876529976505252,
"test_runtime": 1854.5701,
"test_samples_per_second": 1.1,
"test_steps_per_second": 0.275,
"epoch": 3.613048978854148
},
"reference_style_overall": {
"wer": 0.371121275806651,
"cer": 0.17606607826695828,
"utterance_avg_wer": 0.3521848911073324,
"utterance_avg_cer": 0.15366339107087282
},
"reference_style_per_language": [
{
"language": "eng",
"n": 1013,
"wer": 0.26421753429052697,
"cer": 0.1609002325237773,
"utterance_avg_wer": 0.24615732101458704,
"utterance_avg_cer": 0.1519399161596408
},
{
"language": "lug",
"n": 1027,
"wer": 0.5429695055122171,
"cer": 0.19504149922645053,
"utterance_avg_wer": 0.4567670999719392,
"utterance_avg_cer": 0.15536337167951744
}
],
"reference_style_per_severity_lug": [
{
"severity_speech_impairment": "Mild (easily understood with minimal effort)",
"n": 365,
"wer": 0.4899352983465133,
"cer": 0.16129902357447268,
"utterance_avg_wer": 0.39321992638128483,
"utterance_avg_cer": 0.11936053558848729
},
{
"severity_speech_impairment": "Moderate (requires effort to understand)",
"n": 347,
"wer": 0.4974179269642198,
"cer": 0.15035889686437476,
"utterance_avg_wer": 0.45218892375959957,
"utterance_avg_cer": 0.13221225352708516
},
{
"severity_speech_impairment": "Severe (frequent breakdowns)",
"n": 315,
"wer": 0.6627624309392265,
"cer": 0.29653294228553473,
"utterance_avg_wer": 0.5354443872934337,
"utterance_avg_cer": 0.22258392143227942
}
],
"old_luganda_test": {
"n": 1028,
"wer": 0.6604492579221821,
"cer": 0.2353163136624068,
"utterance_avg_wer": 0.6130353642389026,
"utterance_avg_cer": 0.20360531726376055,
"note": "Uncleaned HF Luganda test split with the same <=30s and high-confidence filters."
},
"decode_kwargs": {
"max_length": 448,
"num_beams": 1,
"do_sample": false,
"repetition_penalty": 1.3,
"no_repeat_ngram_size": 2
}
}
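The file reports both `wer` and `utterance_avg_wer`. Judging by the names, the first is probably computed over all words pooled together and the second is the mean of per-utterance rates. That reading is an assumption, since the evaluation script is not part of this commit. The toy sketch below shows why the two can differ: short utterances weigh more in the utterance average.

```python
def word_edit_distance(ref, hyp):
    """Levenshtein distance between two word lists (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


# Made-up (reference, hypothesis) pairs, for illustration only.
pairs = [
    ("the cat sat on the mat", "the cat sat on the mat"),  # 0 errors over 6 words
    ("hello", "yellow"),                                    # 1 error over 1 word
]

errors = [word_edit_distance(ref.split(), hyp.split()) for ref, hyp in pairs]
lengths = [len(ref.split()) for ref, _ in pairs]

corpus_wer = sum(errors) / sum(lengths)
utterance_avg_wer = sum(e / n for e, n in zip(errors, lengths)) / len(pairs)

print(corpus_wer, utterance_avg_wer)
```

Here the pooled rate is 1/7 while the utterance average is 0.5. In the metrics above the pooled Luganda `wer` (0.543) is higher than its `utterance_avg_wer` (0.457), which under this reading would suggest that longer utterances carry proportionally more errors.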

33
experiment_summary.json Normal file

@@ -0,0 +1,33 @@
{
"experiment_id": "ML-ENG-LUG-FULL-A40",
"base_model": "Sunbird/asr-whisper-large-v3-salt",
"languages": [
"lug",
"eng"
],
"training_method": "full",
"lora_r": 16,
"lora_alpha": 32,
"lora_dropout": 0.15,
"learning_rate": 1e-05,
"optimizer": "adamw_torch_fused",
"effective_batch_size": 16,
"train_metrics": {
"train_runtime": 23999.707,
"train_samples_per_second": 1.667,
"train_steps_per_second": 0.104,
"total_flos": 1.3582858479796224e+20,
"train_loss": 0.6222768672943115,
"epoch": 3.613048978854148
},
"test_metrics": {
"test_loss": 0.5511467456817627,
"test_wer": 0.34776840202900167,
"test_cer": 0.18876529976505252,
"test_runtime": 1854.5701,
"test_samples_per_second": 1.1,
"test_steps_per_second": 0.275,
"epoch": 3.613048978854148
},
"hub_repo_id": "KasuleTrevor/cdli-whisper-ml-eng-lug-full-a40"
}

168
generation_config.json Normal file

@@ -0,0 +1,168 @@
{
"alignment_heads": [
[
7,
0
],
[
10,
17
],
[
12,
18
],
[
13,
12
],
[
16,
1
],
[
17,
14
],
[
19,
11
],
[
21,
4
],
[
24,
1
],
[
25,
6
]
],
"begin_suppress_tokens": [
220,
50257
],
"bos_token_id": 50257,
"decoder_start_token_id": 50258,
"eos_token_id": 50257,
"is_multilingual": true,
"lang_to_id": {
"<|af|>": 50327,
"<|am|>": 50334,
"<|ar|>": 50272,
"<|as|>": 50350,
"<|az|>": 50304,
"<|ba|>": 50355,
"<|be|>": 50330,
"<|bg|>": 50292,
"<|bn|>": 50302,
"<|bo|>": 50347,
"<|br|>": 50309,
"<|bs|>": 50315,
"<|ca|>": 50270,
"<|cs|>": 50283,
"<|cy|>": 50297,
"<|da|>": 50285,
"<|de|>": 50261,
"<|el|>": 50281,
"<|en|>": 50259,
"<|es|>": 50262,
"<|et|>": 50307,
"<|eu|>": 50310,
"<|fa|>": 50300,
"<|fi|>": 50277,
"<|fo|>": 50338,
"<|fr|>": 50265,
"<|gl|>": 50319,
"<|gu|>": 50333,
"<|haw|>": 50352,
"<|ha|>": 50354,
"<|he|>": 50279,
"<|hi|>": 50276,
"<|hr|>": 50291,
"<|ht|>": 50339,
"<|hu|>": 50286,
"<|hy|>": 50312,
"<|id|>": 50275,
"<|is|>": 50311,
"<|it|>": 50274,
"<|ja|>": 50266,
"<|jw|>": 50356,
"<|ka|>": 50329,
"<|kk|>": 50316,
"<|km|>": 50323,
"<|kn|>": 50306,
"<|ko|>": 50264,
"<|la|>": 50294,
"<|lb|>": 50345,
"<|ln|>": 50353,
"<|lo|>": 50336,
"<|lt|>": 50293,
"<|lv|>": 50301,
"<|mg|>": 50349,
"<|mi|>": 50295,
"<|mk|>": 50308,
"<|ml|>": 50296,
"<|mn|>": 50314,
"<|mr|>": 50320,
"<|ms|>": 50282,
"<|mt|>": 50343,
"<|my|>": 50346,
"<|ne|>": 50313,
"<|nl|>": 50271,
"<|nn|>": 50342,
"<|no|>": 50288,
"<|oc|>": 50328,
"<|pa|>": 50321,
"<|pl|>": 50269,
"<|ps|>": 50340,
"<|pt|>": 50267,
"<|ro|>": 50284,
"<|ru|>": 50263,
"<|sa|>": 50344,
"<|sd|>": 50332,
"<|si|>": 50322,
"<|sk|>": 50298,
"<|sl|>": 50305,
"<|sn|>": 50324,
"<|so|>": 50326,
"<|sq|>": 50317,
"<|sr|>": 50303,
"<|su|>": 50357,
"<|sv|>": 50273,
"<|sw|>": 50318,
"<|ta|>": 50287,
"<|te|>": 50299,
"<|tg|>": 50331,
"<|th|>": 50289,
"<|tk|>": 50341,
"<|tl|>": 50348,
"<|tr|>": 50268,
"<|tt|>": 50351,
"<|uk|>": 50280,
"<|ur|>": 50290,
"<|uz|>": 50337,
"<|vi|>": 50278,
"<|yi|>": 50335,
"<|yo|>": 50325,
"<|yue|>": 50358,
"<|zh|>": 50260
},
"language": null,
"max_initial_timestamp_index": 50,
"max_length": 448,
"no_timestamps_token_id": 50364,
"pad_token_id": 50257,
"prev_sot_token_id": 50362,
"return_timestamps": false,
"suppress_tokens": [],
"task": null,
"task_to_id": {
"transcribe": 50360,
"translate": 50359
},
"transformers_version": "4.52.0"
}
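The IDs in this file are what a Whisper decoder prompt is assembled from. The sketch below follows the standard Whisper prefix layout (start token, language token, task token, optional no-timestamps token) using IDs copied from the config above. `build_decoder_prompt` is a hypothetical helper for illustration. In practice the `transformers` generation code builds this prefix itself, and because `language` is `null` here the language would be detected automatically unless one is passed explicitly.

```python
# IDs copied from generation_config.json above.
DECODER_START_TOKEN_ID = 50258            # <|startoftranscript|>
LANG_TO_ID = {"<|en|>": 50259, "<|sw|>": 50318}   # small subset of lang_to_id
TASK_TO_ID = {"transcribe": 50360, "translate": 50359}
NO_TIMESTAMPS_TOKEN_ID = 50364


def build_decoder_prompt(language_token, task="transcribe", timestamps=False):
    """Return the token IDs that prefix the decoder input (hypothetical helper)."""
    prompt = [DECODER_START_TOKEN_ID, LANG_TO_ID[language_token], TASK_TO_ID[task]]
    if not timestamps:
        prompt.append(NO_TIMESTAMPS_TOKEN_ID)
    return prompt


print(build_decoder_prompt("<|en|>"))
```

For English transcription without timestamps this yields `[50258, 50259, 50360, 50364]`.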

File diff suppressed because it is too large


@@ -0,0 +1,47 @@
{
"activation_dropout": 0.0,
"activation_function": "gelu",
"apply_spec_augment": true,
"architectures": [
"WhisperForConditionalGeneration"
],
"attention_dropout": 0.0,
"begin_suppress_tokens": null,
"bos_token_id": 50257,
"classifier_proj_size": 256,
"d_model": 1280,
"decoder_attention_heads": 20,
"decoder_ffn_dim": 5120,
"decoder_layerdrop": 0.0,
"decoder_layers": 32,
"decoder_start_token_id": 50258,
"dropout": 0.0,
"encoder_attention_heads": 20,
"encoder_ffn_dim": 5120,
"encoder_layerdrop": 0.0,
"encoder_layers": 32,
"eos_token_id": 50257,
"forced_decoder_ids": null,
"init_std": 0.02,
"is_encoder_decoder": true,
"mask_feature_length": 10,
"mask_feature_min_masks": 2,
"mask_feature_prob": 0.05,
"mask_time_length": 10,
"mask_time_min_masks": 2,
"mask_time_prob": 0.05,
"max_length": null,
"max_source_positions": 1500,
"max_target_positions": 448,
"median_filter_width": 7,
"model_type": "whisper",
"num_hidden_layers": 32,
"num_mel_bins": 128,
"pad_token_id": 50256,
"scale_embedding": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.52.0",
"use_cache": false,
"use_weighted_layer_sum": false,
"vocab_size": 51866
}


@@ -0,0 +1,168 @@
{
"alignment_heads": [
[
7,
0
],
[
10,
17
],
[
12,
18
],
[
13,
12
],
[
16,
1
],
[
17,
14
],
[
19,
11
],
[
21,
4
],
[
24,
1
],
[
25,
6
]
],
"begin_suppress_tokens": [
220,
50257
],
"bos_token_id": 50257,
"decoder_start_token_id": 50258,
"eos_token_id": 50257,
"is_multilingual": true,
"lang_to_id": {
"<|af|>": 50327,
"<|am|>": 50334,
"<|ar|>": 50272,
"<|as|>": 50350,
"<|az|>": 50304,
"<|ba|>": 50355,
"<|be|>": 50330,
"<|bg|>": 50292,
"<|bn|>": 50302,
"<|bo|>": 50347,
"<|br|>": 50309,
"<|bs|>": 50315,
"<|ca|>": 50270,
"<|cs|>": 50283,
"<|cy|>": 50297,
"<|da|>": 50285,
"<|de|>": 50261,
"<|el|>": 50281,
"<|en|>": 50259,
"<|es|>": 50262,
"<|et|>": 50307,
"<|eu|>": 50310,
"<|fa|>": 50300,
"<|fi|>": 50277,
"<|fo|>": 50338,
"<|fr|>": 50265,
"<|gl|>": 50319,
"<|gu|>": 50333,
"<|haw|>": 50352,
"<|ha|>": 50354,
"<|he|>": 50279,
"<|hi|>": 50276,
"<|hr|>": 50291,
"<|ht|>": 50339,
"<|hu|>": 50286,
"<|hy|>": 50312,
"<|id|>": 50275,
"<|is|>": 50311,
"<|it|>": 50274,
"<|ja|>": 50266,
"<|jw|>": 50356,
"<|ka|>": 50329,
"<|kk|>": 50316,
"<|km|>": 50323,
"<|kn|>": 50306,
"<|ko|>": 50264,
"<|la|>": 50294,
"<|lb|>": 50345,
"<|ln|>": 50353,
"<|lo|>": 50336,
"<|lt|>": 50293,
"<|lv|>": 50301,
"<|mg|>": 50349,
"<|mi|>": 50295,
"<|mk|>": 50308,
"<|ml|>": 50296,
"<|mn|>": 50314,
"<|mr|>": 50320,
"<|ms|>": 50282,
"<|mt|>": 50343,
"<|my|>": 50346,
"<|ne|>": 50313,
"<|nl|>": 50271,
"<|nn|>": 50342,
"<|no|>": 50288,
"<|oc|>": 50328,
"<|pa|>": 50321,
"<|pl|>": 50269,
"<|ps|>": 50340,
"<|pt|>": 50267,
"<|ro|>": 50284,
"<|ru|>": 50263,
"<|sa|>": 50344,
"<|sd|>": 50332,
"<|si|>": 50322,
"<|sk|>": 50298,
"<|sl|>": 50305,
"<|sn|>": 50324,
"<|so|>": 50326,
"<|sq|>": 50317,
"<|sr|>": 50303,
"<|su|>": 50357,
"<|sv|>": 50273,
"<|sw|>": 50318,
"<|ta|>": 50287,
"<|te|>": 50299,
"<|tg|>": 50331,
"<|th|>": 50289,
"<|tk|>": 50341,
"<|tl|>": 50348,
"<|tr|>": 50268,
"<|tt|>": 50351,
"<|uk|>": 50280,
"<|ur|>": 50290,
"<|uz|>": 50337,
"<|vi|>": 50278,
"<|yi|>": 50335,
"<|yo|>": 50325,
"<|yue|>": 50358,
"<|zh|>": 50260
},
"language": null,
"max_initial_timestamp_index": 50,
"max_length": 448,
"no_timestamps_token_id": 50364,
"pad_token_id": 50257,
"prev_sot_token_id": 50362,
"return_timestamps": false,
"suppress_tokens": [],
"task": "transcribe",
"task_to_id": {
"transcribe": 50360,
"translate": 50359
},
"transformers_version": "4.52.0"
}

50001
last-checkpoint/merges.txt Normal file

File diff suppressed because it is too large


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:454a2fa967f930d32bbfa93a5dcbd2f28eb293254b9576747f86d8c58910d859
size 3087132232

File diff suppressed because it is too large


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:71a89a0c14b7a1ba5b17ef5ee505ce74828e88b7a0095dc0460a62b013afddf7
size 6175072292


@@ -0,0 +1,15 @@
{
"chunk_length": 30,
"dither": 0.0,
"feature_extractor_type": "WhisperFeatureExtractor",
"feature_size": 128,
"hop_length": 160,
"n_fft": 400,
"n_samples": 480000,
"nb_max_frames": 3000,
"padding_side": "right",
"padding_value": 0.0,
"processor_class": "WhisperProcessor",
"return_attention_mask": false,
"sampling_rate": 16000
}
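Several values in this feature-extractor config are derived from one another, and one of them ties into `config.json`. A short sketch of the arithmetic follows. The halving in the last step assumes the stride-2 convolution in the Whisper encoder front end, which is not stated in this commit.

```python
chunk_length_s = 30     # "chunk_length"
sampling_rate = 16000   # "sampling_rate"
hop_length = 160        # "hop_length"

n_samples = chunk_length_s * sampling_rate     # audio samples per 30 s window
nb_max_frames = n_samples // hop_length        # mel frames per window (10 ms hop)
encoder_positions = nb_max_frames // 2         # assumes a stride-2 conv in the encoder

print(n_samples, nb_max_frames, encoder_positions)
```

The results, 480000, 3000 and 1500, match `n_samples` and `nb_max_frames` above and `max_source_positions` in `config.json`.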


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c5e006c003cf706f45b5a3e3c799b52ecdb73e19ca4bca341f78f1afebb03f91
size 14645


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cdafcd5466e416437736f2de3348c723ef6782558376b8c0e715495934cafe99
size 1465


@@ -0,0 +1,139 @@
{
"additional_special_tokens": [
"<|startoftranscript|>",
"<|en|>",
"<|zh|>",
"<|de|>",
"<|es|>",
"<|ru|>",
"<|ko|>",
"<|fr|>",
"<|ja|>",
"<|pt|>",
"<|tr|>",
"<|pl|>",
"<|ca|>",
"<|nl|>",
"<|ar|>",
"<|sv|>",
"<|it|>",
"<|id|>",
"<|hi|>",
"<|fi|>",
"<|vi|>",
"<|he|>",
"<|uk|>",
"<|el|>",
"<|ms|>",
"<|cs|>",
"<|ro|>",
"<|da|>",
"<|hu|>",
"<|ta|>",
"<|no|>",
"<|th|>",
"<|ur|>",
"<|hr|>",
"<|bg|>",
"<|lt|>",
"<|la|>",
"<|mi|>",
"<|ml|>",
"<|cy|>",
"<|sk|>",
"<|te|>",
"<|fa|>",
"<|lv|>",
"<|bn|>",
"<|sr|>",
"<|az|>",
"<|sl|>",
"<|kn|>",
"<|et|>",
"<|mk|>",
"<|br|>",
"<|eu|>",
"<|is|>",
"<|hy|>",
"<|ne|>",
"<|mn|>",
"<|bs|>",
"<|kk|>",
"<|sq|>",
"<|sw|>",
"<|gl|>",
"<|mr|>",
"<|pa|>",
"<|si|>",
"<|km|>",
"<|sn|>",
"<|yo|>",
"<|so|>",
"<|af|>",
"<|oc|>",
"<|ka|>",
"<|be|>",
"<|tg|>",
"<|sd|>",
"<|gu|>",
"<|am|>",
"<|yi|>",
"<|lo|>",
"<|uz|>",
"<|fo|>",
"<|ht|>",
"<|ps|>",
"<|tk|>",
"<|nn|>",
"<|mt|>",
"<|sa|>",
"<|lb|>",
"<|my|>",
"<|bo|>",
"<|tl|>",
"<|mg|>",
"<|as|>",
"<|tt|>",
"<|haw|>",
"<|ln|>",
"<|ha|>",
"<|ba|>",
"<|jw|>",
"<|su|>",
"<|yue|>",
"<|translate|>",
"<|transcribe|>",
"<|startoflm|>",
"<|startofprev|>",
"<|nospeech|>",
"<|notimestamps|>"
],
"bos_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

File diff suppressed because it is too large


@@ -0,0 +1,834 @@
{
"best_global_step": 750,
"best_metric": 0.48618392545595784,
"best_model_checkpoint": "results/finetuned/ML-ENG-LUG-FULL-A40/checkpoint-750",
"epoch": 3.613048978854148,
"eval_steps": 250,
"global_step": 2500,
"is_hyper_param_search": false,
"is_local_process_zero": true,
"is_world_process_zero": true,
"log_history": [
{
"epoch": 0.036146755828664376,
"grad_norm": 12.125,
"learning_rate": 1.6000000000000001e-06,
"loss": 1.2254,
"step": 25
},
{
"epoch": 0.07229351165732875,
"grad_norm": 9.5625,
"learning_rate": 3.266666666666667e-06,
"loss": 1.1396,
"step": 50
},
{
"epoch": 0.10844026748599313,
"grad_norm": 9.0,
"learning_rate": 4.933333333333334e-06,
"loss": 1.0373,
"step": 75
},
{
"epoch": 0.1445870233146575,
"grad_norm": 8.125,
"learning_rate": 6.600000000000001e-06,
"loss": 0.8837,
"step": 100
},
{
"epoch": 0.1807337791433219,
"grad_norm": 8.25,
"learning_rate": 8.266666666666667e-06,
"loss": 0.8438,
"step": 125
},
{
"epoch": 0.21688053497198626,
"grad_norm": 6.65625,
"learning_rate": 9.933333333333334e-06,
"loss": 0.8197,
"step": 150
},
{
"epoch": 0.2530272908006506,
"grad_norm": 6.71875,
"learning_rate": 9.897872340425532e-06,
"loss": 0.6942,
"step": 175
},
{
"epoch": 0.289174046629315,
"grad_norm": 6.75,
"learning_rate": 9.79148936170213e-06,
"loss": 0.8162,
"step": 200
},
{
"epoch": 0.3253208024579794,
"grad_norm": 7.25,
"learning_rate": 9.685106382978723e-06,
"loss": 0.7682,
"step": 225
},
{
"epoch": 0.3614675582866438,
"grad_norm": 4.75,
"learning_rate": 9.57872340425532e-06,
"loss": 0.7765,
"step": 250
},
{
"epoch": 0.3614675582866438,
"eval_cer": 0.3216240827923205,
"eval_loss": 0.8622527718544006,
"eval_runtime": 1448.6289,
"eval_samples_per_second": 0.957,
"eval_steps_per_second": 0.24,
"eval_wer": 0.5111065130432537,
"step": 250
},
{
"epoch": 0.3976143141153082,
"grad_norm": 7.65625,
"learning_rate": 9.472340425531916e-06,
"loss": 0.646,
"step": 275
},
{
"epoch": 0.4337610699439725,
"grad_norm": 5.5625,
"learning_rate": 9.365957446808511e-06,
"loss": 0.6401,
"step": 300
},
{
"epoch": 0.4699078257726369,
"grad_norm": 6.78125,
"learning_rate": 9.259574468085107e-06,
"loss": 0.7117,
"step": 325
},
{
"epoch": 0.5060545816013012,
"grad_norm": 7.25,
"learning_rate": 9.153191489361702e-06,
"loss": 0.7722,
"step": 350
},
{
"epoch": 0.5422013374299657,
"grad_norm": 5.875,
"learning_rate": 9.046808510638298e-06,
"loss": 0.6399,
"step": 375
},
{
"epoch": 0.57834809325863,
"grad_norm": 5.125,
"learning_rate": 8.940425531914895e-06,
"loss": 0.695,
"step": 400
},
{
"epoch": 0.6144948490872945,
"grad_norm": 4.8125,
"learning_rate": 8.83404255319149e-06,
"loss": 0.6709,
"step": 425
},
{
"epoch": 0.6506416049159588,
"grad_norm": 6.28125,
"learning_rate": 8.727659574468086e-06,
"loss": 0.7144,
"step": 450
},
{
"epoch": 0.6867883607446231,
"grad_norm": 5.875,
"learning_rate": 8.621276595744682e-06,
"loss": 0.7185,
"step": 475
},
{
"epoch": 0.7229351165732876,
"grad_norm": 6.84375,
"learning_rate": 8.514893617021277e-06,
"loss": 0.6632,
"step": 500
},
{
"epoch": 0.7229351165732876,
"eval_cer": 0.32359849188245976,
"eval_loss": 0.8226236701011658,
"eval_runtime": 1430.6804,
"eval_samples_per_second": 0.969,
"eval_steps_per_second": 0.243,
"eval_wer": 0.5051111231477887,
"step": 500
},
{
"epoch": 0.7590818724019519,
"grad_norm": 5.5,
"learning_rate": 8.408510638297873e-06,
"loss": 0.6509,
"step": 525
},
{
"epoch": 0.7952286282306164,
"grad_norm": 5.3125,
"learning_rate": 8.30212765957447e-06,
"loss": 0.6506,
"step": 550
},
{
"epoch": 0.8313753840592807,
"grad_norm": 6.9375,
"learning_rate": 8.195744680851064e-06,
"loss": 0.6635,
"step": 575
},
{
"epoch": 0.867522139887945,
"grad_norm": 7.90625,
"learning_rate": 8.08936170212766e-06,
"loss": 0.6494,
"step": 600
},
{
"epoch": 0.9036688957166095,
"grad_norm": 5.1875,
"learning_rate": 7.982978723404256e-06,
"loss": 0.6319,
"step": 625
},
{
"epoch": 0.9398156515452738,
"grad_norm": 4.125,
"learning_rate": 7.876595744680852e-06,
"loss": 0.6252,
"step": 650
},
{
"epoch": 0.9759624073739381,
"grad_norm": 3.90625,
"learning_rate": 7.770212765957447e-06,
"loss": 0.6612,
"step": 675
},
{
"epoch": 1.0115669618651726,
"grad_norm": 5.21875,
"learning_rate": 7.663829787234043e-06,
"loss": 0.5836,
"step": 700
},
{
"epoch": 1.047713717693837,
"grad_norm": 5.1875,
"learning_rate": 7.557446808510639e-06,
"loss": 0.6302,
"step": 725
},
{
"epoch": 1.0838604735225013,
"grad_norm": 5.3125,
"learning_rate": 7.4510638297872355e-06,
"loss": 0.5672,
"step": 750
},
{
"epoch": 1.0838604735225013,
"eval_cer": 0.3044346468112434,
"eval_loss": 0.8067195415496826,
"eval_runtime": 1358.1527,
"eval_samples_per_second": 1.021,
"eval_steps_per_second": 0.255,
"eval_wer": 0.48618392545595784,
"step": 750
},
{
"epoch": 1.1200072293511658,
"grad_norm": 5.65625,
"learning_rate": 7.34468085106383e-06,
"loss": 0.5476,
"step": 775
},
{
"epoch": 1.1561539851798301,
"grad_norm": 6.625,
"learning_rate": 7.2382978723404265e-06,
"loss": 0.6132,
"step": 800
},
{
"epoch": 1.1923007410084945,
"grad_norm": 4.03125,
"learning_rate": 7.131914893617022e-06,
"loss": 0.6052,
"step": 825
},
{
"epoch": 1.2284474968371588,
"grad_norm": 3.84375,
"learning_rate": 7.0255319148936175e-06,
"loss": 0.5723,
"step": 850
},
{
"epoch": 1.2645942526658231,
"grad_norm": 5.375,
"learning_rate": 6.919148936170214e-06,
"loss": 0.5791,
"step": 875
},
{
"epoch": 1.3007410084944877,
"grad_norm": 5.25,
"learning_rate": 6.8127659574468085e-06,
"loss": 0.5981,
"step": 900
},
{
"epoch": 1.336887764323152,
"grad_norm": 5.28125,
"learning_rate": 6.706382978723405e-06,
"loss": 0.5917,
"step": 925
},
{
"epoch": 1.3730345201518164,
"grad_norm": 5.65625,
"learning_rate": 6.600000000000001e-06,
"loss": 0.633,
"step": 950
},
{
"epoch": 1.4091812759804807,
"grad_norm": 4.78125,
"learning_rate": 6.493617021276596e-06,
"loss": 0.617,
"step": 975
},
{
"epoch": 1.445328031809145,
"grad_norm": 4.875,
"learning_rate": 6.387234042553192e-06,
"loss": 0.6058,
"step": 1000
},
{
"epoch": 1.445328031809145,
"eval_cer": 0.3142144715579215,
"eval_loss": 0.7990756034851074,
"eval_runtime": 1373.1379,
"eval_samples_per_second": 1.01,
"eval_steps_per_second": 0.253,
"eval_wer": 0.49487802577000256,
"step": 1000
},
{
"epoch": 1.4814747876378096,
"grad_norm": 5.09375,
"learning_rate": 6.2808510638297885e-06,
"loss": 0.6011,
"step": 1025
},
{
"epoch": 1.517621543466474,
"grad_norm": 4.625,
"learning_rate": 6.174468085106383e-06,
"loss": 0.5811,
"step": 1050
},
{
"epoch": 1.5537682992951383,
"grad_norm": 5.625,
"learning_rate": 6.0680851063829795e-06,
"loss": 0.6012,
"step": 1075
},
{
"epoch": 1.5899150551238026,
"grad_norm": 4.5,
"learning_rate": 5.961702127659575e-06,
"loss": 0.5822,
"step": 1100
},
{
"epoch": 1.626061810952467,
"grad_norm": 6.65625,
"learning_rate": 5.8553191489361705e-06,
"loss": 0.6002,
"step": 1125
},
{
"epoch": 1.6622085667811315,
"grad_norm": 4.78125,
"learning_rate": 5.748936170212767e-06,
"loss": 0.5965,
"step": 1150
},
{
"epoch": 1.6983553226097958,
"grad_norm": 5.34375,
"learning_rate": 5.6425531914893615e-06,
"loss": 0.6277,
"step": 1175
},
{
"epoch": 1.7345020784384602,
"grad_norm": 4.40625,
"learning_rate": 5.536170212765958e-06,
"loss": 0.5448,
"step": 1200
},
{
"epoch": 1.7706488342671245,
"grad_norm": 5.0625,
"learning_rate": 5.429787234042554e-06,
"loss": 0.5645,
"step": 1225
},
{
"epoch": 1.8067955900957888,
"grad_norm": 4.71875,
"learning_rate": 5.323404255319149e-06,
"loss": 0.6589,
"step": 1250
},
{
"epoch": 1.8067955900957888,
"eval_cer": 0.31060312404039775,
"eval_loss": 0.7972104549407959,
"eval_runtime": 1367.9273,
"eval_samples_per_second": 1.014,
"eval_steps_per_second": 0.254,
"eval_wer": 0.49013358950339214,
"step": 1250
},
{
"epoch": 1.8429423459244534,
"grad_norm": 4.0625,
"learning_rate": 5.217021276595745e-06,
"loss": 0.6027,
"step": 1275
},
{
"epoch": 1.8790891017531175,
"grad_norm": 5.46875,
"learning_rate": 5.110638297872342e-06,
"loss": 0.6039,
"step": 1300
},
{
"epoch": 1.915235857581782,
"grad_norm": 4.59375,
"learning_rate": 5.004255319148936e-06,
"loss": 0.6226,
"step": 1325
},
{
"epoch": 1.9513826134104464,
"grad_norm": 5.59375,
"learning_rate": 4.897872340425533e-06,
"loss": 0.5832,
"step": 1350
},
{
"epoch": 1.9875293692391107,
"grad_norm": 5.65625,
"learning_rate": 4.791489361702128e-06,
"loss": 0.5714,
"step": 1375
},
{
"epoch": 2.023133923730345,
"grad_norm": 5.25,
"learning_rate": 4.685106382978724e-06,
"loss": 0.553,
"step": 1400
},
{
"epoch": 2.0592806795590097,
"grad_norm": 4.3125,
"learning_rate": 4.57872340425532e-06,
"loss": 0.58,
"step": 1425
},
{
"epoch": 2.095427435387674,
"grad_norm": 5.34375,
"learning_rate": 4.4723404255319155e-06,
"loss": 0.5211,
"step": 1450
},
{
"epoch": 2.1315741912163384,
"grad_norm": 5.3125,
"learning_rate": 4.365957446808511e-06,
"loss": 0.5708,
"step": 1475
},
{
"epoch": 2.1677209470450025,
"grad_norm": 3.984375,
"learning_rate": 4.259574468085107e-06,
"loss": 0.5959,
"step": 1500
},
{
"epoch": 2.1677209470450025,
"eval_cer": 0.3117614241560084,
"eval_loss": 0.7976789474487305,
"eval_runtime": 1424.2468,
"eval_samples_per_second": 0.974,
"eval_steps_per_second": 0.244,
"eval_wer": 0.4926164455085106,
"step": 1500
},
{
"epoch": 2.203867702873667,
"grad_norm": 5.125,
"learning_rate": 4.153191489361703e-06,
"loss": 0.5694,
"step": 1525
},
{
"epoch": 2.2400144587023316,
"grad_norm": 4.53125,
"learning_rate": 4.046808510638298e-06,
"loss": 0.5767,
"step": 1550
},
{
"epoch": 2.2761612145309957,
"grad_norm": 5.3125,
"learning_rate": 3.940425531914894e-06,
"loss": 0.5827,
"step": 1575
},
{
"epoch": 2.3123079703596603,
"grad_norm": 4.75,
"learning_rate": 3.83404255319149e-06,
"loss": 0.5597,
"step": 1600
},
{
"epoch": 2.3484547261883244,
"grad_norm": 4.96875,
"learning_rate": 3.7276595744680857e-06,
"loss": 0.5605,
"step": 1625
},
{
"epoch": 2.384601482016989,
"grad_norm": 4.75,
"learning_rate": 3.621276595744681e-06,
"loss": 0.5523,
"step": 1650
},
{
"epoch": 2.4207482378456535,
"grad_norm": 4.84375,
"learning_rate": 3.5148936170212767e-06,
"loss": 0.5674,
"step": 1675
},
{
"epoch": 2.4568949936743176,
"grad_norm": 4.21875,
"learning_rate": 3.4085106382978726e-06,
"loss": 0.5801,
"step": 1700
},
{
"epoch": 2.493041749502982,
"grad_norm": 5.1875,
"learning_rate": 3.3021276595744685e-06,
"loss": 0.5634,
"step": 1725
},
{
"epoch": 2.5291885053316463,
"grad_norm": 3.53125,
"learning_rate": 3.195744680851064e-06,
"loss": 0.5402,
"step": 1750
},
{
"epoch": 2.5291885053316463,
"eval_cer": 0.3114490059221394,
"eval_loss": 0.7963515520095825,
"eval_runtime": 1412.53,
"eval_samples_per_second": 0.982,
"eval_steps_per_second": 0.246,
"eval_wer": 0.49261625004199694,
"step": 1750
},
{
"epoch": 2.565335261160311,
"grad_norm": 5.84375,
"learning_rate": 3.08936170212766e-06,
"loss": 0.5874,
"step": 1775
},
{
"epoch": 2.6014820169889754,
"grad_norm": 5.0,
"learning_rate": 2.9829787234042554e-06,
"loss": 0.5083,
"step": 1800
},
{
"epoch": 2.6376287728176395,
"grad_norm": 4.9375,
"learning_rate": 2.8765957446808514e-06,
"loss": 0.5587,
"step": 1825
},
{
"epoch": 2.673775528646304,
"grad_norm": 4.4375,
"learning_rate": 2.770212765957447e-06,
"loss": 0.5781,
"step": 1850
},
{
"epoch": 2.7099222844749686,
"grad_norm": 5.0,
"learning_rate": 2.663829787234043e-06,
"loss": 0.5224,
"step": 1875
},
{
"epoch": 2.7460690403036327,
"grad_norm": 5.0,
"learning_rate": 2.5574468085106387e-06,
"loss": 0.5491,
"step": 1900
},
{
"epoch": 2.782215796132297,
"grad_norm": 4.625,
"learning_rate": 2.4510638297872342e-06,
"loss": 0.5653,
"step": 1925
},
{
"epoch": 2.8183625519609614,
"grad_norm": 6.625,
"learning_rate": 2.34468085106383e-06,
"loss": 0.5236,
"step": 1950
},
{
"epoch": 2.854509307789626,
"grad_norm": 4.75,
"learning_rate": 2.2382978723404256e-06,
"loss": 0.5827,
"step": 1975
},
{
"epoch": 2.89065606361829,
"grad_norm": 5.15625,
"learning_rate": 2.1319148936170216e-06,
"loss": 0.5934,
"step": 2000
},
{
"epoch": 2.89065606361829,
"eval_cer": 0.3117611880607615,
"eval_loss": 0.7964433431625366,
"eval_runtime": 1409.3242,
"eval_samples_per_second": 0.984,
"eval_steps_per_second": 0.246,
"eval_wer": 0.4920525130602013,
"step": 2000
},
{
"epoch": 2.9268028194469546,
"grad_norm": 5.25,
"learning_rate": 2.025531914893617e-06,
"loss": 0.5579,
"step": 2025
},
{
"epoch": 2.962949575275619,
"grad_norm": 5.21875,
"learning_rate": 1.919148936170213e-06,
"loss": 0.5838,
"step": 2050
},
{
"epoch": 2.9990963311042833,
"grad_norm": 4.5625,
"learning_rate": 1.8127659574468087e-06,
"loss": 0.5663,
"step": 2075
},
{
"epoch": 3.0347008855955178,
"grad_norm": 5.6875,
"learning_rate": 1.7063829787234042e-06,
"loss": 0.5703,
"step": 2100
},
{
"epoch": 3.0708476414241823,
"grad_norm": 4.875,
"learning_rate": 1.6000000000000001e-06,
"loss": 0.5703,
"step": 2125
},
{
"epoch": 3.1069943972528464,
"grad_norm": 4.875,
"learning_rate": 1.4936170212765956e-06,
"loss": 0.5774,
"step": 2150
},
{
"epoch": 3.143141153081511,
"grad_norm": 4.84375,
"learning_rate": 1.3872340425531916e-06,
"loss": 0.5686,
"step": 2175
},
{
"epoch": 3.179287908910175,
"grad_norm": 3.96875,
"learning_rate": 1.2808510638297875e-06,
"loss": 0.4906,
"step": 2200
},
{
"epoch": 3.2154346647388397,
"grad_norm": 4.78125,
"learning_rate": 1.174468085106383e-06,
"loss": 0.5839,
"step": 2225
},
{
"epoch": 3.251581420567504,
"grad_norm": 5.09375,
"learning_rate": 1.0680851063829787e-06,
"loss": 0.5464,
"step": 2250
},
{
"epoch": 3.251581420567504,
"eval_cer": 0.3112543561914034,
"eval_loss": 0.7959883213043213,
"eval_runtime": 1614.2453,
"eval_samples_per_second": 0.859,
"eval_steps_per_second": 0.215,
"eval_wer": 0.4931348188600785,
"step": 2250
},
{
"epoch": 3.2877281763961683,
"grad_norm": 4.40625,
"learning_rate": 9.617021276595744e-07,
"loss": 0.5668,
"step": 2275
},
{
"epoch": 3.323874932224833,
"grad_norm": 4.3125,
"learning_rate": 8.553191489361703e-07,
"loss": 0.5077,
"step": 2300
},
{
"epoch": 3.3600216880534974,
"grad_norm": 5.78125,
"learning_rate": 7.489361702127661e-07,
"loss": 0.5428,
"step": 2325
},
{
"epoch": 3.3961684438821615,
"grad_norm": 5.625,
"learning_rate": 6.425531914893618e-07,
"loss": 0.5428,
"step": 2350
},
{
"epoch": 3.432315199710826,
"grad_norm": 5.25,
"learning_rate": 5.361702127659575e-07,
"loss": 0.5395,
"step": 2375
},
{
"epoch": 3.46846195553949,
"grad_norm": 5.1875,
"learning_rate": 4.2978723404255325e-07,
"loss": 0.5461,
"step": 2400
},
{
"epoch": 3.5046087113681548,
"grad_norm": 5.03125,
"learning_rate": 3.2340425531914897e-07,
"loss": 0.5538,
"step": 2425
},
{
"epoch": 3.540755467196819,
"grad_norm": 5.4375,
"learning_rate": 2.170212765957447e-07,
"loss": 0.5298,
"step": 2450
},
{
"epoch": 3.5769022230254834,
"grad_norm": 5.21875,
"learning_rate": 1.1063829787234043e-07,
"loss": 0.5959,
"step": 2475
},
{
"epoch": 3.613048978854148,
"grad_norm": 5.5,
"learning_rate": 4.25531914893617e-09,
"loss": 0.5497,
"step": 2500
},
{
"epoch": 3.613048978854148,
"eval_cer": 0.31241486923400996,
"eval_loss": 0.7960155010223389,
"eval_runtime": 1533.3812,
"eval_samples_per_second": 0.905,
"eval_steps_per_second": 0.226,
"eval_wer": 0.49320595826893987,
"step": 2500
}
],
"logging_steps": 25,
"max_steps": 2500,
"num_input_tokens_seen": 0,
"num_train_epochs": 4,
"save_steps": 250,
"stateful_callbacks": {
"TrainerControl": {
"args": {
"should_epoch_stop": false,
"should_evaluate": false,
"should_log": false,
"should_save": true,
"should_training_stop": true
},
"attributes": {}
}
},
"total_flos": 1.3582858479796224e+20,
"train_batch_size": 2,
"trial_name": null,
"trial_params": null
}
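
Note: the trainer state above mixes training records (with `loss`) and evaluation records (with `eval_loss`) in one `log_history` list. A minimal sketch of separating them and picking the evaluation record with the lowest WER follows. The entries are copied from the log above; the helper names `split_log_history` and `best_eval_by_wer` are hypothetical and not part of any library.

```python
import json

# Excerpt of "log_history" entries copied from the trainer state above.
LOG_HISTORY_EXCERPT = json.loads("""
[
  {"epoch": 3.251581420567504, "grad_norm": 5.09375,
   "learning_rate": 1.0680851063829787e-06, "loss": 0.5464, "step": 2250},
  {"epoch": 3.251581420567504, "eval_cer": 0.3112543561914034,
   "eval_loss": 0.7959883213043213, "eval_wer": 0.4931348188600785, "step": 2250},
  {"epoch": 3.2877281763961683, "grad_norm": 4.40625,
   "learning_rate": 9.617021276595744e-07, "loss": 0.5668, "step": 2275},
  {"epoch": 3.613048978854148, "grad_norm": 5.5,
   "learning_rate": 4.25531914893617e-09, "loss": 0.5497, "step": 2500},
  {"epoch": 3.613048978854148, "eval_cer": 0.31241486923400996,
   "eval_loss": 0.7960155010223389, "eval_wer": 0.49320595826893987, "step": 2500}
]
""")


def split_log_history(entries):
    """Split records into (training, evaluation) lists by their keys."""
    train = [e for e in entries if "loss" in e]
    evals = [e for e in entries if "eval_loss" in e]
    return train, evals


def best_eval_by_wer(evals):
    """Return the evaluation record with the lowest word error rate."""
    return min(evals, key=lambda e: e["eval_wer"])


train_records, eval_records = split_log_history(LOG_HISTORY_EXCERPT)
best = best_eval_by_wer(eval_records)
print(f"best eval step: {best['step']}, WER: {best['eval_wer']:.4f}")
```

On this excerpt the two evaluation records differ only in the fourth decimal place of WER, so the choice between them is marginal.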

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fbfa14c5c1204912af0e50df2a7b7288485ad9a07060abda9361ef38a28ed2a0
size 6033

50259
last-checkpoint/vocab.json Normal file

File diff suppressed because it is too large

50001
merges.txt Normal file

File diff suppressed because it is too large

3
model.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5e89370d0a959ccc3a1a2ae6b89a71ce79df01540025c58061696052d928b5f3
size 3087132232

1742
normalizer.json Normal file

File diff suppressed because it is too large


2041
predictions.tsv Normal file

File diff suppressed because it is too large

15
preprocessor_config.json Normal file

@@ -0,0 +1,15 @@
{
"chunk_length": 30,
"dither": 0.0,
"feature_extractor_type": "WhisperFeatureExtractor",
"feature_size": 128,
"hop_length": 160,
"n_fft": 400,
"n_samples": 480000,
"nb_max_frames": 3000,
"padding_side": "right",
"padding_value": 0.0,
"processor_class": "WhisperProcessor",
"return_attention_mask": false,
"sampling_rate": 16000
}
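
Note: several fields in the preprocessor config above are related by simple arithmetic. `n_samples` is `chunk_length` times `sampling_rate`, and `nb_max_frames` is `n_samples` divided by `hop_length`. The sketch below checks this against the values shown; the function name `derived_sizes` is hypothetical and only for illustration.

```python
# Values copied from preprocessor_config.json above.
PREPROCESSOR_CONFIG = {
    "chunk_length": 30,      # seconds of audio per chunk
    "feature_size": 128,     # number of mel bins
    "hop_length": 160,       # samples between successive frames
    "n_fft": 400,            # samples per analysis window
    "n_samples": 480000,
    "nb_max_frames": 3000,
    "sampling_rate": 16000,  # samples per second
}


def derived_sizes(cfg):
    """Recompute n_samples and nb_max_frames from the base fields."""
    n_samples = cfg["chunk_length"] * cfg["sampling_rate"]
    nb_max_frames = n_samples // cfg["hop_length"]
    return n_samples, nb_max_frames


n_samples, nb_max_frames = derived_sizes(PREPROCESSOR_CONFIG)
hop_ms = 1000 * PREPROCESSOR_CONFIG["hop_length"] / PREPROCESSOR_CONFIG["sampling_rate"]
window_ms = 1000 * PREPROCESSOR_CONFIG["n_fft"] / PREPROCESSOR_CONFIG["sampling_rate"]
print(n_samples, nb_max_frames, hop_ms, window_ms)
```

With these values each 30-second chunk yields 3000 frames at a 10 ms hop and a 25 ms window.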

139
special_tokens_map.json Normal file

@@ -0,0 +1,139 @@
{
"additional_special_tokens": [
"<|startoftranscript|>",
"<|en|>",
"<|zh|>",
"<|de|>",
"<|es|>",
"<|ru|>",
"<|ko|>",
"<|fr|>",
"<|ja|>",
"<|pt|>",
"<|tr|>",
"<|pl|>",
"<|ca|>",
"<|nl|>",
"<|ar|>",
"<|sv|>",
"<|it|>",
"<|id|>",
"<|hi|>",
"<|fi|>",
"<|vi|>",
"<|he|>",
"<|uk|>",
"<|el|>",
"<|ms|>",
"<|cs|>",
"<|ro|>",
"<|da|>",
"<|hu|>",
"<|ta|>",
"<|no|>",
"<|th|>",
"<|ur|>",
"<|hr|>",
"<|bg|>",
"<|lt|>",
"<|la|>",
"<|mi|>",
"<|ml|>",
"<|cy|>",
"<|sk|>",
"<|te|>",
"<|fa|>",
"<|lv|>",
"<|bn|>",
"<|sr|>",
"<|az|>",
"<|sl|>",
"<|kn|>",
"<|et|>",
"<|mk|>",
"<|br|>",
"<|eu|>",
"<|is|>",
"<|hy|>",
"<|ne|>",
"<|mn|>",
"<|bs|>",
"<|kk|>",
"<|sq|>",
"<|sw|>",
"<|gl|>",
"<|mr|>",
"<|pa|>",
"<|si|>",
"<|km|>",
"<|sn|>",
"<|yo|>",
"<|so|>",
"<|af|>",
"<|oc|>",
"<|ka|>",
"<|be|>",
"<|tg|>",
"<|sd|>",
"<|gu|>",
"<|am|>",
"<|yi|>",
"<|lo|>",
"<|uz|>",
"<|fo|>",
"<|ht|>",
"<|ps|>",
"<|tk|>",
"<|nn|>",
"<|mt|>",
"<|sa|>",
"<|lb|>",
"<|my|>",
"<|bo|>",
"<|tl|>",
"<|mg|>",
"<|as|>",
"<|tt|>",
"<|haw|>",
"<|ln|>",
"<|ha|>",
"<|ba|>",
"<|jw|>",
"<|su|>",
"<|yue|>",
"<|translate|>",
"<|transcribe|>",
"<|startoflm|>",
"<|startofprev|>",
"<|nospeech|>",
"<|notimestamps|>"
],
"bos_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

9
test_results.json Normal file

@@ -0,0 +1,9 @@
{
"epoch": 3.613048978854148,
"test_cer": 0.18876529976505252,
"test_loss": 0.5511467456817627,
"test_runtime": 1854.5701,
"test_samples_per_second": 1.1,
"test_steps_per_second": 0.275,
"test_wer": 0.34776840202900167
}
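
Note: the error rates above are stored as fractions. A small sketch of reading them as percentages and estimating the test-set size from runtime and throughput follows. The sample count is only an estimate, since the logged throughput is rounded; the helper names are hypothetical.

```python
# Values copied from test_results.json above.
TEST_RESULTS = {
    "epoch": 3.613048978854148,
    "test_cer": 0.18876529976505252,
    "test_loss": 0.5511467456817627,
    "test_runtime": 1854.5701,
    "test_samples_per_second": 1.1,
    "test_steps_per_second": 0.275,
    "test_wer": 0.34776840202900167,
}


def as_percent(rate, digits=2):
    """Convert a fractional error rate to a rounded percentage."""
    return round(rate * 100.0, digits)


def approx_sample_count(results, prefix="test"):
    """Estimate the number of evaluated samples as runtime * throughput."""
    runtime = results[f"{prefix}_runtime"]
    throughput = results[f"{prefix}_samples_per_second"]
    return round(runtime * throughput)


wer_pct = as_percent(TEST_RESULTS["test_wer"])
cer_pct = as_percent(TEST_RESULTS["test_cer"])
approx_samples = approx_sample_count(TEST_RESULTS)
print(f"WER {wer_pct}%  CER {cer_pct}%  ~{approx_samples} samples")
```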

12997
tokenizer_config.json Normal file

File diff suppressed because it is too large

8
train_results.json Normal file

@@ -0,0 +1,8 @@
{
"epoch": 3.613048978854148,
"total_flos": 1.3582858479796224e+20,
"train_loss": 0.6222768672943115,
"train_runtime": 23999.707,
"train_samples_per_second": 1.667,
"train_steps_per_second": 0.104
}
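
Note: the throughput figures above can be cross-checked against each other. The sketch below estimates the number of optimizer steps and the samples consumed per step from the rounded values. The results are approximations, and the helper name `summarize_throughput` is hypothetical.

```python
# Values copied from train_results.json above.
TRAIN_RESULTS = {
    "epoch": 3.613048978854148,
    "total_flos": 1.3582858479796224e+20,
    "train_loss": 0.6222768672943115,
    "train_runtime": 23999.707,
    "train_samples_per_second": 1.667,
    "train_steps_per_second": 0.104,
}


def summarize_throughput(results, prefix="train"):
    """Derive approximate totals from runtime and rounded throughput."""
    runtime = results[f"{prefix}_runtime"]
    samples_per_s = results[f"{prefix}_samples_per_second"]
    steps_per_s = results[f"{prefix}_steps_per_second"]
    return {
        "approx_steps": runtime * steps_per_s,
        "approx_samples": runtime * samples_per_s,
        "approx_samples_per_step": samples_per_s / steps_per_s,
    }


summary = summarize_throughput(TRAIN_RESULTS)
print({k: round(v, 2) for k, v in summary.items()})
```

The step estimate lands close to the 2500 steps recorded in the training log, and the ratio suggests roughly 16 samples per optimizer step. How that figure splits into per-device batch size, gradient accumulation and device count is not stated in these files.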

3
training_args.bin Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fbfa14c5c1204912af0e50df2a7b7288485ad9a07060abda9361ef38a28ed2a0
size 6033

50259
vocab.json Normal file

File diff suppressed because it is too large