Model: dvislobokov/whisper-large-v3-turbo-russian Source: Original Platform
| license | datasets | language | base_model | pipeline_tag | metrics | library_name | tags |
|---|---|---|---|---|---|---|---|
| mit | | | | automatic-speech-recognition | | transformers | |

This model was trained on two A100 40 GB GPUs, with 128 GB of RAM and 2 x Xeon 48-core 2.4 GHz CPUs.
- Training time: ~7 hours
- Training set size: 118k audio samples from Mozilla Common Voice 17
Example of usage
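
    import time

    import gradio as gr
    from transformers import pipeline

    # Load the fine-tuned Whisper checkpoint; change device to 'cuda' if a GPU is available.
    pipe = pipeline(
        task='automatic-speech-recognition',
        model='dvislobokov/whisper-large-v3-turbo-russian',
        tokenizer='dvislobokov/whisper-large-v3-turbo-russian',
        device='cpu',
    )


    def transcribe(audio):
        """Transcribe an audio file and print the elapsed time in seconds."""
        start = time.time()
        # return_timestamps=True lets Whisper handle audio longer than 30 seconds.
        text = pipe(audio, return_timestamps=True)['text']
        print(time.time() - start)
        return text


    # Web UI that accepts microphone recordings or uploaded files.
    iface = gr.Interface(
        fn=transcribe,
        inputs=gr.Audio(sources=['microphone', 'upload'], type='filepath'),
        outputs='text',
    )
    iface.launch(share=True)  # share=True also creates a temporary public link
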
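
The usage example passes `return_timestamps=True` but keeps only the `'text'` field of the result. With that flag the transformers ASR pipeline also returns a `'chunks'` list, where each entry holds a `'timestamp'` tuple of start and end seconds and the segment `'text'`. Below is a minimal sketch of how those segments could be printed one per line. The helper names `format_timestamp` and `format_chunks` are hypothetical and not part of this model or of transformers, and the sample dictionary is a hand-written stand-in for a real pipeline result.

```python
def format_timestamp(seconds):
    """Render seconds as HH:MM:SS.mmm; None (open-ended segment) becomes '?'."""
    if seconds is None:
        return "?"
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def format_chunks(result):
    """Turn a pipeline result dict into one line per timestamped segment."""
    lines = []
    for chunk in result.get("chunks", []):
        start, end = chunk["timestamp"]
        span = f"{format_timestamp(start)} -> {format_timestamp(end)}"
        # Segment text usually starts with a space, so strip it.
        lines.append(f"[{span}] {chunk['text'].strip()}")
    return lines


if __name__ == "__main__":
    # Hand-written stand-in for pipe(audio, return_timestamps=True).
    sample = {
        "text": " Привет, мир. Это проверка.",
        "chunks": [
            {"timestamp": (0.0, 2.5), "text": " Привет, мир."},
            {"timestamp": (2.5, None), "text": " Это проверка."},
        ],
    }
    for line in format_chunks(sample):
        print(line)
```

In the Gradio app, `transcribe` could return `"\n".join(format_chunks(result))` instead of the plain text if per-segment timing is useful. The end time of the last segment may be missing, which is why `None` is handled explicitly.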