Initialize project; model provided by the ModelHub XC community

Model: dvislobokov/whisper-large-v3-turbo-russian Source: Original Platform
2026-05-14 11:43:24 +08:00
commit 7e485fad59
15 changed files with 117148 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,48 @@
+---
+license: mit
+datasets:
+- mozilla-foundation/common_voice_17_0
+language:
+- ru
+base_model:
+- openai/whisper-large-v3-turbo
+pipeline_tag: automatic-speech-recognition
+metrics:
+- accuracy
+library_name: transformers
+tags:
+- call
+---
+
+### This model was trained on two A100 40 GB GPUs, with 128 GB RAM and 2 x Xeon 48-core 2.4 GHz CPUs
+- Training time: ~7 hours
+- Training set: 118k audio samples from Mozilla Common Voice 17
+---
+Usage example
+```python
+from transformers import pipeline
+import gradio as gr
+import time
+
+pipe = pipeline(
+    model="dvislobokov/whisper-large-v3-turbo-russian",
+    tokenizer="dvislobokov/whisper-large-v3-turbo-russian",
+    task='automatic-speech-recognition',
+    device='cpu'
+)
+
+def transcribe(audio):
+    start = time.time()
+    text = pipe(audio, return_timestamps=True)['text']  # return_timestamps=True is needed for audio longer than 30 s
+    print(f"Transcription took {time.time() - start:.2f} s")
+    return text
+
+iface = gr.Interface(
+    fn=transcribe,
+    inputs=gr.Audio(sources=['microphone', 'upload'], type='filepath'),
+    outputs='text'
+)
+
+iface.launch(share=True)
+
+```
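
Note on the README example: `transcribe` calls the pipeline with `return_timestamps=True` but keeps only the `'text'` field. With that flag the Whisper pipeline in `transformers` also returns a `'chunks'` list, where each entry has a `'timestamp'` pair and a `'text'` string. A minimal sketch of how those chunks could be rendered is below. The helper name `format_chunks` and the `sample` dictionary are hypothetical: the sample is hand-written in the shape the pipeline returns and is not real model output.

```python
def format_chunks(result):
    """Render a pipeline result as one '[start - end] text' line per chunk.

    Hypothetical helper, not part of the README or of transformers.
    """
    def label(seconds):
        # A timestamp can be None, e.g. the end time of a final chunk
        # when the audio is cut off mid-segment.
        return f"{seconds:.2f}" if seconds is not None else "?"

    lines = []
    for chunk in result.get("chunks", []):
        start, end = chunk["timestamp"]
        lines.append(f"[{label(start)} - {label(end)}] {chunk['text'].strip()}")
    return "\n".join(lines)


# Hand-written sample in the shape returned with return_timestamps=True.
sample = {
    "text": " Привет. Как дела?",
    "chunks": [
        {"timestamp": (0.0, 1.5), "text": " Привет."},
        {"timestamp": (1.5, None), "text": " Как дела?"},
    ],
}

print(format_chunks(sample))
```

In the Gradio example, `transcribe` could return `format_chunks(pipe(audio, return_timestamps=True))` instead of the plain text if per-segment times are wanted in the output box.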