ModelHub XC 59ae840aff Initialize project; model provided by the ModelHub XC community

Model: bond005/whisper-large-v3-ru-podlodka
Source: Original Platform

2026-05-12 12:23:54 +08:00

Files (all added in the commit above):

.gitattributes
anna_matveeva_test.wav
config.json
generation_config.json
merges.txt
model-00001-of-00002.safetensors
model-00002-of-00002.safetensors
model.safetensors.index.json
normalizer.json
preprocessor_config.json
pytorch_model-00001-of-00002.bin
pytorch_model-00002-of-00002.bin
pytorch_model.bin.index.json
README.md
special_tokens_map.json
test_sound_ru.flac
test_sound_with_noise.wav
tokenizer_config.json
tokenizer.json
vocab.json

README.md

datasets: bond005/taiga_speech_v2, bond005/podlodka_speech, bond005/rulibrispeech
language: ru
license: apache-2.0
metrics: wer
pipeline_tag: automatic-speech-recognition
library_name: transformers

widget

example_title	src
Нейронные сети - это хорошо!	https://huggingface.co/bond005/whisper-large-v3-ru-podlodka/resolve/main/test_sound_ru.flac
К сожалению, система распознавания речи не всегда стабильна, особенно в шумных условиях.	https://huggingface.co/bond005/whisper-large-v3-ru-podlodka/resolve/main/test_sound_with_noise.wav
Мимо театра мальчик ходил довольно часто — белое, со взбитыми сливками, здание-торт.	https://huggingface.co/bond005/whisper-large-v3-ru-podlodka/resolve/main/anna_matveeva_test.wav

model-index

name: Whisper Large V3 Russian Podlodka by Ivan Bondarenko

task	dataset name	dataset type	args	metric	value
Speech Recognition (automatic-speech-recognition)	Podlodka.io	bond005/podlodka_speech	ru	WER (with punctuation and capital letters)	20.91
Speech Recognition (automatic-speech-recognition)	Podlodka.io	bond005/podlodka_speech	ru	WER (without punctuation)	10.987
Speech Recognition (automatic-speech-recognition)	Russian Librispeech	bond005/rulibrispeech	ru	WER (without punctuation)	9.795
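The two Podlodka.io figures differ only in how the text is prepared before scoring. The following is a minimal sketch of that difference, assuming a simple normalization (lowercasing and stripping punctuation) that may not match the one used for the reported numbers. The function names `normalize` and `wer` and the hypothesis string are illustrative, and the sketch returns a fraction, whereas the values above appear to be percentages.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and replace punctuation with spaces (an assumed normalization)."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words consumed so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


reference = "Нейронные сети - это хорошо!"   # title of the first widget example
hypothesis = "нейронные сети это хорошо"     # hypothetical recognizer output

strict = wer(reference, hypothesis)                         # case and punctuation count
relaxed = wer(normalize(reference), normalize(hypothesis))  # both sides normalized first
```

Here the strict score penalizes the capital letter, the dash and the exclamation mark, while the relaxed score treats the same output as error-free, which is why the punctuation-sensitive figure is the higher of the two.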

Whisper Large V3 Russian Podlodka

This repository contains a fine-tuned Whisper Large V3 model for Russian speech recognition. It serves as the core transcription component of the Pisets system, specifically optimized for long audio recordings such as lectures and interviews.

The model was presented in the paper Pisets: A Robust Speech Recognition System for Lectures and Interviews.
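Because the metadata lists `transformers` as the library and `automatic-speech-recognition` as the pipeline tag, the model can presumably be loaded with the standard Transformers pipeline. The sketch below is an assumption-based example, not the authors' documented usage: the helper names and the `return_timestamps` and `generate_kwargs` values are choices made here for illustration.

```python
MODEL_ID = "bond005/whisper-large-v3-ru-podlodka"


def build_call_kwargs(language: str = "russian") -> dict:
    """Arguments passed to the pipeline call; every value here is an assumption."""
    return {
        # Recent transformers versions expect timestamps for Whisper on audio longer than 30 s.
        "return_timestamps": True,
        "generate_kwargs": {"language": language, "task": "transcribe"},
    }


def transcribe(audio_path: str) -> str:
    """Load the model and transcribe one file (downloads the weights on first use)."""
    # Imported lazily so build_call_kwargs can be used without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = asr(audio_path, **build_call_kwargs())
    return result["text"]


# Example call, using one of the sample files shipped in this repository:
# print(transcribe("test_sound_ru.flac"))
```

This covers only the Whisper stage on its own; the full Pisets system wraps it with the segmentation and filtering stages described below.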

System Architecture

The Pisets system implements a three-component architecture to improve recognition accuracy while minimizing hallucinations:

Wav2Vec2: For primary recognition and segmentation.
Audio Spectrogram Transformer (AST): For filtering non-speech segments.
Whisper (this model): For the final high-quality transcription.
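The flow through the three components can be sketched as a small function that takes each stage as a callable. This is a hypothetical illustration of the data flow only: `Segment`, `run_pipeline` and the stand-in lambdas are names invented here, and the actual Pisets code may be organized differently.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    text: str = ""


def run_pipeline(
    audio: object,
    segment: Callable[[object], Iterable[Segment]],
    is_speech: Callable[[object, Segment], bool],
    transcribe: Callable[[object, Segment], str],
) -> List[Segment]:
    """Segment the audio, drop non-speech segments, then transcribe the rest."""
    results = []
    for seg in segment(audio):
        if not is_speech(audio, seg):
            continue  # non-speech never reaches the final model
        results.append(Segment(seg.start, seg.end, transcribe(audio, seg)))
    return results


# Stand-in stages so the sketch runs without any model; real stages would wrap
# Wav2Vec2 (segmentation), AST (speech filter) and Whisper (transcription).
demo = run_pipeline(
    audio=None,
    segment=lambda a: [Segment(0.0, 2.0), Segment(2.0, 3.0), Segment(3.0, 6.5)],
    is_speech=lambda a, s: s.end - s.start >= 2.0,
    transcribe=lambda a, s: f"text for {s.start:.1f}-{s.end:.1f}",
)
```

Filtering before the final transcription step is what lets the system avoid passing silence or noise to Whisper, where hallucinated text is most likely to appear.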

Implementation

The complete source code and instructions for using the system (including generation of SRT and DocX files) can be found in the GitHub repository:

GitHub: https://github.com/bond005/pisets
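For readers unfamiliar with the SRT format mentioned above, the following sketch shows how timestamped segments map onto SRT cues. It is not taken from the Pisets repository; `srt_timestamp`, `to_srt` and the sample segments are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) tuples as numbered SRT cues separated by blank lines."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(cues)


# Hypothetical segments: (start in seconds, end in seconds, recognized text).
srt_text = to_srt([
    (0.0, 2.5, "Нейронные сети - это хорошо!"),
    (3661.042, 3662.0, "Second cue"),
])
```

Each cue consists of a running number, a time range with a comma before the milliseconds, and the text, followed by a blank line.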

Citation

If you use this model or the Pisets system in your research, please cite:

@article{bondarenko2026pisets,
  title={Pisets: A Robust Speech Recognition System for Lectures and Interviews},
  author={Ivan Bondarenko},
  journal={arXiv preprint arXiv:2601.18415},
  year={2026}
}
