This repository was archived on 2025-08-26. You can view files and clone it, but cannot push or open issues or pull requests.
Introduction
This repository supports running the following functions locally:
- Speech-to-text (i.e., ASR); both streaming and non-streaming are supported
- Text-to-speech (i.e., TTS)
- Speaker identification
- Speaker verification
- Spoken language identification
- Audio tagging
- VAD (e.g., silero-vad)
on the following platforms and operating systems:
- x86, x86_64, 32-bit ARM, 64-bit ARM (arm64, aarch64), RISC-V (riscv64)
- Linux, macOS, Windows, openKylin
- Android, WearOS
- iOS
- NodeJS
- WebAssembly
- Raspberry Pi
- RV1126
- LicheePi4A
- VisionFive 2
- Horizon Sunrise X3 Pi (旭日X3派)
- etc
with the following APIs:
- C++
- C
- Python
- Go
- C#
- JavaScript
- Java
- Kotlin
- Swift
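As a rough illustration of the Python API for non-streaming speech-to-text, the sketch below reads a mono 16-bit WAV file and passes it to an offline recognizer. It assumes the `sherpa_onnx` and `numpy` packages are installed and that a transducer model (encoder, decoder, joiner, and tokens files) has already been downloaded. The helper names `read_wave` and `transcribe` and all file paths are hypothetical; see the documentation for the models and options that apply to your setup.

```python
import array
import sys
import wave


def read_wave(path):
    """Read a mono 16-bit PCM WAV file.

    Returns (samples, sample_rate), with samples as floats in [-1, 1).
    """
    with wave.open(path, "rb") as f:
        if f.getnchannels() != 1 or f.getsampwidth() != 2:
            raise ValueError("expected a mono, 16-bit PCM WAV file")
        sample_rate = f.getframerate()
        pcm = array.array("h")
        pcm.frombytes(f.readframes(f.getnframes()))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV sample data is little-endian
    return [s / 32768.0 for s in pcm], sample_rate


def transcribe(wav_path, encoder, decoder, joiner, tokens):
    """Decode one file with a transducer model (all paths are placeholders)."""
    # Imported lazily so that read_wave() works without these packages.
    import numpy as np
    import sherpa_onnx

    recognizer = sherpa_onnx.OfflineRecognizer.from_transducer(
        encoder=encoder,
        decoder=decoder,
        joiner=joiner,
        tokens=tokens,
        num_threads=1,
    )
    samples, sample_rate = read_wave(wav_path)
    stream = recognizer.create_stream()
    stream.accept_waveform(sample_rate, np.asarray(samples, dtype=np.float32))
    recognizer.decode_stream(stream)
    return stream.result.text
```

A call would look like `transcribe("test.wav", "encoder.onnx", "decoder.onnx", "joiner.onnx", "tokens.txt")`, with the file names replaced by those of the model you downloaded.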
Useful links
- Documentation: https://k2-fsa.github.io/sherpa/onnx/
- APK for the text-to-speech engine: https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html
- APK for speaker identification: https://k2-fsa.github.io/sherpa/onnx/speaker-identification/apk.html
- APK for speech recognition: https://github.com/k2-fsa/sherpa-onnx/releases/
- Bilibili demo videos: https://search.bilibili.com/all?keyword=%E6%96%B0%E4%B8%80%E4%BB%A3Kaldi
How to reach us
Please see https://k2-fsa.github.io/sherpa/social-groups.html for the Next-gen Kaldi (新一代 Kaldi) WeChat and QQ discussion groups.