EngineX-Iluvatar/enginex-mr_series-sherpa-onnx

Archived

This repository was archived on 2025-08-26. You can view files and clone it, but you cannot push or open issues or pull requests.

Fangjun Kuang 9a68b92ce6 Increase CED's max frame length to 3000 (#798 )

so that it can process waves for up to 30 seconds.

2024-04-22 10:18:47 +08:00
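
The link between 3000 frames and 30 seconds in the commit message above can be checked with a little arithmetic. The sketch below assumes a 10 ms frame shift, which is a common choice for speech features but is not stated in the commit message; the function names are hypothetical and not part of the repository.

```python
# Assumption: each feature frame advances the audio by 10 ms.
# The commit message only states "3000 frames" and "up to 30 seconds".
FRAME_SHIFT_MS = 10


def max_duration_seconds(max_frames: int, frame_shift_ms: int = FRAME_SHIFT_MS) -> float:
    """Return the longest audio duration covered by max_frames frames."""
    return max_frames * frame_shift_ms / 1000


def fits_in_model(duration_seconds: float, max_frames: int = 3000) -> bool:
    """Check whether a wave of the given duration stays within the frame limit."""
    return duration_seconds <= max_duration_seconds(max_frames)


if __name__ == "__main__":
    print(max_duration_seconds(3000))  # duration limit under the 10 ms assumption
    print(fits_in_model(25.0))
    print(fits_in_model(31.5))
```

Under that assumption, a limit of 3000 frames corresponds to 30 seconds of audio, which matches the figure given in the commit message.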

.github

Support CED models (#792 )

2024-04-19 15:20:37 +08:00

android

Support CED models (#792 )

2024-04-19 15:20:37 +08:00

c-api-examples

Add C API for punctuation (#768 )

2024-04-14 19:02:34 +08:00

cmake

Add jieba for Chinese TTS models (#797 )

2024-04-21 14:47:13 +08:00

dotnet-examples

Support heteronyms in Chinese TTS (#738 )

2024-04-08 11:01:30 +08:00

ffmpeg-examples

Fix typos in .Net APIs (#156 )

2023-05-14 22:32:01 +08:00

go-api-examples

Add Python API for punctuation models. (#762 )

2024-04-13 13:28:17 +08:00

ios-swift

Add Go API for speaker identification (#718 )

2024-03-29 19:25:55 +08:00

ios-swiftui

Support heteronyms in Chinese TTS (#738 )

2024-04-08 11:01:30 +08:00

java-api-examples

Fix #608 (#610 )

2024-02-26 13:49:37 +08:00

kotlin-api-examples

Add Android demo for spoken language identification using Whisper multilingual models (#783 )

2024-04-18 14:33:59 +08:00

mfc-examples

Support heteronyms in Chinese TTS (#738 )

2024-04-08 11:01:30 +08:00

nodejs-examples

Add WearOS demo for audio tagging (#777 )

2024-04-17 12:22:17 +08:00

python-api-examples

Add Python API example for CED audio tagging. (#793 )

2024-04-19 18:33:18 +08:00

scripts

Increase CED's max frame length to 3000 (#798 )

2024-04-22 10:18:47 +08:00

sherpa-onnx

Add jieba for Chinese TTS models (#797 )

2024-04-21 14:47:13 +08:00

swift-api-examples

Support heteronyms in Chinese TTS (#738 )

2024-04-08 11:01:30 +08:00

toolchains

Support RISC-V (#609 )

2024-02-26 06:57:18 +08:00

wasm

Add WearOS demo for audio tagging (#777 )

2024-04-17 12:22:17 +08:00

.clang-format

add java wrapper suppport (#117 )

2023-04-15 22:17:28 +08:00

.flake8

add offline websocket server/client (#98 )

2023-03-29 21:48:45 +08:00

.gitignore

Add JNI support for spoken language identification (#782 )

2024-04-17 19:27:15 +08:00

build-aarch64-linux-gnu.sh

return timestamps for WebAssembly (#737 )

2024-04-05 20:24:27 +08:00

build-android-arm64-v8a.sh

support onnxruntime v1.17.1 (#624 )

2024-03-02 11:44:59 +08:00

build-android-armv7-eabi.sh

support onnxruntime v1.17.1 (#624 )

2024-03-02 11:44:59 +08:00

build-android-x86-64.sh

support onnxruntime v1.17.1 (#624 )

2024-03-02 11:44:59 +08:00

build-android-x86.sh

support onnxruntime v1.17.1 (#624 )

2024-03-02 11:44:59 +08:00

build-apk-two-pass.sh

Fix whisper test script for the latest onnxruntime. (#494 )

2023-12-20 11:12:12 +08:00

build-apk-vad.sh

Add Android APK for Silero VAD (#335 )

2023-09-23 20:39:13 +08:00

build-apk.sh

Release pre-built APKs (#285 )

2023-08-18 14:28:44 +08:00

build-arm-linux-gnueabihf.sh

return timestamps for WebAssembly (#737 )

2024-04-05 20:24:27 +08:00

build-ios-no-tts.sh

Support including TTS conditionally. (#699 )

2024-03-26 17:21:35 +08:00

build-ios.sh

Add Python API and Python examples for audio tagging (#753 )

2024-04-11 11:12:48 +08:00

build-kws-apk.sh

change modelscope link to github for build-kws-apki (#540 )

2024-01-24 16:40:14 +08:00

build-riscv64-linux-gnu.sh

Support using T-head-Semi/csi-nn2 for RISC-V (#637 )

2024-03-06 18:21:50 +08:00

build-swift-macos.sh

Support heteronyms in Chinese TTS (#738 )

2024-04-08 11:01:30 +08:00

build-wasm-simd-asr.sh

Add WebAssembly for ASR (#604 )

2024-02-23 17:39:11 +08:00

build-wasm-simd-kws.sh

small fixes to wasm kws. (#672 )

2024-03-18 15:28:10 +08:00

build-wasm-simd-nodejs.sh

return timestamps for WebAssembly (#737 )

2024-04-05 20:24:27 +08:00

build-wasm-simd-tts.sh

Add WebAssembly for ASR (#604 )

2024-02-23 17:39:11 +08:00

CMakeLists.txt

Add jieba for Chinese TTS models (#797 )

2024-04-21 14:47:13 +08:00

CPPLINT.cfg

Use static libraries for MFC examples (#210 )

2023-07-13 14:52:43 +08:00

LICENSE

Use standard apache 2.0 license (#53 )

2023-02-22 11:30:46 +08:00

MANIFEST.in

Fix building wheels from source. (#632 )

2024-03-04 16:39:51 +08:00

README.md

Add C++ microphone examples for audio tagging (#749 )

2024-04-10 21:00:35 +08:00

release.sh

Publish pre-compiled libs for Android. (#217 )

2023-07-15 12:25:18 +08:00

setup.py

Support spoken language identification with whisper (#694 )

2024-03-24 22:57:00 +08:00

README.md

Introduction

This repository supports running the following functions locally:

Speech-to-text (i.e., ASR); both streaming and non-streaming are supported
Text-to-speech (i.e., TTS)
Speaker identification
Speaker verification
Spoken language identification
Audio tagging
VAD (e.g., silero-vad)

on the following platforms and operating systems:

x86, x86_64, 32-bit ARM, 64-bit ARM (arm64, aarch64), RISC-V (riscv64)
Linux, macOS, Windows, openKylin
Android, WearOS
iOS
NodeJS
WebAssembly
Raspberry Pi
RV1126
LicheePi4A
VisionFive 2
Horizon Sunrise X3 Pi (旭日X3派)
etc.

with the following APIs:

C++
C
Python
Go
C#
Javascript
Java
Kotlin
Swift

Useful links

Documentation: https://k2-fsa.github.io/sherpa/onnx/
APK for the text-to-speech engine: https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html
APK for speaker identification: https://k2-fsa.github.io/sherpa/onnx/speaker-identification/apk.html
APK for speech recognition: https://github.com/k2-fsa/sherpa-onnx/releases/
Bilibili demo videos: https://search.bilibili.com/all?keyword=%E6%96%B0%E4%B8%80%E4%BB%A3Kaldi

How to reach us

Please see https://k2-fsa.github.io/sherpa/social-groups.html for the Next-gen Kaldi (新一代 Kaldi) WeChat and QQ discussion groups.

Languages

C++ 38.3%

Python 16.3%

Shell 7.6%

Kotlin 5.1%

JavaScript 5.1%

Other 27.4%