enginex-mr_series-sherpa-onnx

EngineX-Iluvatar/enginex-mr_series-sherpa-onnx


This repository has been archived on 2025-08-26. You can view files and clone it, but cannot push or open issues or pull requests.


Fangjun Kuang 70ee779410 Support using onnxruntime 1.16.0 with CUDA 11.4 on Jetson Orin NX (Linux arm64 GPU). (#1630 )

* Support using onnxruntime 1.16.0 with CUDA 11.4 on Jetson Orin NX.

The pre-built onnxruntime libs are provided by the community
and were built with the following command:

```bash
./build.sh --build_shared_lib --config Release --update \
  --build --parallel --use_cuda \
  --cuda_home /usr/local/cuda \
  --cudnn_home /usr/lib/aarch64-linux-gnu 2>&1 | tee my-log.txt
```

See also https://github.com/microsoft/onnxruntime/discussions/11226

---

Info about the board:

```
Model: NVIDIA Orin NX T801-16GB - Jetpack 5.1.4 [L4T 35.6.0]
```

```
nvidia@nvidia-desktop:~/Downloads$ head -n 1 /etc/nv_tegra_release
# R35 (release), REVISION: 6.0, GCID: 37391689, BOARD: t186ref, EABI: aarch64, DATE: Wed Aug 28 09:12:27 UTC 2024

nvidia@nvidia-desktop:~/Downloads$ uname -r
5.10.216-tegra

nvidia@nvidia-desktop:~/Downloads$ lsb_release -i -r
Distributor ID:	Ubuntu
Release:	20.04

nvidia@nvidia-desktop:~/Downloads$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:43:33_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

nvidia@nvidia-desktop:~/Downloads$ dpkg -l libcudnn8
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version              Architecture Description
+++-==============-====================-============-=================================
ii  libcudnn8      8.6.0.166-1+cuda11.4 arm64        cuDNN runtime libraries

nvidia@nvidia-desktop:~/Downloads$ dpkg -l tensorrt
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version            Architecture Description
+++-==============-==================-============-=================================
ii  tensorrt       8.5.2.2-1+cuda11.4 arm64        Meta package for TensorRT
```
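
The board listing above reports the L4T version as `35.6.0` and the cuDNN package as built for CUDA 11.4. As a minimal sketch of how those two values could be read programmatically before picking a matching onnxruntime build, the following parses the same strings shown in the listing. The function names `parse_l4t_version` and `parse_cuda_suffix` are hypothetical helpers written for this illustration, not part of sherpa-onnx or onnxruntime, and the regular expressions assume the exact formats shown above.

```python
import re


def parse_l4t_version(release_line: str) -> str:
    """Build an L4T version string from the first line of /etc/nv_tegra_release.

    Assumes the format "# R<major> (release), REVISION: <minor>.<patch>, ...".
    """
    match = re.search(
        r"#\s*R(\d+)\s*\(release\),\s*REVISION:\s*([\d.]+)", release_line
    )
    if match is None:
        raise ValueError(f"unrecognized release line: {release_line!r}")
    major, revision = match.groups()
    return f"{major}.{revision}"


def parse_cuda_suffix(package_version: str) -> str:
    """Extract the CUDA version from a Debian package version such as
    "8.6.0.166-1+cuda11.4" (the "+cudaX.Y" suffix is assumed)."""
    match = re.search(r"\+cuda(\d+\.\d+)", package_version)
    if match is None:
        raise ValueError(f"no CUDA suffix in: {package_version!r}")
    return match.group(1)


# Values copied from the board listing above.
release_line = (
    "# R35 (release), REVISION: 6.0, GCID: 37391689, BOARD: t186ref, "
    "EABI: aarch64, DATE: Wed Aug 28 09:12:27 UTC 2024"
)
print(parse_l4t_version(release_line))            # 35.6.0
print(parse_cuda_suffix("8.6.0.166-1+cuda11.4"))  # 11.4
print(parse_cuda_suffix("8.5.2.2-1+cuda11.4"))    # 11.4
```

On a real board, the first argument would come from reading `/etc/nv_tegra_release`, and the package versions from `dpkg -l` output.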

2024-12-19 18:19:53 +08:00


README.md

Supported functions

| Speech recognition | Speech synthesis |
|---|---|
| ✔️ | ✔️ |

| Speaker identification | Speaker diarization | Speaker verification |
|---|---|---|
| ✔️ | ✔️ | ✔️ |

| Spoken Language identification | Audio tagging | Voice activity detection |
|---|---|---|
| ✔️ | ✔️ | ✔️ |

| Keyword spotting | Add punctuation |
|---|---|
| ✔️ | ✔️ |

Supported platforms

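| Architecture | Android | iOS | Windows | macOS | Linux | HarmonyOS |
|---|---|---|---|---|---|---|
| x64 | ✔️ | | ✔️ | ✔️ | ✔️ | ✔️ |
| x86 | ✔️ | | ✔️ | | | |
| arm64 | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| arm32 | ✔️ | | | | ✔️ | ✔️ |
| riscv64 | | | | | ✔️ | |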

Supported programming languages

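| 1. C++ | 2. C | 3. Python | 4. JavaScript |
|---|---|---|---|
| ✔️ | ✔️ | ✔️ | ✔️ |

| 5. Java | 6. C# | 7. Kotlin | 8. Swift |
|---|---|---|---|
| ✔️ | ✔️ | ✔️ | ✔️ |

| 9. Go | 10. Dart | 11. Rust | 12. Pascal |
|---|---|---|---|
| ✔️ | ✔️ | ✔️ | ✔️ |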

For Rust support, please see sherpa-rs

It also supports WebAssembly.

Introduction

This repository supports running the following functions locally:

* Speech-to-text (i.e., ASR); both streaming and non-streaming are supported
* Text-to-speech (i.e., TTS)
* Speaker diarization
* Speaker identification
* Speaker verification
* Spoken language identification
* Audio tagging
* Voice activity detection (VAD), e.g., silero-vad
* Keyword spotting

on the following platforms and operating systems:

* x86, x86_64, 32-bit ARM, 64-bit ARM (arm64, aarch64), RISC-V (riscv64)
* Linux, macOS, Windows, openKylin
* Android, WearOS
* iOS
* HarmonyOS
* NodeJS
* WebAssembly
* Raspberry Pi
* RV1126
* LicheePi4A
* VisionFive 2
* Horizon Sunrise X3 Pi (旭日X3派)
* AXera-Pi (爱芯派)
* etc.

with the following APIs:

* C++, C, Python, Go, C#
* Java, Kotlin, JavaScript
* Swift, Rust
* Dart, Object Pascal

Links for Huggingface Spaces

You can visit the following Huggingface spaces to try sherpa-onnx without installing anything. All you need is a browser.

| Description | URL |
|---|---|
| Speaker diarization | Click me |
| Speech recognition | Click me |
| Speech recognition with Whisper | Click me |
| Speech synthesis | Click me |
| Generate subtitles | Click me |
| Audio tagging | Click me |
| Spoken language identification with Whisper | Click me |

We also have spaces built using WebAssembly. They are listed below:

| Description | Huggingface space | ModelScope space |
|---|---|---|
| Voice activity detection with silero-vad | Click me | Address |
| Real-time speech recognition (Chinese + English) with Zipformer | Click me | Address |
| Real-time speech recognition (Chinese + English) with Paraformer | Click me | Address |
| Real-time speech recognition (Chinese + English + Cantonese) with Paraformer-large | Click me | Address |
| Real-time speech recognition (English) | Click me | Address |
| VAD + speech recognition (Chinese + English + Korean + Japanese + Cantonese) with SenseVoice | Click me | Address |
| VAD + speech recognition (English) with Whisper tiny.en | Click me | Address |
| VAD + speech recognition (English) with Moonshine tiny | Click me | Address |
| VAD + speech recognition (English) with Zipformer trained with GigaSpeech | Click me | Address |
| VAD + speech recognition (Chinese) with Zipformer trained with WenetSpeech | Click me | Address |
| VAD + speech recognition (Japanese) with Zipformer trained with ReazonSpeech | Click me | Address |
| VAD + speech recognition (Thai) with Zipformer trained with GigaSpeech2 | Click me | Address |
| VAD + speech recognition (Chinese, multiple dialects) with a TeleSpeech-ASR CTC model | Click me | Address |
| VAD + speech recognition (English + Chinese, plus multiple Chinese dialects) with Paraformer-large | Click me | Address |
| VAD + speech recognition (English + Chinese, plus multiple Chinese dialects) with Paraformer-small | Click me | Address |
| Speech synthesis (English) | Click me | Address |
| Speech synthesis (German) | Click me | Address |
| Speaker diarization | Click me | Address |

Links for pre-built Android APKs

You can find pre-built Android APKs for this repository in the following table:

| Description | URL | For users in China |
|---|---|---|
| Speaker diarization | Address | Click here |
| Streaming speech recognition | Address | Click here |
| Text-to-speech | Address | Click here |
| Voice activity detection (VAD) | Address | Click here |
| VAD + non-streaming speech recognition | Address | Click here |
| Two-pass speech recognition | Address | Click here |
| Audio tagging | Address | Click here |
| Audio tagging (WearOS) | Address | Click here |
| Speaker identification | Address | Click here |
| Spoken language identification | Address | Click here |
| Keyword spotting | Address | Click here |

Links for pre-built Flutter APPs

Real-time speech recognition

| Description | URL | For users in China |
|---|---|---|
| Streaming speech recognition | Address | Click here |

Text-to-speech

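| Description | URL | For users in China |
|---|---|---|
| Android (arm64-v8a, armeabi-v7a, x86_64) | Address | Click here |
| Linux (x64) | Address | Click here |
| macOS (x64) | Address | Click here |
| macOS (arm64) | Address | Click here |
| Windows (x64) | Address | Click here |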

Note: You need to build from source for iOS.

Links for pre-built Lazarus APPs

Generating subtitles

| Description | URL | For users in China |
|---|---|---|
| Generate subtitles | Address | Click here |

Links for pre-trained models

| Description | URL |
|---|---|
| Speech recognition (speech to text, ASR) | Address |
| Text-to-speech (TTS) | Address |
| VAD | Address |
| Keyword spotting | Address |
| Audio tagging | Address |
| Speaker identification (Speaker ID) | Address |
| Spoken language identification (Language ID) | See multi-lingual Whisper ASR models from Speech recognition |
| Punctuation | Address |
| Speaker segmentation | Address |

Some pre-trained ASR models (Streaming)

Please see

for more models. The following table lists only SOME of them.

| Name | Supported Languages | Description |
|---|---|---|
| sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 | Chinese, English | See also |
| sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16 | Chinese, English | See also |
| sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23 | Chinese | Suitable for Cortex A7 CPU. See also |
| sherpa-onnx-streaming-zipformer-en-20M-2023-02-17 | English | Suitable for Cortex A7 CPU. See also |
| sherpa-onnx-streaming-zipformer-korean-2024-06-16 | Korean | See also |
| sherpa-onnx-streaming-zipformer-fr-2023-04-14 | French | See also |

Some pre-trained ASR models (Non-Streaming)

Please see

for more models. The following table lists only SOME of them.

| Name | Supported Languages | Description |
|---|---|---|
| Whisper tiny.en | English | See also |
| Moonshine tiny | English | See also |
| sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17 | Chinese, Cantonese, English, Korean, Japanese | Supports multiple Chinese dialects. See also |
| sherpa-onnx-paraformer-zh-2024-03-09 | Chinese, English | Also supports multiple Chinese dialects. See also |
| sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01 | Japanese | See also |
| sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24 | Russian | See also |
| sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24 | Russian | See also |
| sherpa-onnx-zipformer-ru-2024-09-18 | Russian | See also |
| sherpa-onnx-zipformer-korean-2024-06-24 | Korean | See also |
| sherpa-onnx-zipformer-thai-2024-06-20 | Thai | See also |
| sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04 | Chinese | Supports multiple dialects. See also |

Useful links

Documentation: https://k2-fsa.github.io/sherpa/onnx/
Bilibili demo videos: https://search.bilibili.com/all?keyword=%E6%96%B0%E4%B8%80%E4%BB%A3Kaldi

How to reach us

Please see https://k2-fsa.github.io/sherpa/social-groups.html for the Next-gen Kaldi (新一代 Kaldi) WeChat and QQ discussion groups.

Projects using sherpa-onnx

Open-LLM-VTuber

Talk to any LLM with hands-free voice interaction, voice interruption, and a Live2D talking face, running locally across platforms.

voiceapi

Streaming ASR and TTS based on FastAPI

It shows how to use the ASR and TTS Python APIs with FastAPI.

TMSpeech (腾讯会议摸鱼工具, a slacking-off tool for Tencent Meeting)

It uses streaming ASR in C# with a graphical user interface.

Video demo in Chinese: 【开源】Windows实时字幕软件（网课/开会必备）

lol互动助手 (LoL interactive assistant)

It uses the JavaScript API of sherpa-onnx along with Electron.

Video demo in Chinese: 爆了！炫神教你开打字挂！真正影响胜率的英雄联盟工具！英雄联盟的最后一块拼图！和游戏中的每个人无障碍沟通！

Languages

C++ 38.3%

Python 16.3%

Shell 7.6%

Kotlin 5.1%

JavaScript 5.1%

Other 27.4%
