This repository has been archived on 2025-08-26. You can view files and clone it, but cannot push or open issues or pull requests.
Fangjun Kuang 3bf986d08d Support non-streaming zipformer CTC ASR models (#2340)
This PR adds support for non-streaming Zipformer CTC ASR models across 
multiple language bindings, WebAssembly, examples, and CI workflows.

- Introduces a new OfflineZipformerCtcModelConfig in C/C++, Python, Swift, Java, Kotlin, Go, Dart, Pascal, and C# APIs
- Updates initialization, freeing, and recognition logic to include Zipformer CTC in WASM and Node.js
- Adds example scripts and CI steps for downloading, building, and running Zipformer CTC models

Model doc is available at
https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html
2025-07-04 15:57:07 +08:00

Introduction

This directory contains examples for the Java API of sherpa-onnx.

Usage

Non-streaming speaker diarization

./run-offline-speaker-diarization.sh

Streaming speech recognition

./run-streaming-asr-from-mic-transducer.sh
./run-streaming-decode-file-ctc.sh
./run-streaming-decode-file-ctc-hlg.sh
./run-streaming-decode-file-paraformer.sh
./run-streaming-decode-file-transducer.sh

Non-streaming speech recognition

./run-non-streaming-decode-file-dolphin-ctc.sh
./run-non-streaming-decode-file-paraformer.sh
./run-non-streaming-decode-file-sense-voice.sh
./run-non-streaming-decode-file-transducer.sh
./run-non-streaming-decode-file-whisper.sh
./run-non-streaming-decode-file-nemo.sh

Non-streaming speech recognition with homophone replacer

./run-non-streaming-decode-file-sense-voice-with-hr.sh

Non-streaming text-to-speech

./run-non-streaming-tts-piper-en.sh
./run-non-streaming-tts-coqui-de.sh
./run-non-streaming-tts-vits-zh.sh

Non-streaming text-to-speech (play audio while it is being generated)

./run-non-streaming-tts-piper-en-with-callback.sh

Spoken language identification

./run-spoken-language-identification-whisper.sh

Add punctuation to text

The punctuation model supports both English and Chinese.

./run-add-punctuation-zh-en.sh

Audio tagging

./run-audio-tagging-zipformer-from-file.sh
./run-audio-tagging-ced-from-file.sh

Speaker identification

./run-speaker-identification.sh
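
For readers new to the idea, speaker identification is commonly done by comparing a speaker embedding of the query audio against embeddings of enrolled speakers. The sketch below illustrates only that matching step with cosine similarity, using plain Java and hand-written toy vectors. It does not call sherpa-onnx: the class `SpeakerMatchSketch`, its methods, and the threshold value are all hypothetical names chosen for illustration, and the script above may implement matching differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the sherpa-onnx API.
class SpeakerMatchSketch {
    // Cosine similarity between two vectors of equal length.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0; // a zero vector matches nothing
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the enrolled name with the highest similarity to the query,
    // or an empty string when no score reaches the threshold.
    static String identify(Map<String, float[]> enrolled, float[] query, double threshold) {
        String best = "";
        double bestScore = threshold;
        for (Map.Entry<String, float[]> entry : enrolled.entrySet()) {
            double score = cosine(entry.getValue(), query);
            if (score >= bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy 3-dimensional "embeddings"; real embeddings come from a model.
        Map<String, float[]> enrolled = new LinkedHashMap<>();
        enrolled.put("alice", new float[] {1f, 0f, 0f});
        enrolled.put("bob", new float[] {0f, 1f, 0f});

        System.out.println(identify(enrolled, new float[] {0.9f, 0.1f, 0f}, 0.5)); // alice
        System.out.println(identify(enrolled, new float[] {0f, 0f, 1f}, 0.5).isEmpty()); // true
    }
}
```

The threshold trades false accepts against false rejects; 0.5 here is an arbitrary value for the toy data, not a recommended setting.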

VAD with a microphone

./run-vad-from-mic.sh

VAD with a microphone + Non-streaming SenseVoice for speech recognition

./run-vad-from-mic-non-streaming-sense-voice.sh

VAD with a microphone + Non-streaming Paraformer for speech recognition

./run-vad-from-mic-non-streaming-paraformer.sh

VAD with a microphone + Non-streaming Whisper tiny.en for speech recognition

./run-vad-from-mic-non-streaming-whisper.sh

VAD (Remove silence)

./run-vad-remove-slience.sh
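
To make "remove silence" concrete, here is a minimal sketch that drops low-energy frames from an array of audio samples. It deliberately swaps in a naive RMS energy threshold in place of a model-based VAD, so it is not what the script above runs (the script presumably drives a VAD model through sherpa-onnx). The class `EnergySilenceRemoverSketch`, the frame size, and the threshold are hypothetical choices for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration only; a naive energy gate, not a model-based VAD.
class EnergySilenceRemoverSketch {
    // Splits the samples into fixed-size frames and keeps only the frames
    // whose RMS energy is at or above rmsThreshold.
    static float[] removeSilence(float[] samples, int frameSize, double rmsThreshold) {
        if (frameSize <= 0) {
            throw new IllegalArgumentException("frameSize must be positive");
        }
        float[] kept = new float[samples.length];
        int keptCount = 0;
        for (int start = 0; start < samples.length; start += frameSize) {
            int end = Math.min(start + frameSize, samples.length);
            double sumSquares = 0;
            for (int i = start; i < end; i++) {
                sumSquares += (double) samples[i] * samples[i];
            }
            double rms = Math.sqrt(sumSquares / (end - start));
            if (rms >= rmsThreshold) {
                System.arraycopy(samples, start, kept, keptCount, end - start);
                keptCount += end - start;
            }
        }
        return Arrays.copyOf(kept, keptCount);
    }

    public static void main(String[] args) {
        // Four silent samples, four loud samples, four silent samples.
        float[] audio = {
            0f, 0f, 0f, 0f,
            0.5f, -0.5f, 0.5f, -0.5f,
            0f, 0f, 0f, 0f
        };
        float[] speech = removeSilence(audio, 4, 0.1);
        System.out.println(speech.length); // 4
    }
}
```

An energy gate like this fails on noisy recordings, where background noise can exceed the threshold; that weakness is one reason to prefer a trained VAD model.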

VAD + Non-streaming Dolphin CTC for speech recognition

./run-vad-non-streaming-dolphin-ctc.sh

VAD + Non-streaming SenseVoice for speech recognition

./run-vad-non-streaming-sense-voice.sh

VAD + Non-streaming Paraformer for speech recognition

./run-vad-non-streaming-paraformer.sh

Keyword spotter

./run-kws-from-file.sh