Commit Graph

94 Commits

Author SHA1 Message Date
Fangjun Kuang
103e93d9f6 Add Java and Kotlin API for NeMo Canary models (#2359)
Add support for the NeMo Canary model in both Java and Kotlin APIs, wiring it through
JNI and updating examples and CI.

- Introduce OfflineCanaryModelConfig in Kotlin and Java with builder patterns
- Extend OfflineRecognizer to accept and apply the new canary config via setConfig
- Update JNI binding (GetOfflineConfig) and getOfflineModelConfig mapping (type 32), plus examples and CI workflows
2025-07-08 13:45:26 +08:00
Fangjun Kuang
3bf986d08d Support non-streaming zipformer CTC ASR models (#2340)
This PR adds support for non-streaming Zipformer CTC ASR models across 
multiple language bindings, WebAssembly, examples, and CI workflows.

- Introduces a new OfflineZipformerCtcModelConfig in C/C++, Python, Swift, Java, Kotlin, Go, Dart, Pascal, and C# APIs
- Updates initialization, freeing, and recognition logic to include Zipformer CTC in WASM and Node.js
- Adds example scripts and CI steps for downloading, building, and running Zipformer CTC models

Model doc is available at
https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html
2025-07-04 15:57:07 +08:00
Fangjun Kuang
bda427f4b2 Add API to get version information (#2309) 2025-06-25 00:22:21 +08:00
Fangjun Kuang
6982b86c66 Support extra languages in multi-lang kokoro tts (#2303) 2025-06-20 11:22:52 +08:00
Fangjun Kuang
2b2788332e Add C++ support for UVR models (#2269) 2025-06-01 17:22:08 +08:00
Fangjun Kuang
ff6f3b17ac Use jlong explicitly in jni. (#2229) 2025-05-20 15:29:47 +08:00
esavin
aeb311db50 Expose dither for JNI (#2215) 2025-05-14 23:38:25 +08:00
Fangjun Kuang
2e9e0b4e9e Add Android demo for real-time ASR with non-streaming ASR models. (#2214) 2025-05-14 19:10:44 +08:00
Fangjun Kuang
e537094b07 Add Kotlin and Java API for homophone replacer (#2166)
* Add Kotlin API for homophone replacer

* Add Java API for homophone replacer
2025-04-29 22:55:21 +08:00
Fangjun Kuang
e3280027f9 Support decoding multiple streams in Java API. (#2149) 2025-04-25 11:18:57 +08:00
Fangjun Kuang
1c3a383002 Fix a typo in the JNI for Android. (#2108) 2025-04-09 09:02:41 +08:00
Fangjun Kuang
eee5575836 Add Kotlin and Java API for Dolphin CTC models (#2086) 2025-04-02 21:16:14 +08:00
niansa/tuxifan
9d23606ee6 Allow building repository as CMake subdirectory (#2059)
* Use PROJECT_SOURCE_DIR rather than CMAKE_SOURCE_DIR to allow building as subdirectory

* Also use PROJECT_SOURCE_DIR instead of CMAKE_SOURCE_DIR in c/cxx api examples

* Only build examples by default when not building as subdirectory

* Do not suggest building binaries either

---------

Co-authored-by: user <user@mail.tld>
2025-03-29 06:27:59 +08:00
Fangjun Kuang
ed8e6c9aed Add Kotlin API for speech enhancement GTCRN models (#2008) 2025-03-16 10:41:01 +08:00
Fangjun Kuang
f5dfcf8d2f Add Kotlin and Java API for online punctuation models (#1936) 2025-02-27 16:52:36 +08:00
Fangjun Kuang
337d5f7a80 Release v1.10.46 (#1929) 2025-02-26 19:19:33 +08:00
ivan provalov
4801094133 JNI Exception Handling (#1452) 2025-02-19 23:02:28 +08:00
Fangjun Kuang
d148860d2c Add Kotlin and Java API for FireRedAsr AED model (#1870) 2025-02-17 10:50:25 +08:00
Fangjun Kuang
69f489f0cd Support scaling the duration of a pause in TTS. (#1820) 2025-02-08 12:47:26 +08:00
Fangjun Kuang
4372a7a7b0 Add Java and Kotlin API for Kokoro TTS 1.0 (#1798) 2025-02-07 09:59:27 +08:00
Fangjun Kuang
c84a833863 Add C++ and Python API for Kokoro 1.0 multilingual TTS model (#1795) 2025-02-06 22:57:13 +08:00
Fangjun Kuang
8b989a851c Fix keyword spotting. (#1689)
Reset the stream right after detecting a keyword
2025-01-20 16:41:10 +08:00
Fangjun Kuang
99cef4198b Add Kotlin and Java API for Kokoro TTS models (#1728) 2025-01-17 17:36:13 +08:00
Fangjun Kuang
3422b9388d Add Kotlin API for Matcha-TTS models. (#1668) 2024-12-31 19:20:52 +08:00
Fangjun Kuang
e639c70d78 Support linking onnxruntime statically for Android (#1619) 2024-12-14 09:53:44 +08:00
Fangjun Kuang
bd4b223920 Add Kotlin and Java API for Moonshine models (#1474) 2024-10-26 22:30:29 +08:00
Fangjun Kuang
94b26ff07c Android JNI support for speaker diarization (#1421) 2024-10-12 13:03:48 +08:00
Fangjun Kuang
1ed803adc1 Dart API for speaker diarization (#1418) 2024-10-11 21:17:41 +08:00
Fangjun Kuang
2d412b1190 Kotlin API for speaker diarization (#1415) 2024-10-11 14:41:53 +08:00
Fangjun Kuang
e7ffcbd677 Add APIs about max speech duration in VAD for various programming languages (#1349) 2024-09-14 12:30:13 +08:00
RGdevz
1f29e4a1a9 Throw error instead of exit (#1323) 2024-09-06 09:59:21 +08:00
Fangjun Kuang
ca729faebf Support reading multi-channel wave files with 8/16/32-bit encoded samples (#1258) 2024-08-15 14:54:43 +08:00
Robin Zhong
62c4d4ab62 Add emotion, event of SenseVoice. (#1257)
* Add emotion, event of SenseVoice.

* Fix tokens size check and update java api.

https://github.com/k2-fsa/sherpa-onnx/pull/1257
2024-08-14 15:50:13 +08:00
ivan provalov
9f06b059d7 Update offline-recognizer.cc (#1253)
Add a setConfig method to JNI to support setting a config on a previously initialized offline recognizer.
2024-08-13 23:04:51 +08:00
Fangjun Kuang
94e256244d Add blank penalty for various language bindings. (#1234) 2024-08-08 10:43:31 +08:00
Fangjun Kuang
dd300b1de5 Add Java and Kotlin API for sense voice (#1164) 2024-07-22 14:08:40 +08:00
Fangjun Kuang
c2cc9dec58 Add Flush to VAD so that the last segment can be detected. (#1099) 2024-07-09 16:15:56 +08:00
Fangjun Kuang
a25075101c Build sherpa-onnx as a single shared library (#1078)
When `-D BUILD_SHARED_LIBS=ON` is passed to `cmake`, it builds a single shared library.

Specifically, 

- For C APIs, it builds `libsherpa-onnx-c-api.so`
- For Python APIs, it builds `_sherpa_onnx.cpython-xx-xx.so`
- For Kotlin and Java APIs, it builds `libsherpa-onnx-jni.so`

There is no `libsherpa-onnx-core.so` any longer.

Note it affects only shared libraries.
2024-07-06 16:41:54 +08:00
Manix
55decb7bee Add config for TensorRT and CUDA execution provider (#992)
Signed-off-by: manickavela1998@gmail.com <manickavela1998@gmail.com>
Signed-off-by: manickavela1998@gmail.com <manickavela.arumugam@uniphore.com>
2024-07-05 15:18:37 +08:00
Fangjun Kuang
2f8c489698 Publish pre-built jni libs for windows and osx (#1056) 2024-06-25 11:59:04 +08:00
Fangjun Kuang
9dd0e03568 Enable to stop TTS generation (#1041) 2024-06-22 18:18:36 +08:00
Fangjun Kuang
6789c909d2 Inverse text normalization API of streaming ASR for various programming languages (#1022) 2024-06-18 13:42:17 +08:00
Fangjun Kuang
6e09933d99 Inverse text normalization API for other programming languages (#1019) 2024-06-17 17:02:39 +08:00
Fangjun Kuang
fd5a0d1e00 Add C++ runtime for Tele-AI/TeleSpeech-ASR (#970) 2024-06-05 00:26:40 +08:00
Fangjun Kuang
f1cff83ef9 Add address sanitizer and undefined behavior sanitizer (#951) 2024-05-31 13:17:01 +08:00
Fangjun Kuang
bcaa6df389 Add VAD demo for Java API (#928) 2024-05-28 14:59:47 +08:00
Wei Kang
b012b78ceb Encode hotwords in C++ side (#828)
* Encode hotwords in C++ side
2024-05-20 19:41:36 +08:00
Fangjun Kuang
65635b09d8 Fix a typo in jni (#885) 2024-05-16 14:31:45 +08:00
linziguan
d2745698c5 Support building JNI on Windows (#881) 2024-05-16 06:25:53 +08:00
Fangjun Kuang
db85b2c1d8 Add Android APKs for NeMo CTC models. (#866) 2024-05-12 14:58:36 +08:00