Fangjun Kuang
6122a678f5
Refactor exporting NeMo models ( #2362 )
...
Refactors and extends model export support to include new NeMo Parakeet TDT int8 variants for English and Japanese, updating the Kotlin API, export scripts, test runners, and CI workflows.
- Added support for two new int8 model types in OfflineRecognizer.kt.
- Enhanced Python export scripts to perform dynamic quantization and metadata injection.
- Updated shell scripts and GitHub workflows to package, test, and publish int8 model artifacts.
2025-07-09 16:02:12 +08:00
Fangjun Kuang
103e93d9f6
Add Java and Kotlin API for NeMo Canary models ( #2359 )
...
Add support for the NeMo Canary model in both Java and Kotlin APIs, wiring it through
JNI and updating examples and CI.
- Introduce OfflineCanaryModelConfig in Kotlin and Java with builder patterns
- Extend OfflineRecognizer to accept and apply the new canary config via setConfig
- Update JNI binding (GetOfflineConfig) and getOfflineModelConfig mapping (type 32),
plus examples and CI workflows
2025-07-08 13:45:26 +08:00
Fangjun Kuang
3bf986d08d
Support non-streaming zipformer CTC ASR models ( #2340 )
...
This PR adds support for non-streaming Zipformer CTC ASR models across
multiple language bindings, WebAssembly, examples, and CI workflows.
- Introduces a new OfflineZipformerCtcModelConfig in C/C++, Python, Swift, Java, Kotlin, Go, Dart, Pascal, and C# APIs
- Updates initialization, freeing, and recognition logic to include Zipformer CTC in WASM and Node.js
- Adds example scripts and CI steps for downloading, building, and running Zipformer CTC models
Model doc is available at
https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html
2025-07-04 15:57:07 +08:00
wenjie.Li
ef16455cb5
Add sherpa-onnx-streaming-zipformer-zh-int8-2025-06-30 to android ASR apk ( #2336 )
2025-07-03 11:31:13 +08:00
Fangjun Kuang
9fe25cc06f
Fix VAD+ASR C++ example. ( #2335 )
...
It was not able to handle short audio, e.g., 2.1 seconds.
2025-07-02 15:52:49 +08:00
Fangjun Kuang
bda427f4b2
Add API to get version information ( #2309 )
2025-06-25 00:22:21 +08:00
Fangjun Kuang
6982b86c66
Support extra languages in multi-lang kokoro tts ( #2303 )
2025-06-20 11:22:52 +08:00
Fangjun Kuang
d8bb20710d
Add script to build APK for simulated-streaming-asr. ( #2220 )
2025-05-15 15:40:22 +08:00
esavin
aeb311db50
Expose dither for JNI ( #2215 )
2025-05-14 23:38:25 +08:00
Fangjun Kuang
e537094b07
Add Kotlin and Java API for homophone replacer ( #2166 )
...
* Add Kotlin API for homophone replacer
* Add Java API for homophone replacer
2025-04-29 22:55:21 +08:00
Fangjun Kuang
7cbb1bc433
Upload more onnx ASR models ( #2141 )
2025-04-21 18:57:41 +08:00
Fangjun Kuang
e3bce847c0
Support running sherpa-onnx with RK NPU on Android ( #2124 )
2025-04-15 16:42:28 +08:00
Fangjun Kuang
eee5575836
Add Kotlin and Java API for Dolphin CTC models ( #2086 )
2025-04-02 21:16:14 +08:00
Fangjun Kuang
ed8e6c9aed
Add Kotlin API for speech enhancement GTCRN models ( #2008 )
2025-03-16 10:41:01 +08:00
Fangjun Kuang
f5dfcf8d2f
Add Kotlin and Java API for online punctuation models ( #1936 )
2025-02-27 16:52:36 +08:00
Fangjun Kuang
d148860d2c
Add Kotlin and Java API for FireRedAsr AED model ( #1870 )
2025-02-17 10:50:25 +08:00
Fangjun Kuang
69f489f0cd
Support scaling the duration of a pause in TTS. ( #1820 )
2025-02-08 12:47:26 +08:00
Fangjun Kuang
a52b819fb5
Add Android demo for Kokoro TTS 1.0 ( #1799 )
2025-02-07 13:07:30 +08:00
Fangjun Kuang
4372a7a7b0
Add Java and Kotlin API for Kokoro TTS 1.0 ( #1798 )
2025-02-07 09:59:27 +08:00
Fangjun Kuang
8b989a851c
Fix keyword spotting. ( #1689 )
...
Reset the stream right after detecting a keyword
2025-01-20 16:41:10 +08:00
Fangjun Kuang
99cef4198b
Add Kotlin and Java API for Kokoro TTS models ( #1728 )
2025-01-17 17:36:13 +08:00
Fangjun Kuang
1fe5fe495f
Add Android demo for MatchaTTS models. ( #1683 )
2025-01-06 06:44:09 +08:00
Fangjun Kuang
3422b9388d
Add Kotlin API for Matcha-TTS models. ( #1668 )
2024-12-31 19:20:52 +08:00
Fangjun Kuang
08d771337b
Add a byte-level BPE Chinese+English non-streaming zipformer model ( #1645 )
2024-12-24 16:56:49 +08:00
Fangjun Kuang
be87f866f3
Use aar in Android Java demo. ( #1616 )
2024-12-12 18:26:54 +08:00
Fangjun Kuang
4dc4f1a708
Provide sherpa-onnx.aar for Android ( #1615 )
2024-12-12 16:59:00 +08:00
Fangjun Kuang
a3c89aa0d8
Add two-pass ASR Android APKs for Moonshine models. ( #1499 )
2024-10-31 17:54:16 +08:00
Fangjun Kuang
bd4b223920
Add Kotlin and Java API for Moonshine models ( #1474 )
2024-10-26 22:30:29 +08:00
Fangjun Kuang
707cf792c5
Add GigaAM NeMo transducer model for Russian ASR ( #1467 )
2024-10-25 15:20:13 +08:00
Fangjun Kuang
b41f6d2c94
Support GigaAM CTC models for Russian ASR ( #1464 )
...
See also https://github.com/salute-developers/GigaAM
2024-10-25 10:55:16 +08:00
Fangjun Kuang
5a22f74b2b
Android demo for speaker diarization ( #1423 )
2024-10-13 14:02:57 +08:00
Fangjun Kuang
2d412b1190
Kotlin API for speaker diarization ( #1415 )
2024-10-11 14:41:53 +08:00
Fangjun Kuang
576a3aa90d
Add non-streaming ONNX models for Russian ASR ( #1358 )
2024-09-18 13:43:49 +08:00
Fangjun Kuang
e7ffcbd677
Add APIs about max speech duration in VAD for various programming languages ( #1349 )
2024-09-14 12:30:13 +08:00
Robin Zhong
d8001d6edc
Update Kotlin API to better release native objects and add user-friendly APIs. ( #1275 )
2024-08-22 19:18:11 +08:00
Robin Zhong
62c4d4ab62
Add emotion, event of SenseVoice. ( #1257 )
...
* Add emotion, event of SenseVoice.
* Fix tokens size check and update java api.
https://github.com/k2-fsa/sherpa-onnx/pull/1257
2024-08-14 15:50:13 +08:00
Fangjun Kuang
94e256244d
Add blank penalty for various language bindings. ( #1234 )
2024-08-08 10:43:31 +08:00
Fangjun Kuang
35c1b4a7a9
Add ReazonSpeech Japanese pre-trained model ( #1203 )
2024-08-02 10:21:24 +08:00
Fangjun Kuang
dd300b1de5
Add Java and Kotlin API for sense voice ( #1164 )
2024-07-22 14:08:40 +08:00
Fangjun Kuang
fa07bbc176
Add APK for small paraformer ( #1133 )
2024-07-15 19:44:36 +08:00
Fangjun Kuang
b5093e27f9
Fix publishing apks to huggingface ( #1121 )
...
Save APKs for each release in a separate directory.
Hugging Face does not allow a directory to contain more than 1000 files.
Because we have so many TTS models, and each model needs APKs for 4 different ABIs,
we work around this limit by placing the APKs for different releases in separate directories.
2024-07-13 16:14:00 +08:00
Fangjun Kuang
dd0ff2ca06
Support onnxruntime 1.18.0 ( #906 )
2024-07-10 17:05:26 +08:00
Fangjun Kuang
c2cc9dec58
Add Flush to VAD so that the last segment can be detected. ( #1099 )
2024-07-09 16:15:56 +08:00
Fangjun Kuang
8c4f576f1b
Support .Net framework 2.0 ( #1062 )
2024-06-28 11:27:19 +08:00
Fangjun Kuang
1f95bff719
Add non-streaming zipformer Android APK ( #1052 )
2024-06-24 16:22:19 +08:00
Fangjun Kuang
36336b31f4
Build Android APK for Thai ( #1036 )
2024-06-20 18:05:57 +08:00
Fangjun Kuang
6789c909d2
Inverse text normalization API of streaming ASR for various programming languages ( #1022 )
2024-06-18 13:42:17 +08:00
Fangjun Kuang
6e09933d99
Inverse text normalization API for other programming languages ( #1019 )
2024-06-17 17:02:39 +08:00
Fangjun Kuang
e1201225f2
Add Android APK for Korean ( #1015 )
2024-06-16 19:17:15 +08:00
Fangjun Kuang
fd5a0d1e00
Add C++ runtime for Tele-AI/TeleSpeech-ASR ( #970 )
2024-06-05 00:26:40 +08:00