59 Commits

Author SHA1 Message Date
Fangjun Kuang
3bf986d08d Support non-streaming zipformer CTC ASR models (#2340)
This PR adds support for non-streaming Zipformer CTC ASR models across multiple language bindings, WebAssembly, examples, and CI workflows.

- Introduces a new OfflineZipformerCtcModelConfig in C/C++, Python, Swift, Java, Kotlin, Go, Dart, Pascal, and C# APIs
- Updates initialization, freeing, and recognition logic to include Zipformer CTC in WASM and Node.js
- Adds example scripts and CI steps for downloading, building, and running Zipformer CTC models

Model doc is available at
https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html
2025-07-04 15:57:07 +08:00
Fangjun Kuang
bda427f4b2 Add API to get version information (#2309) 2025-06-25 00:22:21 +08:00
Fangjun Kuang
6982b86c66 Support extra languages in multi-lang kokoro tts (#2303) 2025-06-20 11:22:52 +08:00
Fangjun Kuang
63d01a9534 Add Swift API for homophone replacer. (#2164) 2025-04-29 18:50:41 +08:00
Fangjun Kuang
74f402e490 Add Swift API for Dolphin CTC models (#2091) 2025-04-03 00:03:11 +08:00
Fangjun Kuang
0aacf02dd8 Add C++ runtime for vocos (#2014) 2025-03-17 17:05:15 +08:00
Fangjun Kuang
c12d1d88c0 Add Swift API for speech enhancement GTCRN models (#1989) 2025-03-11 18:03:13 +08:00
Fangjun Kuang
b03f6e6e8c Add Swift API for FireRedAsr AED Model (#1876) 2025-02-17 15:16:23 +08:00
Fangjun Kuang
69f489f0cd Support scaling the duration of a pause in TTS. (#1820) 2025-02-08 12:47:26 +08:00
Fangjun Kuang
e2e0f25100 Add Swift API for Kokoro TTS 1.0 (#1803) 2025-02-07 15:06:34 +08:00
Fangjun Kuang
8b989a851c Fix keyword spotting. (#1689)
Reset the stream right after detecting a keyword
2025-01-20 16:41:10 +08:00
Fangjun Kuang
ad61ad6ff5 Add Swift API for Kokoro TTS models (#1721) 2025-01-16 16:47:37 +08:00
Fangjun Kuang
6f085babcc Add Swift API for MatchaTTS models. (#1684) 2025-01-06 07:23:45 +08:00
yujinqiu
5c2cc48f50 Add swift online punctuation (#1661) 2024-12-31 11:26:32 +08:00
Fangjun Kuang
4a4659aa4f Add Swift API for Moonshine models. (#1477) 2024-10-27 08:19:01 +08:00
Fangjun Kuang
1d061df355 WebAssembly example for speaker diarization (#1411) 2024-10-10 22:14:45 +08:00
Fangjun Kuang
1571344509 Swift API for speaker diarization (#1404) 2024-10-09 23:25:39 +08:00
Fangjun Kuang
d8809b520e Fix CI errors introduced by supporting loading keywords from buffers (#1366) 2024-09-20 19:04:21 +08:00
Fangjun Kuang
73c90ec871 Fix swift example for generating subtitles. (#1362)
We need to invoke vad.flush() at the end.
2024-09-20 11:44:25 +08:00
Fangjun Kuang
e7ffcbd677 Add APIs about max speech duration in VAD for various programming languages (#1349) 2024-09-14 12:30:13 +08:00
Fangjun Kuang
5d761712db Support lang/emotion/event results from SenseVoice in Swift API. (#1346) 2024-09-13 19:43:46 +08:00
Fangjun Kuang
544857b097 Fix building (#1343) 2024-09-13 13:33:52 +08:00
Fangjun Kuang
94e256244d Add blank penalty for various language bindings. (#1234) 2024-08-08 10:43:31 +08:00
Fangjun Kuang
6422966a7f Support passing TTS callback in Swift API (#1218) 2024-08-05 14:06:21 +08:00
Fangjun Kuang
4e6aeff07e Refactor C API to prefix each API with SherpaOnnx. (#1171) 2024-07-26 18:47:02 +08:00
Fangjun Kuang
25f0a10468 Add C++ runtime for SenseVoice models (#1148) 2024-07-18 22:54:18 +08:00
Fangjun Kuang
b2c283fa2b Add Swift API for adding punctuations to text. (#1132) 2024-07-15 15:30:40 +08:00
Fangjun Kuang
d928f77d0e Add timestamps about streaming models for Swift API (#1113) 2024-07-12 17:39:46 +08:00
Fangjun Kuang
c2cc9dec58 Add Flush to VAD so that the last segment can be detected. (#1099) 2024-07-09 16:15:56 +08:00
Fangjun Kuang
ab21131f7f Swift API for keyword spotting. (#1027) 2024-06-18 16:51:30 +08:00
Fangjun Kuang
6789c909d2 Inverse text normalization API of streaming ASR for various programming languages (#1022) 2024-06-18 13:42:17 +08:00
Fangjun Kuang
6e09933d99 Inverse text normalization API for other programming languages (#1019) 2024-06-17 17:02:39 +08:00
Fangjun Kuang
fd5a0d1e00 Add C++ runtime for Tele-AI/TeleSpeech-ASR (#970) 2024-06-05 00:26:40 +08:00
Fangjun Kuang
f8dbc10146 Fix CI (#964) 2024-06-04 17:05:49 +08:00
Daniel Breedeveld
e21caab759 fix: Typo 'maxNumSenetences' in SherpaOnnx.swift (#939) 2024-05-29 17:45:00 +08:00
Fangjun Kuang
8af2af8466 Add tail_paddings to Whisper C API. (#886) 2024-05-17 09:20:07 +08:00
Fangjun Kuang
6686c7d3e6 Add dict_dir arg to c api to support Chinese TTS models using jieba (#809) 2024-04-25 12:28:31 +08:00
Fangjun Kuang
a5f8fbc83f Support heteronyms in Chinese TTS (#738) 2024-04-08 11:01:30 +08:00
Fangjun Kuang
dbff2eaadb Add C API for streaming HLG decoding (#734) 2024-04-05 10:31:20 +08:00
Fangjun Kuang
83a10a55a5 Add Swift API for spoken language identification. (#696) 2024-03-25 16:22:25 +08:00
Fangjun Kuang
acf0975153 Support whisper language/task in various language bindings. (#679) 2024-03-20 16:43:35 +08:00
ductranminh
665b869f03 Add context biasing for mobile (#568) 2024-02-01 21:33:22 +08:00
Fangjun Kuang
e475e750ac Support streaming zipformer CTC (#496)
* Support streaming zipformer CTC

* test online zipformer2 CTC

* Update doc of sherpa-onnx.cc

* Add Python APIs for streaming zipformer2 ctc

* Add Python API examples for streaming zipformer2 ctc

* Swift API for streaming zipformer2 CTC

* NodeJS API for streaming zipformer2 CTC

* Kotlin API for streaming zipformer2 CTC

* Golang API for streaming zipformer2 CTC

* C# API for streaming zipformer2 CTC

* Release v1.9.6
2023-12-22 13:46:33 +08:00
Fangjun Kuang
62dc3c3e46 Use piper-phonemize to convert text to token IDs (#453) 2023-11-30 23:57:43 +08:00
Fangjun Kuang
94ef6929bb Text-to-speech for iOS (#443) 2023-11-23 21:38:32 +08:00
Fangjun Kuang
2f22e6ed63 Add Swift API for TTS (#439) 2023-11-22 16:04:26 +08:00
yujinqiu
d01682d968 Add vad clear api for better performance (#366)
* Add vad clear api for better performance

* rename to make naming consistent and remove macro

* Fix linker error

* Fix Vad.kt
2023-10-16 14:40:47 +08:00
yujinqiu
f6566c8ace Expose VAD isDetected api to Swift (#356) 2023-10-12 15:11:58 +08:00
zr_jin
b640c295b9 Swift API for hotwords support (#331) 2023-09-21 20:32:13 +08:00
Fangjun Kuang
692a47dd80 Add Swift example for generating subtitles (#318) 2023-09-18 15:16:54 +08:00