Fangjun Kuang
3bf986d08d
Support non-streaming zipformer CTC ASR models ( #2340 )
This PR adds support for non-streaming Zipformer CTC ASR models across
multiple language bindings, WebAssembly, examples, and CI workflows.
- Introduces a new OfflineZipformerCtcModelConfig in C/C++, Python, Swift, Java, Kotlin, Go, Dart, Pascal, and C# APIs
- Updates initialization, freeing, and recognition logic to include Zipformer CTC in WASM and Node.js
- Adds example scripts and CI steps for downloading, building, and running Zipformer CTC models
Model doc is available at
https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html
2025-07-04 15:57:07 +08:00
Fangjun Kuang
bda427f4b2
Add API to get version information ( #2309 )
2025-06-25 00:22:21 +08:00
Fangjun Kuang
6982b86c66
Support extra languages in multi-lang kokoro tts ( #2303 )
2025-06-20 11:22:52 +08:00
Fangjun Kuang
63d01a9534
Add Swift API for homophone replacer. ( #2164 )
2025-04-29 18:50:41 +08:00
Fangjun Kuang
74f402e490
Add Swift API for Dolphin CTC models ( #2091 )
2025-04-03 00:03:11 +08:00
Fangjun Kuang
0aacf02dd8
Add C++ runtime for vocos ( #2014 )
2025-03-17 17:05:15 +08:00
Fangjun Kuang
c12d1d88c0
Add Swift API for speech enhancement GTCRN models ( #1989 )
2025-03-11 18:03:13 +08:00
Fangjun Kuang
b03f6e6e8c
Add Swift API for FireRedAsr AED Model ( #1876 )
2025-02-17 15:16:23 +08:00
Fangjun Kuang
69f489f0cd
Support scaling the duration of a pause in TTS. ( #1820 )
2025-02-08 12:47:26 +08:00
Fangjun Kuang
e2e0f25100
Add Swift API for Kokoro TTS 1.0 ( #1803 )
2025-02-07 15:06:34 +08:00
Fangjun Kuang
8b989a851c
Fix keyword spotting. ( #1689 )
Reset the stream right after detecting a keyword
2025-01-20 16:41:10 +08:00
Fangjun Kuang
ad61ad6ff5
Add Swift API for Kokoro TTS models ( #1721 )
2025-01-16 16:47:37 +08:00
Fangjun Kuang
6f085babcc
Add Swift API for MatchaTTS models. ( #1684 )
2025-01-06 07:23:45 +08:00
yujinqiu
5c2cc48f50
Add Swift online punctuation ( #1661 )
2024-12-31 11:26:32 +08:00
Fangjun Kuang
4a4659aa4f
Add Swift API for Moonshine models. ( #1477 )
2024-10-27 08:19:01 +08:00
Fangjun Kuang
1d061df355
WebAssembly example for speaker diarization ( #1411 )
2024-10-10 22:14:45 +08:00
Fangjun Kuang
1571344509
Swift API for speaker diarization ( #1404 )
2024-10-09 23:25:39 +08:00
Fangjun Kuang
d8809b520e
Fix CI errors introduced by supporting loading keywords from buffers ( #1366 )
2024-09-20 19:04:21 +08:00
Fangjun Kuang
73c90ec871
Fix swift example for generating subtitles. ( #1362 )
We need to invoke vad.flush() at the end.
2024-09-20 11:44:25 +08:00
Fangjun Kuang
e7ffcbd677
Add APIs about max speech duration in VAD for various programming languages ( #1349 )
2024-09-14 12:30:13 +08:00
Fangjun Kuang
5d761712db
Support lang/emotion/event results from SenseVoice in Swift API. ( #1346 )
2024-09-13 19:43:46 +08:00
Fangjun Kuang
544857b097
Fix building ( #1343 )
2024-09-13 13:33:52 +08:00
Fangjun Kuang
94e256244d
Add blank penalty for various language bindings. ( #1234 )
2024-08-08 10:43:31 +08:00
Fangjun Kuang
6422966a7f
Support passing TTS callback in Swift API ( #1218 )
2024-08-05 14:06:21 +08:00
Fangjun Kuang
4e6aeff07e
Refactor C API to prefix each API with SherpaOnnx. ( #1171 )
2024-07-26 18:47:02 +08:00
Fangjun Kuang
25f0a10468
Add C++ runtime for SenseVoice models ( #1148 )
2024-07-18 22:54:18 +08:00
Fangjun Kuang
b2c283fa2b
Add Swift API for adding punctuations to text. ( #1132 )
2024-07-15 15:30:40 +08:00
Fangjun Kuang
d928f77d0e
Add timestamps about streaming models for Swift API ( #1113 )
2024-07-12 17:39:46 +08:00
Fangjun Kuang
c2cc9dec58
Add Flush to VAD so that the last segment can be detected. ( #1099 )
2024-07-09 16:15:56 +08:00
Fangjun Kuang
ab21131f7f
Swift API for keyword spotting. ( #1027 )
2024-06-18 16:51:30 +08:00
Fangjun Kuang
6789c909d2
Inverse text normalization API of streaming ASR for various programming languages ( #1022 )
2024-06-18 13:42:17 +08:00
Fangjun Kuang
6e09933d99
Inverse text normalization API for other programming languages ( #1019 )
2024-06-17 17:02:39 +08:00
Fangjun Kuang
fd5a0d1e00
Add C++ runtime for Tele-AI/TeleSpeech-ASR ( #970 )
2024-06-05 00:26:40 +08:00
Fangjun Kuang
f8dbc10146
Fix CI ( #964 )
2024-06-04 17:05:49 +08:00
Daniel Breedeveld
e21caab759
fix: Typo 'maxNumSenetences' in SherpaOnnx.swift ( #939 )
2024-05-29 17:45:00 +08:00
Fangjun Kuang
8af2af8466
Add tail_paddings to Whisper C API. ( #886 )
2024-05-17 09:20:07 +08:00
Fangjun Kuang
6686c7d3e6
Add dict_dir arg to C API to support Chinese TTS models using jieba ( #809 )
2024-04-25 12:28:31 +08:00
Fangjun Kuang
a5f8fbc83f
Support heteronyms in Chinese TTS ( #738 )
2024-04-08 11:01:30 +08:00
Fangjun Kuang
dbff2eaadb
Add C API for streaming HLG decoding ( #734 )
2024-04-05 10:31:20 +08:00
Fangjun Kuang
83a10a55a5
Add Swift API for spoken language identification. ( #696 )
2024-03-25 16:22:25 +08:00
Fangjun Kuang
acf0975153
Support whisper language/task in various language bindings. ( #679 )
2024-03-20 16:43:35 +08:00
ductranminh
665b869f03
Add context biasing for mobile ( #568 )
2024-02-01 21:33:22 +08:00
Fangjun Kuang
e475e750ac
Support streaming zipformer CTC ( #496 )
* Support streaming zipformer CTC
* test online zipformer2 CTC
* Update doc of sherpa-onnx.cc
* Add Python APIs for streaming zipformer2 ctc
* Add Python API examples for streaming zipformer2 ctc
* Swift API for streaming zipformer2 CTC
* NodeJS API for streaming zipformer2 CTC
* Kotlin API for streaming zipformer2 CTC
* Golang API for streaming zipformer2 CTC
* C# API for streaming zipformer2 CTC
* Release v1.9.6
2023-12-22 13:46:33 +08:00
Fangjun Kuang
62dc3c3e46
Use piper-phonemize to convert text to token IDs ( #453 )
2023-11-30 23:57:43 +08:00
Fangjun Kuang
94ef6929bb
Text-to-speech for iOS ( #443 )
2023-11-23 21:38:32 +08:00
Fangjun Kuang
2f22e6ed63
Add Swift API for TTS ( #439 )
2023-11-22 16:04:26 +08:00
yujinqiu
d01682d968
Add VAD clear API for better performance ( #366 )
* Add VAD clear API for better performance
* Rename to make naming consistent and remove macro
* Fix linker error
* Fix Vad.kt
2023-10-16 14:40:47 +08:00
yujinqiu
f6566c8ace
Expose VAD isDetected api to Swift ( #356 )
2023-10-12 15:11:58 +08:00
zr_jin
b640c295b9
Swift API for hotwords support ( #331 )
2023-09-21 20:32:13 +08:00
Fangjun Kuang
692a47dd80
Add Swift example for generating subtitles ( #318 )
2023-09-18 15:16:54 +08:00