Commit Graph

25 Commits

Author SHA1 Message Date
Fangjun Kuang
fd9a687ec2 Add Pascal/Go/C#/Dart API for NeMo Canary ASR models (#2367)
Add support for the new NeMo Canary ASR model across multiple language bindings by introducing a Canary model configuration and setter method on the offline recognizer.

- Define Canary model config in Pascal, Go, C#, Dart and update converter functions
- Add SetConfig API for offline recognizer (Pascal, Go, C#, Dart)
- Extend CI/workflows and example scripts to test non-streaming Canary decoding
2025-07-10 14:53:33 +08:00
Fangjun Kuang
3bf986d08d Support non-streaming zipformer CTC ASR models (#2340)
This PR adds support for non-streaming Zipformer CTC ASR models across multiple language bindings, WebAssembly, examples, and CI workflows.

- Introduces a new OfflineZipformerCtcModelConfig in C/C++, Python, Swift, Java, Kotlin, Go, Dart, Pascal, and C# APIs
- Updates initialization, freeing, and recognition logic to include Zipformer CTC in WASM and Node.js
- Adds example scripts and CI steps for downloading, building, and running Zipformer CTC models

The model documentation is available at
https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html
2025-07-04 15:57:07 +08:00
Fangjun Kuang
bda427f4b2 Add API to get version information (#2309) 2025-06-25 00:22:21 +08:00
Fangjun Kuang
6982b86c66 Support extra languages in multi-lang kokoro tts (#2303) 2025-06-20 11:22:52 +08:00
Fangjun Kuang
a6095f5f64 Fix building for Pascal (#2305) 2025-06-20 11:10:07 +08:00
Fangjun Kuang
8137ac9f0b Add Pascal API for Dolphin CTC models (#2096) 2025-04-03 16:00:22 +08:00
Fangjun Kuang
c3b009988b Add Pascal API for speech enhancement GTCRN models (#1992) 2025-03-12 10:48:59 +08:00
Fangjun Kuang
614c51068b Add Pascal API for FireRedAsr AED Model (#1877) (#1880) 2025-02-17 16:06:18 +08:00
Fangjun Kuang
69f489f0cd Support scaling the duration of a pause in TTS. (#1820) 2025-02-08 12:47:26 +08:00
Fangjun Kuang
c254504921 Add Pascal API for Kokoro TTS 1.0 (#1807) 2025-02-07 16:06:11 +08:00
Fangjun Kuang
46f2e32e8a Add Pascal API for Kokoro TTS models (#1724) 2025-01-16 18:20:21 +08:00
Fangjun Kuang
c6fcd32552 Add Pascal API for MatchaTTS models. (#1686) 2025-01-06 10:04:35 +08:00
Fangjun Kuang
cdd8e1bbcb Add Pascal API for Moonshine models (#1482) 2024-10-27 12:21:16 +08:00
Fangjun Kuang
5e273c5be4 Pascal API for speaker diarization (#1420) 2024-10-12 12:28:38 +08:00
Fangjun Kuang
e7ffcbd677 Add APIs about max speech duration in VAD for various programming languages (#1349) 2024-09-14 12:30:13 +08:00
Fangjun Kuang
544857b097 Fix building (#1343) 2024-09-13 13:33:52 +08:00
Fangjun Kuang
5a2aa110b8 Text to speech API for Object Pascal. (#1273) 2024-08-20 20:52:16 +08:00
Fangjun Kuang
e34a1a2aa3 Object pascal examples for recording and playing audio with portaudio. (#1271)
The recording example can be used for speech recognition, while the playing example can be used for text-to-speech.

The PortAudio wrapper for Object Pascal is copied from
https://github.com/UltraStar-Deluxe/USDX/blob/master/src/lib/portaudio/portaudio.pas
2024-08-18 19:51:08 +08:00
Fangjun Kuang
f93f0ca94d Use a separate thread to initialize models for lazarus examples. (#1270)
This keeps the main thread unblocked and the user interface responsive.
2024-08-18 14:59:48 +08:00
Fangjun Kuang
88809753ab Release v1.10.22 (#1267) 2024-08-16 22:40:49 +08:00
Fangjun Kuang
fbe35ba736 Add Lazarus example for generating subtitles using Silero VAD with non-streaming ASR (#1251) 2024-08-15 22:19:45 +08:00
Fangjun Kuang
619279b162 Pascal API for VAD (#1249) 2024-08-13 16:16:51 +08:00
Fangjun Kuang
a7dc6c2c16 Pascal API for non-streaming ASR (#1247) 2024-08-12 23:33:35 +08:00
Fangjun Kuang
5791b695ea Pascal API for streaming ASR (#1246) 2024-08-12 19:55:51 +08:00
Fangjun Kuang
65f1c0fab2 Add Pascal API for reading wave files (#1243) 2024-08-11 22:43:42 +08:00