Commit Graph

51 Commits

Author SHA1 Message Date
Fangjun Kuang
0d44df9b67 Release v1.12.5 (#2368) 2025-07-10 15:31:26 +08:00
Fangjun Kuang
df4615ca1d Add C/CXX/JavaScript API for NeMo Canary models (#2357)
This PR introduces support for NeMo Canary models across C, C++, and JavaScript APIs 
by adding new Canary configuration structures, updating bindings, extending examples,
and enhancing CI workflows.

- Add OfflineCanaryModelConfig to all language bindings (C, C++, JS, ETS).
- Implement SetConfig methods and NAPI wrappers for updating recognizer config at runtime.
- Update examples and CI scripts to demonstrate and test NeMo Canary model usage.
2025-07-07 23:38:04 +08:00
Fangjun Kuang
e6b388067d Release v1.12.4 (#2343) 2025-07-04 19:41:02 +08:00
Fangjun Kuang
3bf986d08d Support non-streaming zipformer CTC ASR models (#2340)
This PR adds support for non-streaming Zipformer CTC ASR models across 
multiple language bindings, WebAssembly, examples, and CI workflows.

- Introduces a new OfflineZipformerCtcModelConfig in C/C++, Python, Swift, Java, Kotlin, Go, Dart, Pascal, and C# APIs
- Updates initialization, freeing, and recognition logic to include Zipformer CTC in WASM and Node.js
- Adds example scripts and CI steps for downloading, building, and running Zipformer CTC models

Model doc is available at
https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html
2025-07-04 15:57:07 +08:00
Fangjun Kuang
e25634ac39 Release v1.12.3 (#2322) 2025-06-27 10:55:46 +08:00
Fangjun Kuang
282211c01f Remove portaudio-go in Go API examples. (#2317)
Replace the deprecated portaudio-go integration with malgo in the Go real-time 
speech recognition example and correct version string typos in the Node.js examples.

- Fixed “verison” typo in Node.js console logs.
- Swapped out portaudio-go for malgo in the Go microphone example, introducing initRecognizer, callback-driven streaming, and sample conversion.
- Removed portaudio-go from go.mod.
2025-06-26 11:33:50 +08:00
Fangjun Kuang
056da0528d Release v1.12.2 (#2314) 2025-06-25 00:37:55 +08:00
Fangjun Kuang
bda427f4b2 Add API to get version information (#2309) 2025-06-25 00:22:21 +08:00
Fangjun Kuang
749dc9a239 Release v1.12.1 (#2277) 2025-06-03 21:55:49 +08:00
Fangjun Kuang
02c902a079 Release v1.12.0 (#2221) 2025-05-15 16:03:17 +08:00
Fangjun Kuang
baec2da745 Release v1.11.5 (#2187) 2025-05-08 11:39:16 +08:00
Fangjun Kuang
abc4daa49a Release v1.11.4 (#2169) 2025-05-01 11:36:44 +08:00
Fangjun Kuang
a0aef1f6cd Add JavaScript API (WASM) for homophone replacer (#2157) 2025-04-28 20:47:49 +08:00
Fangjun Kuang
31ced58f9a Release v1.11.3 (#2097) 2025-04-03 16:19:01 +08:00
Fangjun Kuang
639ad1744f Add Javascript (WebAssembly) API for Dolphin CTC models (#2093) 2025-04-03 15:02:06 +08:00
Fangjun Kuang
419f7fea0a Release v1.11.2 (#2035) 2025-03-21 14:05:57 +08:00
Fangjun Kuang
bdf84a7cf0 Release v1.11.1 (#2015) 2025-03-17 17:32:51 +08:00
Fangjun Kuang
0aacf02dd8 Add C++ runtime for vocos (#2014) 2025-03-17 17:05:15 +08:00
Fangjun Kuang
f110c776ac Release v1.11.0 (#2010) 2025-03-16 15:27:36 +08:00
Fangjun Kuang
c972554ad1 Add JavaScript API (wasm) for speech enhancement GTCRN models (#2007) 2025-03-15 17:41:23 +08:00
Fangjun Kuang
337d5f7a80 Release v1.10.46 (#1929) 2025-02-26 19:19:33 +08:00
wanghsinche
7774e35749 feat: add mic example for better compatibility (#1909)
Co-authored-by: wanghsinche <wanghsinche>
2025-02-21 21:47:21 +08:00
Fangjun Kuang
9711ab2474 Release v1.10.45 (#1881) 2025-02-17 16:20:04 +08:00
Fangjun Kuang
7ad44bc43a Add JavaScript API (WebAssembly) for FireRedAsr model. (#1874) 2025-02-17 12:54:18 +08:00
Fangjun Kuang
0610679539 Add JavaScript API (WebAssembly) for Kokoro TTS 1.0 (#1809) 2025-02-07 16:46:03 +08:00
Fangjun Kuang
8b989a851c Fix keyword spotting. (#1689)
Reset the stream right after detecting a keyword
2025-01-20 16:41:10 +08:00
Fangjun Kuang
3a1de0bfc1 Add JavaScript (WebAssembly) API for Kokoro TTS models. (#1726) 2025-01-17 11:17:18 +08:00
Fangjun Kuang
6f085babcc Add Swift API for MatchaTTS models. (#1684) 2025-01-06 07:23:45 +08:00
Fangjun Kuang
3eced3e7ee Add C# and JavaScript (wasm) API for MatchaTTS models (#1682) 2025-01-05 15:08:19 +08:00
Fangjun Kuang
6f261d39f3 Add JavaScript API for Moonshine models (#1480) 2024-10-27 11:31:01 +08:00
Fangjun Kuang
eefc172095 JavaScript API with WebAssembly for speaker diarization (#1414)
#1408 uses [node-addon-api](https://github.com/nodejs/node-addon-api) to call the C API from JavaScript, whereas this pull request uses WebAssembly to do so.
2024-10-11 11:40:10 +08:00
Fangjun Kuang
e7ffcbd677 Add APIs about max speech duration in VAD for various programming languages (#1349) 2024-09-14 12:30:13 +08:00
Fangjun Kuang
5ed8e31868 Add VAD and keyword spotting for the Node package with WebAssembly (#1286) 2024-08-24 23:05:54 +08:00
Fangjun Kuang
70d14353bb Add WebAssembly for SenseVoice (#1158) 2024-07-21 15:39:55 +08:00
Fangjun Kuang
dd0ff2ca06 Support onnxruntime 1.18.0 (#906) 2024-07-10 17:05:26 +08:00
Fangjun Kuang
6789c909d2 Inverse text normalization API of streaming ASR for various programming languages (#1022) 2024-06-18 13:42:17 +08:00
Fangjun Kuang
6e09933d99 Inverse text normalization API for other programming languages (#1019) 2024-06-17 17:02:39 +08:00
Fangjun Kuang
8af2af8466 Add tail_paddings to Whisper C API. (#886) 2024-05-17 09:20:07 +08:00
Fangjun Kuang
4f758e6cd3 Publish node-addon-api wrapper for sherpa-onnx as npm packages (#829) 2024-05-04 13:27:39 +08:00
Fangjun Kuang
6686c7d3e6 Add dict_dir arg to c api to support Chinese TTS models using jieba (#809) 2024-04-25 12:28:31 +08:00
Fangjun Kuang
69440e481f Add WearOS demo for audio tagging (#777) 2024-04-17 12:22:17 +08:00
Fangjun Kuang
f20291cadc Support audio tagging using zipformer (#747) 2024-04-10 14:47:06 +08:00
Fangjun Kuang
a5f8fbc83f Support heteronyms in Chinese TTS (#738) 2024-04-08 11:01:30 +08:00
Fangjun Kuang
c1c0f5bafd return timestamps for WebAssembly (#737) 2024-04-05 20:24:27 +08:00
Fangjun Kuang
dbff2eaadb Add C API for streaming HLG decoding (#734) 2024-04-05 10:31:20 +08:00
Fangjun Kuang
acf0975153 Support whisper language/task in various language bindings. (#679) 2024-03-20 16:43:35 +08:00
Fangjun Kuang
ed06ced16f Add WebAssembly for NodeJS. (#628) 2024-03-03 20:00:36 +08:00
Fangjun Kuang
e475e750ac Support streaming zipformer CTC (#496)
* Support streaming zipformer CTC
* test online zipformer2 CTC
* Update doc of sherpa-onnx.cc
* Add Python APIs for streaming zipformer2 ctc
* Add Python API examples for streaming zipformer2 ctc
* Swift API for streaming zipformer2 CTC
* NodeJS API for streaming zipformer2 CTC
* Kotlin API for streaming zipformer2 CTC
* Golang API for streaming zipformer2 CTC
* C# API for streaming zipformer2 CTC
* Release v1.9.6
2023-12-22 13:46:33 +08:00
Fangjun Kuang
62dc3c3e46 Use piper-phonemize to convert text to token IDs (#453) 2023-11-30 23:57:43 +08:00
Fangjun Kuang
8dc08a9b97 Fix nodejs on Windows (#450) 2023-11-25 21:23:15 +08:00