22 Commits

Author SHA1 Message Date
Fangjun Kuang
3bf986d08d Support non-streaming zipformer CTC ASR models (#2340)
This PR adds support for non-streaming Zipformer CTC ASR models across multiple language bindings, WebAssembly, examples, and CI workflows.

- Introduces a new OfflineZipformerCtcModelConfig in C/C++, Python, Swift, Java, Kotlin, Go, Dart, Pascal, and C# APIs
- Updates initialization, freeing, and recognition logic to include Zipformer CTC in WASM and Node.js
- Adds example scripts and CI steps for downloading, building, and running Zipformer CTC models

The model documentation is available at
https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html
2025-07-04 15:57:07 +08:00
Fangjun Kuang
a0aef1f6cd Add JavaScript API (WASM) for homophone replacer (#2157) 2025-04-28 20:47:49 +08:00
Fangjun Kuang
639ad1744f Add JavaScript (WebAssembly) API for Dolphin CTC models (#2093) 2025-04-03 15:02:06 +08:00
Fangjun Kuang
0aacf02dd8 Add C++ runtime for vocos (#2014) 2025-03-17 17:05:15 +08:00
Fangjun Kuang
c972554ad1 Add JavaScript API (wasm) for speech enhancement GTCRN models (#2007) 2025-03-15 17:41:23 +08:00
Fangjun Kuang
0610679539 Add JavaScript API (WebAssembly) for Kokoro TTS 1.0 (#1809) 2025-02-07 16:46:03 +08:00
Fangjun Kuang
3a1de0bfc1 Add JavaScript (WebAssembly) API for Kokoro TTS models. (#1726) 2025-01-17 11:17:18 +08:00
Fangjun Kuang
3eced3e7ee Add C# and JavaScript (wasm) API for MatchaTTS models (#1682) 2025-01-05 15:08:19 +08:00
Fangjun Kuang
6f261d39f3 Add JavaScript API for Moonshine models (#1480) 2024-10-27 11:31:01 +08:00
Fangjun Kuang
eefc172095 JavaScript API with WebAssembly for speaker diarization (#1414)
#1408 uses [node-addon-api](https://github.com/nodejs/node-addon-api) to call the C API from JavaScript, whereas this pull request calls it through WebAssembly.
2024-10-11 11:40:10 +08:00
Fangjun Kuang
5ed8e31868 Add VAD and keyword spotting for the Node package with WebAssembly (#1286) 2024-08-24 23:05:54 +08:00
Fangjun Kuang
70d14353bb Add WebAssembly for SenseVoice (#1158) 2024-07-21 15:39:55 +08:00
Fangjun Kuang
dd0ff2ca06 Support onnxruntime 1.18.0 (#906) 2024-07-10 17:05:26 +08:00
Fangjun Kuang
6789c909d2 Inverse text normalization API of streaming ASR for various programming languages (#1022) 2024-06-18 13:42:17 +08:00
Fangjun Kuang
6e09933d99 Inverse text normalization API for other programming languages (#1019) 2024-06-17 17:02:39 +08:00
Fangjun Kuang
6fb8ceda57 Add VAD examples using ALSA for recording (#739) 2024-04-08 16:41:01 +08:00
Fangjun Kuang
a5f8fbc83f Support heteronyms in Chinese TTS (#738) 2024-04-08 11:01:30 +08:00
Fangjun Kuang
dbff2eaadb Add C API for streaming HLG decoding (#734) 2024-04-05 10:31:20 +08:00
Fangjun Kuang
e475e750ac Support streaming zipformer CTC (#496)
* Support streaming zipformer CTC

* test online zipformer2 CTC

* Update doc of sherpa-onnx.cc

* Add Python APIs for streaming zipformer2 ctc

* Add Python API examples for streaming zipformer2 ctc

* Swift API for streaming zipformer2 CTC

* NodeJS API for streaming zipformer2 CTC

* Kotlin API for streaming zipformer2 CTC

* Golang API for streaming zipformer2 CTC

* C# API for streaming zipformer2 CTC

* Release v1.9.6
2023-12-22 13:46:33 +08:00
Fangjun Kuang
62dc3c3e46 Use piper-phonemize to convert text to token IDs (#453) 2023-11-30 23:57:43 +08:00
Fangjun Kuang
8dc08a9b97 Fix nodejs on Windows (#450) 2023-11-25 21:23:15 +08:00
Fangjun Kuang
fe977b8e8e support nodejs (#438) 2023-11-21 23:20:08 +08:00