Commit Graph

34 Commits

Author SHA1 Message Date
Askars Salimbajevs
f0960342ad Add LODR support to online and offline recognizers (#2026)
This PR integrates LODR (Low-Order Density Ratio) support from Icefall into both the online and offline recognizers, enabling LODR for both LM shallow fusion and LM rescoring.

- Extended OnlineLMConfig and OfflineLMConfig to include lodr_fst, lodr_scale, and lodr_backoff_id.
- Implemented LodrFst and LodrStateCost classes and wired them into RNN LM scoring in both online and offline code paths.
- Updated Python bindings, CLI entry points, examples, and CI test scripts to accept and exercise the new LODR options.
2025-07-09 16:23:46 +08:00
Fangjun Kuang
0e738c356c Add C++ runtime and Python API for NeMo Canary models (#2352) 2025-07-07 17:03:49 +08:00
Fangjun Kuang
3bf986d08d Support non-streaming zipformer CTC ASR models (#2340)
This PR adds support for non-streaming Zipformer CTC ASR models across multiple language bindings, WebAssembly, examples, and CI workflows.

- Introduces a new OfflineZipformerCtcModelConfig in C/C++, Python, Swift, Java, Kotlin, Go, Dart, Pascal, and C# APIs
- Updates initialization, freeing, and recognition logic to include Zipformer CTC in WASM and Node.js
- Adds example scripts and CI steps for downloading, building, and running Zipformer CTC models

Model doc is available at
https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html
2025-07-04 15:57:07 +08:00
Fangjun Kuang
f64c58342b Support replacing homophonic phrases (#2153) 2025-04-27 15:31:11 +08:00
Nickolay V. Shmyrev
84ed5d4288 Expose dither in python API (#2127) 2025-04-17 16:47:48 +08:00
Fangjun Kuang
0de7e1b9f0 Add C++ and Python API for Dolphin CTC models (#2085) 2025-04-02 19:09:00 +08:00
Fangjun Kuang
316424b382 Add C++ and Python API for FireRedASR AED models (#1867) 2025-02-16 22:45:24 +08:00
Fangjun Kuang
669f5ef441 Add C++ runtime and Python APIs for Moonshine models (#1473) 2024-10-26 14:34:07 +08:00
xsjk
1da75ee3c0 Fix typo in offline-lm-config.cc (#1229) 2024-08-07 15:38:34 +08:00
Fangjun Kuang
25f0a10468 Add C++ runtime for SenseVoice models (#1148) 2024-07-18 22:54:18 +08:00
SilverSulfide
656b9fa1c8 Add Python API support for Offline LM rescoring (#1033) 2024-06-19 16:29:37 +08:00
Fangjun Kuang
b0f7ed3ee3 Add inverse text normalization for non-streaming ASR (#1017) 2024-06-17 14:28:53 +08:00
Fangjun Kuang
fd5a0d1e00 Add C++ runtime for Tele-AI/TeleSpeech-ASR (#970) 2024-06-05 00:26:40 +08:00
Wei Kang
b012b78ceb Encode hotwords in C++ side (#828)
2024-05-20 19:41:36 +08:00
Fangjun Kuang
17cd3a5f01 Add C++ runtime for non-streaming faster conformer transducer from NeMo. (#854) 2024-05-10 12:15:39 +08:00
Bhaswati Saha
fda614d0d1 beam search value as parameter in offline_recognizer.py (#673)
Co-authored-by: bhascns <bhaswati@mihup.com>
2024-03-18 18:43:05 +08:00
chiiyeh
466a6855c8 add hotwords docstring to offline_recognizer and online_recognizer (#546) 2024-01-25 16:54:20 +08:00
chiiyeh
3bb3849ec5 add blank_penalty for offline transducer (#542) 2024-01-25 15:00:09 +08:00
Fangjun Kuang
0e23f82691 Give an informative log for whisper on exceptions. (#473) 2023-12-08 14:33:59 +08:00
Fangjun Kuang
049fb9f451 Add Python APIs for WeNet CTC models (#428) 2023-11-16 14:20:41 +08:00
Fangjun Kuang
407602445d Add CTC HLG decoding using OpenFst (#349) 2023-10-08 11:32:39 +08:00
Fangjun Kuang
33a5765169 Print a more user-friendly error message when using --hotwords-file. (#344) 2023-09-26 11:04:20 +08:00
Wei Kang
47184f9db7 Refactor hotwords, support loading hotwords from file (#296) 2023-09-14 19:33:17 +08:00
Fangjun Kuang
f709c95c5f Support multilingual whisper models (#274) 2023-08-16 00:28:52 +08:00
Fangjun Kuang
a4bff28e21 Support TDNN models from the yesno recipe from icefall (#262) 2023-08-12 19:50:22 +08:00
Fangjun Kuang
b094868fb8 Add non-streaming websocket server for python (#259) 2023-08-11 15:56:24 +08:00
Fangjun Kuang
45b9d4ab37 Support whisper models (#238) 2023-08-07 12:34:18 +08:00
Fangjun Kuang
f3206c49dc Reduce model initialization time for offline speech recognition (#213) 2023-07-14 18:07:27 +08:00
Fangjun Kuang
33bf8dc1f4 Support specifying providers in Python API (#198) 2023-07-06 10:14:01 +08:00
Wei Kang
8562711252 Implement context biasing with an Aho-Corasick automaton (#145)
* Implement context graph

* Modify the interface to support context biasing

* Support context biasing in modified beam search; add python wrapper

* Support context biasing in python api example

* Minor fixes

* Fix context graph

* Minor fixes

* Fix tests

* Fix style

* Fix style

* Fix comments

* Minor fixes

* Add missing header

* Replace std::shared_ptr with std::unique_ptr for efficiency

* Build graph in constructor

* Fix comments

* Minor fixes

* Fix docs
2023-06-16 14:26:36 +08:00
Fangjun Kuang
cea718e3d8 Support CoreML for macOS (#151) 2023-05-12 15:57:44 +08:00
Fangjun Kuang
80060c276d Begin to support CTC models (#119)
Please see https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/nemo/index.html for a list of pre-trained CTC models from NeMo.
2023-04-07 23:11:34 +08:00
Fangjun Kuang
5d3c8edbc9 add python tests (#111) 2023-04-02 23:05:30 +08:00
manyeyes
3f7e0c23ac adding a python api for offline decode (#110) 2023-04-02 13:17:43 +08:00