Commit Graph

134 Commits

Author SHA1 Message Date
Askars Salimbajevs
f0960342ad Add LODR support to online and offline recognizers (#2026)
This PR integrates LODR (Low-Order Density Ratio) support from Icefall into both online and offline recognizers, enabling LODR for LM shallow fusion and LM rescoring.

- Extended OnlineLMConfig and OfflineLMConfig to include lodr_fst, lodr_scale, and lodr_backoff_id.
- Implemented LodrFst and LodrStateCost classes and wired them into RNN LM scoring in both online and offline code paths.
- Updated Python bindings, CLI entry points, examples, and CI test scripts to accept and exercise the new LODR options.
2025-07-09 16:23:46 +08:00
Fangjun Kuang
0e738c356c Add C++ runtime and Python API for NeMo Canary models (#2352) 2025-07-07 17:03:49 +08:00
Fangjun Kuang
bda427f4b2 Add API to get version information (#2309) 2025-06-25 00:22:21 +08:00
Fangjun Kuang
2b2788332e Add C++ support for UVR models (#2269) 2025-06-01 17:22:08 +08:00
Fangjun Kuang
716ba8317b Add C++ runtime for Spleeter source separation (#2242) 2025-05-23 22:30:57 +08:00
Fangjun Kuang
4a7a974a04 More fixes for building without TTS (#2162) 2025-04-29 16:31:31 +08:00
Fangjun Kuang
f64c58342b Support replacing homophonic phrases (#2153) 2025-04-27 15:31:11 +08:00
Fangjun Kuang
0de7e1b9f0 Add C++ and Python API for Dolphin CTC models (#2085) 2025-04-02 19:09:00 +08:00
Fangjun Kuang
8e51a97550 Add C++ runtime for silero_vad with RKNN (#2078) 2025-04-01 15:56:56 +08:00
Fangjun Kuang
0703bc1b86 Add CXX API for VAD (#2077) 2025-04-01 14:51:43 +08:00
Anders Xiao
ce196fceae Fix DML with preinstalled ORT (#2066) 2025-03-30 12:07:19 +08:00
niansa/tuxifan
9d23606ee6 Allow building repository as CMake subdirectory (#2059)
* Use PROJECT_SOURCE_DIR rather than CMAKE_SOURCE_DIR to allow building as subdirectory

* Also use PROJECT_SOURCE_DIR instead of CMAKE_SOURCE_DIR in c/cxx api examples

* Only build examples by default when not building as subdirectory

* Do not suggest building binaries either

---------

Co-authored-by: user <user@mail.tld>
2025-03-29 06:27:59 +08:00
Sangeet Sagar
31096e43bd fix static linking (#2032) 2025-03-21 12:47:45 +08:00
Fangjun Kuang
1f52ac2126 add alsa example for vad+offline asr (#2020) 2025-03-18 20:06:24 +08:00
Fangjun Kuang
0aacf02dd8 Add C++ runtime for vocos (#2014) 2025-03-17 17:05:15 +08:00
Fangjun Kuang
488a6e687c Add C++ runtime for speech enhancement GTCRN models (#1977)
See also https://github.com/Xiaobin-Rong/gtcrn
2025-03-10 18:11:16 +08:00
Fangjun Kuang
362ddf2c07 Add C++ demo for VAD+non-streaming ASR (#1964) 2025-03-07 11:49:46 +08:00
Karel Vesely
7740dbfb96 Ebranchformer (#1951)
* adding ebranchformer encoder

* extend surfaced FeatureExtractorConfig

- so ebranchformer feature extraction can be configured from Python
- the GlobCmvn is not needed, as it is a module in the OnnxEncoder

* clean the code

* Integrating remarks from Fangjun
2025-03-04 19:41:09 +08:00
Fangjun Kuang
c9d6859df7 Add transducer modified_beam_search for RKNN. (#1949) 2025-03-03 13:15:25 +08:00
Fangjun Kuang
d5e7b51af5 Support RKNN for Zipformer CTC models. (#1948) 2025-03-02 21:40:13 +08:00
Fangjun Kuang
4d79e6a007 Add C++ API for streaming zipformer ASR on RK NPU (#1908) 2025-02-24 19:07:37 +08:00
Fangjun Kuang
316424b382 Add C++ and Python API for FireRedASR AED models (#1867) 2025-02-16 22:45:24 +08:00
Fangjun Kuang
944400e399 Fix splitting text by languages for kokoro tts. (#1849) 2025-02-13 18:19:34 +08:00
Fangjun Kuang
c84a833863 Add C++ and Python API for Kokoro 1.0 multilingual TTS model (#1795) 2025-02-06 22:57:13 +08:00
Fangjun Kuang
ffc6b480a0 Add C++ and Python API for Kokoro TTS models. (#1715) 2025-01-16 14:24:51 +08:00
Fangjun Kuang
2c2926af7d Add C++ runtime for Matcha-TTS (#1627) 2024-12-31 12:44:14 +08:00
Fangjun Kuang
b6f0f5fc2e Support removing invalid utf-8 sequences. (#1648) 2024-12-25 19:32:13 +08:00
Fangjun Kuang
b76cd9033a Support decoding with byte-level BPE (bbpe) models. (#1633) 2024-12-20 19:21:32 +08:00
Fangjun Kuang
31d6206fde HarmonyOS support for VAD. (#1561) 2024-11-24 16:29:24 +08:00
Fangjun Kuang
f97daed408 Fixes #1512 (#1522) 2024-11-08 21:07:36 +08:00
Fangjun Kuang
669f5ef441 Add C++ runtime and Python APIs for Moonshine models (#1473) 2024-10-26 14:34:07 +08:00
Fangjun Kuang
b3e05f6dc4 Fix style issues (#1458) 2024-10-24 11:15:08 +08:00
Fangjun Kuang
59407edcad C++ API for speaker diarization (#1396) 2024-10-09 12:01:20 +08:00
Fangjun Kuang
70568c2df7 Support Agglomerative clustering. (#1384)
We use the open-source implementation from
https://github.com/cdalitz/hclust-cpp
2024-09-29 23:44:29 +08:00
jianyou
1414e4dc61 Add online punctuation and casing prediction model for English language (#1224) 2024-08-06 17:33:38 +08:00
Fangjun Kuang
d5f486878d Remove libonnxruntime_providers_cuda.so as a dependency. (#1210) 2024-08-03 16:25:23 +08:00
Fangjun Kuang
25f0a10468 Add C++ runtime for SenseVoice models (#1148) 2024-07-18 22:54:18 +08:00
Fangjun Kuang
960eb7529e Add C++ runtime for MeloTTS (#1138) 2024-07-16 15:55:02 +08:00
Fangjun Kuang
a25075101c Build sherpa-onnx as a single shared library (#1078)
When `-D BUILD_SHARED_LIBS=ON` is passed to `cmake`, it builds a single shared library.

Specifically, 

- For C APIs, it builds `libsherpa-onnx-c-api.so`
- For Python APIs, it builds `_sherpa_onnx.cpython-xx-xx.so`
- For Kotlin and Java APIs, it builds `libsherpa-onnx-jni.so`

There is no `libsherpa-onnx-core.so` any longer.

Note it affects only shared libraries.
2024-07-06 16:41:54 +08:00
Manix
55decb7bee Add config for TensorRT and CUDA execution provider (#992)
Signed-off-by: manickavela1998@gmail.com <manickavela1998@gmail.com>
Signed-off-by: manickavela1998@gmail.com <manickavela.arumugam@uniphore.com>
2024-07-05 15:18:37 +08:00
Fangjun Kuang
598c12c4e5 Fix CI tests (#1061) 2024-06-27 18:05:18 +08:00
Fangjun Kuang
a11c859971 Support clang-tidy (#1034) 2024-06-19 20:51:57 +08:00
Fangjun Kuang
6789c909d2 Inverse text normalization API of streaming ASR for various programming languages (#1022) 2024-06-18 13:42:17 +08:00
Fangjun Kuang
fd5a0d1e00 Add C++ runtime for Tele-AI/TeleSpeech-ASR (#970) 2024-06-05 00:26:40 +08:00
Sangeet Sagar
3f472a9993 Add C++ runtime for *streaming* faster conformer transducer from NeMo. (#889)
Co-authored-by: sangeet2020 <15uec053@gmail.com>
2024-05-30 13:55:03 +08:00
Wei Kang
b012b78ceb Encode hotwords in C++ side (#828)
2024-05-20 19:41:36 +08:00
Fangjun Kuang
46e4e5b7ac Add C++ support for streaming NeMo CTC models. (#857) 2024-05-10 16:26:43 +08:00
Fangjun Kuang
17cd3a5f01 Add C++ runtime for non-streaming faster conformer transducer from NeMo. (#854) 2024-05-10 12:15:39 +08:00
Fangjun Kuang
6b353bfb42 Add jieba for Chinese TTS models (#797) 2024-04-21 14:47:13 +08:00
Fangjun Kuang
c1608b3524 Support CED models (#792) 2024-04-19 15:20:37 +08:00