620 Commits

Author SHA1 Message Date
niansa/tuxifan
9d23606ee6 Allow building repository as CMake subdirectory (#2059)
* Use PROJECT_SOURCE_DIR rather than CMAKE_SOURCE_DIR to allow building as subdirectory

* Also use PROJECT_SOURCE_DIR instead of CMAKE_SOURCE_DIR in c/cxx api examples

* Only build examples by default when not building as subdirectory

* Do not suggest building binaries either

---------

Co-authored-by: user <user@mail.tld>
2025-03-29 06:27:59 +08:00
Fangjun Kuang
a5dd0cdfc3 Fix length scale for kokoro tts (#2060) 2025-03-27 10:52:01 +08:00
yourengod
bd61c1d8e5 Change scale factor to 32767 (#2056) 2025-03-26 10:44:49 +08:00
Fangjun Kuang
823e2e6257 Fix building wheels for RKNN (#2041) 2025-03-22 18:33:32 +08:00
Sangeet Sagar
31096e43bd fix static linking (#2032) 2025-03-21 12:47:45 +08:00
Fangjun Kuang
a19e57604e Fix Matcha + vocos for Android (#2024) 2025-03-19 18:39:10 +08:00
Fangjun Kuang
a50901f366 Fix a bug in vad.reset() (#2023)
We also need to clear _last
2025-03-19 17:42:05 +08:00
Fangjun Kuang
1f52ac2126 add alsa example for vad+offline asr (#2020) 2025-03-18 20:06:24 +08:00
Fangjun Kuang
406272210f Fix CI (#2016) 2025-03-17 22:31:36 +08:00
Fangjun Kuang
0aacf02dd8 Add C++ runtime for vocos (#2014) 2025-03-17 17:05:15 +08:00
Fangjun Kuang
71824992a7 Add Java API for speech enhancement GTCRN models (#2009) 2025-03-16 15:13:20 +08:00
Fangjun Kuang
ed8e6c9aed Add Kotlin API for speech enhancement GTCRN models (#2008) 2025-03-16 10:41:01 +08:00
Fangjun Kuang
6a97f8adcf Add JavaScript (node-addon) API for speech enhancement GTCRN models (#1996) 2025-03-12 15:52:01 +08:00
Fangjun Kuang
c3b009988b Add Pascal API for speech enhancement GTCRN models (#1992) 2025-03-12 10:48:59 +08:00
Fangjun Kuang
802119db17 Add CXX API for speech enhancement GTCRN models (#1986) 2025-03-11 17:07:52 +08:00
Fangjun Kuang
c5dbf1177c Add C API for speech enhancement GTCRN models (#1984) 2025-03-11 15:50:04 +08:00
Fangjun Kuang
5d2d792b1d Add Python API for speech enhancement GTCRN models (#1978) 2025-03-10 19:02:17 +08:00
Fangjun Kuang
488a6e687c Add C++ runtime for speech enhancement GTCRN models (#1977)
See also https://github.com/Xiaobin-Rong/gtcrn
2025-03-10 18:11:16 +08:00
cjsdurj
b87fce9a7f c-api add wave write to buffer. (#1962)
Co-authored-by: jian.chen03 <jian.chen03@transwarp.io>
2025-03-10 17:21:23 +08:00
Fangjun Kuang
362ddf2c07 Add C++ demo for VAD+non-streaming ASR (#1964) 2025-03-07 11:49:46 +08:00
Karel Vesely
7740dbfb96 Ebranchformer (#1951)
* adding ebranchformer encoder

* extend surfaced FeatureExtractorConfig

- so ebranchformer feature extraction can be configured from Python
- the GlobCmvn is not needed, as it is a module in the OnnxEncoder

* clean the code

* Integrating remarks from Fangjun
2025-03-04 19:41:09 +08:00
Fangjun Kuang
209eaaae1d Limit number of tokens per second for whisper. (#1958)
Otherwise, it spends lots of time in the loop if the EOT token
is not predicted.
2025-03-04 15:45:28 +08:00
Fangjun Kuang
c9d6859df7 Add transducer modified_beam_search for RKNN. (#1949) 2025-03-03 13:15:25 +08:00
Fangjun Kuang
d5e7b51af5 Support RKNN for Zipformer CTC models. (#1948) 2025-03-02 21:40:13 +08:00
Fangjun Kuang
dfcbc8d40b Add Kokoro v1.1-zh (#1942) 2025-02-28 15:47:59 +08:00
Fangjun Kuang
f5dfcf8d2f Add Kotlin and Java API for online punctuation models (#1936) 2025-02-27 16:52:36 +08:00
Fangjun Kuang
337d5f7a80 Release v1.10.46 (#1929) 2025-02-26 19:19:33 +08:00
Fangjun Kuang
eebe19997d Build wheels for rknn linux aarch64 (#1928) 2025-02-26 18:58:57 +08:00
Fangjun Kuang
82cb8a5dc3 Minor fixes for rknn (#1925) 2025-02-26 16:26:18 +08:00
Fangjun Kuang
4d79e6a007 Add C++ API for streaming zipformer ASR on RK NPU (#1908) 2025-02-24 19:07:37 +08:00
ivan provalov
94728bfbee Fixing Whisper Model Token Normalization (#1904) 2025-02-21 12:58:01 +08:00
Fangjun Kuang
ed922e61b5 Fix publishing pre-built windows libraries (#1905) 2025-02-21 11:59:27 +08:00
ivan provalov
4801094133 JNI Exception Handling (#1452) 2025-02-19 23:02:28 +08:00
Fangjun Kuang
614c51068b Add Pascal API for FireRedAsr AED Model (#1877) (#1880) 2025-02-17 16:06:18 +08:00
Fangjun Kuang
1d49dd2fb0 Add CXX API for FireRedAsr (#1872) 2025-02-17 11:46:13 +08:00
Fangjun Kuang
193d31333c Add C API for FireRedAsr AED model. (#1871) 2025-02-17 11:22:17 +08:00
Fangjun Kuang
d148860d2c Add Kotlin and Java API for FireRedAsr AED model (#1870) 2025-02-17 10:50:25 +08:00
Fangjun Kuang
316424b382 Add C++ and Python API for FireRedASR AED models (#1867) 2025-02-16 22:45:24 +08:00
Fangjun Kuang
944400e399 Fix splitting text by languages for kokoro tts. (#1849) 2025-02-13 18:19:34 +08:00
ahadjawaid
73d7c25233 Fix: only print with sherpa_onnx_loge when in debug mode (#1838)
Currently, during normal use you may get a lot of print statements such as: `Use espeak-ng to handle the OOV: 'ipsum'` which may not be relevant unless you are debugging.
2025-02-11 00:22:50 +08:00
Fangjun Kuang
ad883d44fe Support specifying voice in espeak-ng for kokoro tts models. (#1836) 2025-02-10 19:05:53 +08:00
Fangjun Kuang
d5da9430e8 Add PengChengStarling models to sherpa-onnx (#1835) 2025-02-10 18:23:40 +08:00
Kell
2ac41d3d85 OfflineRecognizer supports create stream with hotwords (#1833)
Co-authored-by: Wangkai <kell.wang@huawei.com>
2025-02-10 16:26:56 +08:00
Fangjun Kuang
9559a10bd3 Add C++ support for MatchaTTS models not from icefall. (#1834) 2025-02-10 15:38:29 +08:00
Fangjun Kuang
69f489f0cd Support scaling the duration of a pause in TTS. (#1820) 2025-02-08 12:47:26 +08:00
Fangjun Kuang
d38cb81014 Fix passing gb2312 encoded strings to tts on Windows (#1819) 2025-02-08 09:48:58 +08:00
Fangjun Kuang
c254504921 Add Pascal API for Kokoro TTS 1.0 (#1807) 2025-02-07 16:06:11 +08:00
Fangjun Kuang
d815204774 Add CXX API for Kokoro TTS 1.0 (#1802) 2025-02-07 14:51:49 +08:00
Fangjun Kuang
7330f7519a Add C API for Kokoro TTS 1.0 (#1801) 2025-02-07 14:30:40 +08:00
Fangjun Kuang
a52b819fb5 Add Android demo for Kokoro TTS 1.0 (#1799) 2025-02-07 13:07:30 +08:00