Fangjun Kuang
562a5f7d9b
Fix building wheels for macOS ( #2192 )
2025-05-08 19:15:33 +08:00
Fangjun Kuang
f9c99032c3
Avoid NaN in feature normalization. ( #2186 )
2025-05-08 11:22:47 +08:00
Fangjun Kuang
f00066db88
Add C++ runtime for parakeet-tdt-0.6b-v2. ( #2181 )
2025-05-06 16:59:01 +08:00
Fangjun Kuang
4a7a974a04
More fixes for building without tts ( #2162 )
2025-04-29 16:31:31 +08:00
Fangjun Kuang
f64c58342b
Support replacing homophonic phrases ( #2153 )
2025-04-27 15:31:11 +08:00
Fangjun Kuang
72742d5472
Fix punctuations for kokoro tts 1.1-zh. ( #2146 )
2025-04-24 15:08:47 +08:00
Karel Vesely
6a1efd8ac2
online-transducer: reset the encoder together with 2 previous output symbols (non-blank) ( #2129 )
...
* online-transducer: reset the encoder together with 2 previous output symbols (non-blank)
- added `reset_encoder` boolean member into the OnlineRecognizerConfig class
- by default the encoder is not reset
* pybind11, adding empty symbols for disabled modules (tts, diarization)
* reset_encoder, add default value (false) [pybind11]
2025-04-24 08:18:11 +08:00
Karel Vesely
f3d23aa170
cmake build, configurable from env ( #2115 )
...
- make sure the defaults in the `extra_cmake_args` variable of
`cmake/cmake_extension.py` can be overridden by `cmake_args` from
the `SHERPA_ONNX_CMAKE_ARGS` env variable
- fix a bug in `sherpa-onnx/csrc/parse-options.cc` which appears
when using `-DSHERPA_ONNX_ENABLE_CHECK=ON`
- avoid copying binaries when these are disabled
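A minimal sketch of the override behaviour described above, assuming the build script merges its default `-D` options with whatever is passed in `SHERPA_ONNX_CMAKE_ARGS`. The helper `merge_cmake_args` and the sample defaults are hypothetical illustrations, not the code in `cmake/cmake_extension.py`.

```python
import os
import shlex


def merge_cmake_args(defaults, env_value):
    """Merge default CMake args with user args; for -DKEY=VALUE, the user wins.

    Hypothetical helper for illustration only.
    """
    definitions = {}   # KEY -> VALUE for -D options, later entries override
    passthrough = []   # any other arguments, kept in order
    for arg in list(defaults) + shlex.split(env_value or ""):
        if arg.startswith("-D") and "=" in arg:
            key, value = arg[2:].split("=", 1)
            definitions[key] = value
        else:
            passthrough.append(arg)
    return [f"-D{k}={v}" for k, v in definitions.items()] + passthrough


# Sample defaults chosen for illustration; the real defaults may differ.
defaults = ["-DSHERPA_ONNX_ENABLE_CHECK=OFF", "-DBUILD_SHARED_LIBS=ON"]
final_args = merge_cmake_args(defaults, os.environ.get("SHERPA_ONNX_CMAKE_ARGS", ""))
```

With this approach, exporting `SHERPA_ONNX_CMAKE_ARGS="-DSHERPA_ONNX_ENABLE_CHECK=ON"` before building would replace the default value instead of passing two conflicting definitions.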
2025-04-16 21:26:54 +08:00
Fangjun Kuang
7a78f2eb7a
Fix building for HarmonyOS ( #2125 )
2025-04-15 18:00:07 +08:00
Fangjun Kuang
e3bce847c0
Support running sherpa-onnx with RK NPU on Android ( #2124 )
2025-04-15 16:42:28 +08:00
Askars Salimbajevs
664b461d01
Disable strict hotword matching mode for offline transducer ( #1837 )
...
* Disable strict hotword matching mode for offline transducer. Also introduces a new variable so that this mode can later be switched on at runtime.
* remove strict mode variable
---------
Co-authored-by: Askars Salimbajevs <askars.salimbajevs@tilde.lv>
2025-04-03 22:52:19 +08:00
Askars Salimbajevs
18a6ed5ddc
Preserve more context after endpointing in transducer ( #2061 )
2025-04-02 23:33:47 +08:00
Fangjun Kuang
0de7e1b9f0
Add C++ and Python API for Dolphin CTC models ( #2085 )
2025-04-02 19:09:00 +08:00
Fangjun Kuang
1316719e23
Fix building for android ( #2081 )
2025-04-01 19:36:40 +08:00
Fangjun Kuang
a11e359c11
Refactor rknn code ( #2079 )
2025-04-01 16:54:53 +08:00
Fangjun Kuang
8e51a97550
Add C++ runtime for silero_vad with RKNN ( #2078 )
2025-04-01 15:56:56 +08:00
Fangjun Kuang
0703bc1b86
Add CXX API for VAD ( #2077 )
2025-04-01 14:51:43 +08:00
Anders Xiao
ce196fceae
Fix DML with preinstalled ORT ( #2066 )
2025-03-30 12:07:19 +08:00
niansa/tuxifan
9d23606ee6
Allow building repository as CMake subdirectory ( #2059 )
...
* Use PROJECT_SOURCE_DIR rather than CMAKE_SOURCE_DIR to allow building as subdirectory
* Also use PROJECT_SOURCE_DIR instead of CMAKE_SOURCE_DIR in c/cxx api examples
* Only build examples by default when not building as subdirectory
* Do not suggest building binaries either
---------
Co-authored-by: user <user@mail.tld>
2025-03-29 06:27:59 +08:00
Fangjun Kuang
a5dd0cdfc3
Fix length scale for kokoro tts ( #2060 )
2025-03-27 10:52:01 +08:00
yourengod
bd61c1d8e5
Change scale factor to 32767 ( #2056 )
2025-03-26 10:44:49 +08:00
Fangjun Kuang
823e2e6257
Fix building wheels for RKNN ( #2041 )
2025-03-22 18:33:32 +08:00
Sangeet Sagar
31096e43bd
fix static linking ( #2032 )
2025-03-21 12:47:45 +08:00
Fangjun Kuang
a19e57604e
Fix Matcha + vocos for Android ( #2024 )
2025-03-19 18:39:10 +08:00
Fangjun Kuang
a50901f366
Fix a bug in vad.reset() ( #2023 )
...
We also need to clear _last
2025-03-19 17:42:05 +08:00
Fangjun Kuang
1f52ac2126
add alsa example for vad+offline asr ( #2020 )
2025-03-18 20:06:24 +08:00
Fangjun Kuang
406272210f
Fix CI ( #2016 )
2025-03-17 22:31:36 +08:00
Fangjun Kuang
0aacf02dd8
Add C++ runtime for vocos ( #2014 )
2025-03-17 17:05:15 +08:00
Fangjun Kuang
71824992a7
Add Java API for speech enhancement GTCRN models ( #2009 )
2025-03-16 15:13:20 +08:00
Fangjun Kuang
c5dbf1177c
Add C API for speech enhancement GTCRN models ( #1984 )
2025-03-11 15:50:04 +08:00
Fangjun Kuang
5d2d792b1d
Add Python API for speech enhancement GTCRN models ( #1978 )
2025-03-10 19:02:17 +08:00
Fangjun Kuang
488a6e687c
Add C++ runtime for speech enhancement GTCRN models ( #1977 )
...
See also https://github.com/Xiaobin-Rong/gtcrn
2025-03-10 18:11:16 +08:00
cjsdurj
b87fce9a7f
c-api add wave write to buffer. ( #1962 )
...
Co-authored-by: jian.chen03 <jian.chen03@transwarp.io>
2025-03-10 17:21:23 +08:00
Fangjun Kuang
362ddf2c07
Add C++ demo for VAD+non-streaming ASR ( #1964 )
2025-03-07 11:49:46 +08:00
Karel Vesely
7740dbfb96
Ebranchformer ( #1951 )
...
* adding ebranchformer encoder
* extend surfaced FeatureExtractorConfig
- so ebranchformer feature extraction can be configured from Python
- the GlobCmvn is not needed, as it is a module in the OnnxEncoder
* clean the code
* Integrating remarks from Fangjun
2025-03-04 19:41:09 +08:00
Fangjun Kuang
209eaaae1d
Limit number of tokens per second for whisper. ( #1958 )
...
Otherwise, it spends lots of time in the loop if the EOT token
is not predicted.
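The idea can be sketched as a greedy decoding loop whose length is capped in proportion to the audio duration. This is an illustration under stated assumptions: `greedy_decode`, `next_token`, and the default rate of 6 tokens per second are hypothetical and are not taken from the sherpa-onnx source.

```python
def greedy_decode(next_token, eot_id, audio_seconds, max_tokens_per_second=6):
    """Decode until EOT, but never emit more tokens than the audio could plausibly hold.

    next_token: callable taking the tokens decoded so far and returning the next token id
                (a stand-in for one decoder step; hypothetical).
    """
    # Cap derived from audio length; at least one step is always allowed.
    max_tokens = max(1, int(audio_seconds * max_tokens_per_second))
    tokens = []
    for _ in range(max_tokens):
        token = next_token(tokens)
        if token == eot_id:
            break  # normal termination
        tokens.append(token)
    return tokens  # also reached when the cap is hit without seeing EOT
```

Without the cap, a model that never predicts the EOT token would keep looping until some much larger limit is reached.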
2025-03-04 15:45:28 +08:00
Fangjun Kuang
c9d6859df7
Add transducer modified_beam_search for RKNN. ( #1949 )
2025-03-03 13:15:25 +08:00
Fangjun Kuang
d5e7b51af5
Support RKNN for Zipformer CTC models. ( #1948 )
2025-03-02 21:40:13 +08:00
Fangjun Kuang
dfcbc8d40b
Add Kokoro v1.1-zh ( #1942 )
2025-02-28 15:47:59 +08:00
Fangjun Kuang
337d5f7a80
Release v1.10.46 ( #1929 )
2025-02-26 19:19:33 +08:00
Fangjun Kuang
eebe19997d
Build wheels for rknn linux aarch64 ( #1928 )
2025-02-26 18:58:57 +08:00
Fangjun Kuang
82cb8a5dc3
Minor fixes for rknn ( #1925 )
2025-02-26 16:26:18 +08:00
Fangjun Kuang
4d79e6a007
Add C++ API for streaming zipformer ASR on RK NPU ( #1908 )
2025-02-24 19:07:37 +08:00
ivan provalov
94728bfbee
Fixing Whisper Model Token Normalization ( #1904 )
2025-02-21 12:58:01 +08:00
Fangjun Kuang
ed922e61b5
Fix publishing pre-built windows libraries ( #1905 )
2025-02-21 11:59:27 +08:00
Fangjun Kuang
316424b382
Add C++ and Python API for FireRedASR AED models ( #1867 )
2025-02-16 22:45:24 +08:00
Fangjun Kuang
944400e399
Fix splitting text by languages for kokoro tts. ( #1849 )
2025-02-13 18:19:34 +08:00
ahadjawaid
73d7c25233
Fix: only print with sherpa_onnx_loge when in debug mode ( #1838 )
...
Currently, during normal use you may get a lot of print statements such as: `Use espeak-ng to handle the OOV: 'ipsum'` which may not be relevant unless you are debugging.
2025-02-11 00:22:50 +08:00
Fangjun Kuang
ad883d44fe
Support specifying voice in espeak-ng for kokoro tts models. ( #1836 )
2025-02-10 19:05:53 +08:00
Fangjun Kuang
d5da9430e8
Add PengChengStarling models to sherpa-onnx ( #1835 )
2025-02-10 18:23:40 +08:00