Fangjun Kuang
f93f0ca94d
Use a separate thread to initialize models for lazarus examples. ( #1270 )
...
This keeps the main thread unblocked, so the user interface stays responsive while the models load.
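The idea in this commit can be sketched in Python rather than Pascal. This is only an illustration of the pattern, not the Lazarus code itself: `load_model` and `ModelLoader` are hypothetical names, and the sleep stands in for slow model initialization.

```python
import threading
import time


def load_model():
    # Hypothetical stand-in for a slow model initialization.
    time.sleep(0.1)
    return "model-ready"


class ModelLoader:
    """Runs load_model() on a worker thread and signals when it finishes."""

    def __init__(self):
        self.model = None
        self.done = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        self.model = load_model()
        self.done.set()  # tell the main thread that the model is usable


loader = ModelLoader()
loader.start()
# The main thread is free here; a GUI would keep processing events
# and poll loader.done.is_set() instead of blocking.
loader.done.wait()
print(loader.model)
```

A real GUI would not call `wait()` on the main thread; it is used here only so the script ends after the worker finishes.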
2024-08-18 14:59:48 +08:00
Fangjun Kuang
9dcea49dba
Fix looking up OOVs in lexicon.txt for MeloTTS models. ( #1266 )
...
If an English word does not exist in the lexicon, we split
it into characters. For instance, if the word TTS does not
exist in lexicon.txt, we split it into 3 characters T, T, and S.
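A minimal sketch of the fallback described above, assuming the lexicon is a plain dictionary keyed by word. The function name `word_to_units` and the sample lexicon entry are hypothetical; the actual C++ implementation may normalize case or handle the characters differently.

```python
from typing import Dict, List


def word_to_units(word: str, lexicon: Dict[str, List[str]]) -> List[str]:
    """Return the word itself if it is in the lexicon, else its characters."""
    if word in lexicon:
        return [word]
    # Out-of-vocabulary word: fall back to one unit per character.
    return list(word)


# Hypothetical lexicon with a single entry.
lexicon = {"hello": ["h", "ə", "l", "oʊ"]}

print(word_to_units("hello", lexicon))  # ['hello']
print(word_to_units("TTS", lexicon))    # ['T', 'T', 'S']
```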
2024-08-16 22:10:03 +08:00
Ikko Eltociear Ashimine
a3e98750e9
chore: update online-stream.h ( #1264 )
...
Fix typos.
2024-08-16 15:17:15 +08:00
Fangjun Kuang
fbe35ba736
Add Lazarus example for generating subtitles using Silero VAD with non-streaming ASR ( #1251 )
2024-08-15 22:19:45 +08:00
Fangjun Kuang
ca729faebf
Support reading multi-channel wave files with 8/16/32-bit encoded samples ( #1258 )
2024-08-15 14:54:43 +08:00
Robin Zhong
62c4d4ab62
Add emotion, event of SenseVoice. ( #1257 )
...
* Add emotion, event of SenseVoice.
* Fix tokens size check and update java api.
https://github.com/k2-fsa/sherpa-onnx/pull/1257
2024-08-14 15:50:13 +08:00
Fangjun Kuang
619279b162
Pascal API for VAD ( #1249 )
2024-08-13 16:16:51 +08:00
Fangjun Kuang
8a5f5c1999
Fix python two pass ASR examples ( #1230 )
2024-08-07 18:35:38 +08:00
Fangjun Kuang
375c055ff8
Fix style issues for online punctuation source files ( #1225 )
2024-08-06 17:43:24 +08:00
jianyou
1414e4dc61
Add online punctuation and casing prediction model for English language ( #1224 )
2024-08-06 17:33:38 +08:00
Fangjun Kuang
9caa488019
Fix setting SenseVoice language. ( #1214 )
2024-08-04 19:02:23 +08:00
Fangjun Kuang
d5f486878d
Remove libonnxruntime_providers_cuda.so as a dependency. ( #1210 )
2024-08-03 16:25:23 +08:00
Fangjun Kuang
53484fcd9b
Fix reading non-standard wav files. ( #1199 )
2024-08-01 17:48:04 +08:00
Fangjun Kuang
86b4c9f535
Fix splitting sentences for MeloTTS ( #1186 )
2024-07-29 17:04:45 +08:00
Fangjun Kuang
994c3e7c96
Add VAD + Non-streaming ASR example for JavaScript API. ( #1170 )
2024-07-26 12:42:08 +08:00
Fangjun Kuang
299f1a852b
Fix style issues reported by clang-tidy ( #1167 )
2024-07-23 09:26:36 +08:00
thewh1teagle
d32a46169f
feat: add directml support ( #1153 )
2024-07-22 23:50:48 +08:00
Fangjun Kuang
1a471595a5
Fix Android build ( #1161 )
2024-07-22 09:27:30 +08:00
Fangjun Kuang
ffdb23a8ec
Add dart API for SenseVoice ( #1159 )
2024-07-21 21:48:12 +08:00
Fangjun Kuang
25f0a10468
Add C++ runtime for SenseVoice models ( #1148 )
2024-07-18 22:54:18 +08:00
Wei Kang
5b1fa8750f
Fix hotwords OOV log ( #1139 )
2024-07-16 19:41:31 +08:00
Fangjun Kuang
960eb7529e
Add C++ runtime for MeloTTS ( #1138 )
2024-07-16 15:55:02 +08:00
Manickavela
11cfd33b10
encoder only trt ep for transducer ( #1130 )
2024-07-15 14:52:33 +08:00
ivan provalov
de04b3b9bf
Allow modify model config at decode time for ASR ( #1124 )
2024-07-13 22:30:47 +08:00
Fangjun Kuang
b5093e27f9
Fix publishing apks to huggingface ( #1121 )
...
Save the APKs for each release in a separate directory.
Huggingface does not allow a directory to contain more than 1000 files.
Since there are many TTS models and each model needs APKs for 4 different ABIs,
placing the APKs of each release in its own directory works around this limit.
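The effect of the per-release layout can be sketched as follows. The directory prefix `tts/`, the file-name pattern, the ABI names, the version strings, and the model count are all assumptions made for illustration; only the 1000-file limit and the 4 ABIs per model come from the commit message.

```python
from collections import defaultdict

MAX_FILES_PER_DIR = 1000  # limit stated in the commit message
ABIS = ["arm64-v8a", "armeabi-v7a", "x86_64", "x86"]  # assumed ABI names


def apk_paths(version, models):
    """Return one path per (model, ABI), all under a per-release directory."""
    return [
        f"tts/{version}/sherpa-onnx-{version}-{abi}-{model}.apk"
        for model in models
        for abi in ABIS
    ]


def files_per_dir(paths):
    """Count how many files land in each directory."""
    counts = defaultdict(int)
    for p in paths:
        counts[p.rsplit("/", 1)[0]] += 1
    return dict(counts)


models = [f"model-{i}" for i in range(200)]  # hypothetical: 200 models
paths = apk_paths("1.10.0", models) + apk_paths("1.10.1", models)
counts = files_per_dir(paths)
print(len(paths), counts)
```

With these made-up numbers, two releases produce 1600 files in total, which would exceed the limit in a single directory, while each per-release directory holds only 800.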
2024-07-13 16:14:00 +08:00
Fangjun Kuang
117cd7bb8c
Support whisper large/large-v1/large-v2/large-v3 and distil-large-v2 ( #1114 )
2024-07-12 23:47:39 +08:00
thewh1teagle
c0eaf86dbd
feat: find best embedding matches ( #1102 )
2024-07-11 09:38:06 +08:00
Fangjun Kuang
c2cc9dec58
Add Flush to VAD so that the last segment can be detected. ( #1099 )
2024-07-09 16:15:56 +08:00
Manix
3e4307e2fb
updating trt workspace int64 ( #1094 )
...
Signed-off-by: Manix <manickavela1998@gmail.com>
2024-07-08 20:38:16 +08:00
Manix
d6fbecd947
parse option int64_t ( #1089 )
...
Signed-off-by: Manix <manickavela1998@gmail.com>
2024-07-08 15:37:30 +08:00
Fangjun Kuang
a25075101c
Build sherpa-onnx as a single shared library ( #1078 )
...
When `-D BUILD_SHARED_LIBS=ON` is passed to `cmake`, it builds a single shared library.
Specifically,
- For C APIs, it builds `libsherpa-onnx-c-api.so`
- For Python APIs, it builds `_sherpa_onnx.cpython-xx-xx.so`
- For Kotlin and Java APIs, it builds `libsherpa-onnx-jni.so`
`libsherpa-onnx-core.so` is no longer built.
Note that this change affects only shared-library builds.
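One way to sanity-check a build output against the list above is sketched here. The helper `check_build` and the sample file listings are hypothetical; the library names and the `cpython-xx-xx` pattern are taken from the commit message, with `*` standing in for the version and platform tags.

```python
import fnmatch

# Expected library per API, as listed in the commit message.
EXPECTED = {
    "c": "libsherpa-onnx-c-api.so",
    "python": "_sherpa_onnx.cpython-*.so",
    "jni": "libsherpa-onnx-jni.so",
}


def check_build(api, filenames):
    """True if the expected library exists and the old core library does not."""
    pattern = EXPECTED[api]
    has_expected = any(fnmatch.fnmatch(name, pattern) for name in filenames)
    has_core = "libsherpa-onnx-core.so" in filenames
    return has_expected and not has_core


# Hypothetical directory listings.
new_layout = ["libsherpa-onnx-c-api.so"]
old_layout = ["libsherpa-onnx-c-api.so", "libsherpa-onnx-core.so"]

print(check_build("c", new_layout))  # True
print(check_build("c", old_layout))  # False
```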
2024-07-06 16:41:54 +08:00
Manix
55decb7bee
Add config for TensorRT and CUDA execution provider ( #992 )
...
Signed-off-by: manickavela1998@gmail.com <manickavela1998@gmail.com>
Signed-off-by: manickavela1998@gmail.com <manickavela.arumugam@uniphore.com>
2024-07-05 15:18:37 +08:00
Fangjun Kuang
6cb018184e
Fix for silero vad v5. ( #1065 )
...
The network input is 64 + 512 samples instead of 512 samples for 16kHz.
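A rough sketch of how such inputs could be assembled, assuming the extra 64 samples are the tail of the previous 512-sample window and that the first window is preceded by zeros. That interpretation, and the helper name `make_inputs`, are assumptions; the commit message only states the input size.

```python
WINDOW = 512   # new samples per step at 16 kHz
CONTEXT = 64   # extra samples in the network input (assumed: previous tail)


def make_inputs(samples):
    """Split samples into network inputs of CONTEXT + WINDOW values each."""
    context = [0.0] * CONTEXT  # assumed: zeros before the first window
    inputs = []
    for start in range(0, len(samples) - WINDOW + 1, WINDOW):
        window = samples[start:start + WINDOW]
        inputs.append(context + window)
        context = window[-CONTEXT:]  # carry the tail into the next step
    return inputs


samples = [float(i) for i in range(1024)]
inputs = make_inputs(samples)
print(len(inputs), len(inputs[0]))  # 2 576
```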
2024-06-30 08:57:23 +08:00
Fangjun Kuang
61c7eb3063
Support silero_vad version 5 ( #1064 )
2024-06-29 11:45:04 +08:00
Fangjun Kuang
598c12c4e5
Fix CI tests ( #1061 )
2024-06-27 18:05:18 +08:00
Fangjun Kuang
a3bac19c54
fix a bug for wenet streaming model. ( #1054 )
...
* fix a bug for wenet streaming model.
The chunk shift was wrong.
See
https://github.com/wenet-e2e/wenet/blob/main/runtime/core/decoder/asr_model.cc#L15
and
https://github.com/wenet-e2e/wenet/blob/main/runtime/core/decoder/asr_model.cc#L28
2024-06-24 21:52:54 +08:00
Fangjun Kuang
9dd0e03568
Enable to stop TTS generation ( #1041 )
2024-06-22 18:18:36 +08:00
Zhong-Yi Li
675fb1574f
offline transducer: treat unk as blank ( #1005 )
...
Co-authored-by: chungyi.li <chungyi.li@ailabs.tw>
2024-06-19 20:52:42 +08:00
Fangjun Kuang
a11c859971
Support clang-tidy ( #1034 )
2024-06-19 20:51:57 +08:00
Fangjun Kuang
6789c909d2
Inverse text normalization API of streaming ASR for various programming languages ( #1022 )
2024-06-18 13:42:17 +08:00
Fangjun Kuang
349d957da2
Add inverse text normalization for online ASR ( #1020 )
2024-06-17 18:39:23 +08:00
Fangjun Kuang
b0f7ed3ee3
Add inverse text normalization for non-streaming ASR ( #1017 )
2024-06-17 14:28:53 +08:00
Fangjun Kuang
e1201225f2
Add Android APK for Korean ( #1015 )
2024-06-16 19:17:15 +08:00
Ikko Eltociear Ashimine
155f22d511
Update features.h ( #994 )
2024-06-12 15:47:44 +08:00
Fangjun Kuang
208da78343
Limit the maximum segment length for VAD. ( #990 )
2024-06-12 10:49:37 +08:00
Fangjun Kuang
1a43d1e37f
Support getting word IDs for CTC HLG decoding. ( #978 )
2024-06-06 14:22:39 +08:00
Manix
69347ffc8f
Support TensorRT provider ( #921 )
...
Signed-off-by: manickavela1998@gmail.com <manickavela1998@gmail.com>
Signed-off-by: manickavela1998@gmail.com <manickavela.arumugam@uniphore.com>
2024-06-06 10:45:28 +08:00
Fangjun Kuang
7e0931c762
Fix punctuation ( #976 )
2024-06-05 11:23:19 +08:00
Fangjun Kuang
fd5a0d1e00
Add C++ runtime for Tele-AI/TeleSpeech-ASR ( #970 )
2024-06-05 00:26:40 +08:00
Fangjun Kuang
f1cff83ef9
Add address sanitizer and undefined behavior sanitizer ( #951 )
2024-05-31 13:17:01 +08:00