Fangjun Kuang
9efe69720d
Support VITS VCTK models ( #367 )
...
* Support VITS VCTK models
* Release v1.8.1
2023-10-16 17:22:30 +08:00
yujinqiu
d01682d968
Add vad clear api for better performance ( #366 )
...
* Add vad clear api for better performance
* rename to make naming consistent and remove macro
* Fix linker error
* Fix Vad.kt
2023-10-16 14:40:47 +08:00
Fangjun Kuang
655e0fa836
add python API and examples for TTS ( #364 )
2023-10-14 14:21:53 +08:00
Fangjun Kuang
1ac2232e14
Support writing generated audio samples to wave files ( #363 )
2023-10-13 23:36:03 +08:00
Fangjun Kuang
536d5804ba
Add TTS with VITS ( #360 )
2023-10-13 19:30:38 +08:00
Peng He
4771c9275c
Add lm decode for the Python API. ( #353 )
...
* Add lm decode for the Python API.
* fix style.
* Fix LogAdd: shouldn't double lm_log_prob when merging paths with the same prefix
* sort the import alphabetically
2023-10-13 11:15:16 +08:00
Fangjun Kuang
323f532ad2
Fix symbol table for byte bpe ( #361 )
2023-10-13 10:51:59 +08:00
Fangjun Kuang
98b67ad850
Fix reading hotwords file for android ( #354 )
2023-10-11 12:20:50 +08:00
Fangjun Kuang
be081017de
Fix typos/bugs ( #351 )
2023-10-08 11:39:59 +08:00
Fangjun Kuang
407602445d
Add CTC HLG decoding using OpenFst ( #349 )
2023-10-08 11:32:39 +08:00
Nickolay V. Shmyrev
c12286fe5e
Proper convolution mode for fast GPU processing ( #350 )
2023-10-07 20:24:57 +08:00
Fangjun Kuang
33a5765169
Print a more user-friendly error message when using --hotwords-file. ( #344 )
2023-09-26 11:04:20 +08:00
Fangjun Kuang
552a267c23
Set is_final and start_time for online websocket server. ( #342 )
...
* Set is_final and start_time for online websocket server.
* Convert timestamps to a json array
2023-09-25 15:12:07 +08:00
poor1017
c2518a5826
Supports cmake compilation compatible with v3.13. ( #340 )
...
Co-authored-by: chenyu <cheny65@chinatelecom.cn>
2023-09-25 11:48:55 +08:00
dym21
fef61080de
Added #include <cstdint> to fix gcc 13.2 compilation error. ( #339 )
2023-09-25 10:38:26 +08:00
Fangjun Kuang
6e60a77d89
Add Android APK for Silero VAD ( #335 )
2023-09-23 20:39:13 +08:00
Fangjun Kuang
43b2b7760d
Fix tokens processing for byte-level BPE ( #333 )
2023-09-22 13:28:19 +08:00
Fangjun Kuang
532ed142d2
Support linking onnxruntime lib statically on Linux ( #326 )
2023-09-21 10:15:42 +08:00
Fangjun Kuang
6afa9c85f6
Fix tokens for byte-level BPE token. ( #324 )
2023-09-20 07:49:53 +08:00
keanu
bd173b27cc
Offline decode support multi threads ( #306 )
...
Co-authored-by: cuidongcai1035 <cuidongcai1035@wezhuiyi.com>
2023-09-19 21:04:13 +08:00
Fangjun Kuang
692a47dd80
Add Swift example for generating subtitles ( #318 )
2023-09-18 15:16:54 +08:00
Peng He
5ca0ff8811
Fix LogAdd ( #316 )
...
Using 0 as the initial value, should not perform addition when both values are 0
2023-09-18 10:43:04 +08:00
Fangjun Kuang
c471423125
Add Silero VAD ( #313 )
2023-09-17 14:54:38 +08:00
Fangjun Kuang
e2be532b32
Add timestamps for offline paraformer ( #310 )
2023-09-14 19:33:41 +08:00
Wei Kang
47184f9db7
Refactor hotwords, support loading hotwords from file ( #296 )
2023-09-14 19:33:17 +08:00
Fangjun Kuang
d46b7ec178
Catch exception from non-streaming paraformer. ( #307 )
2023-09-12 16:44:33 +08:00
Fangjun Kuang
debab7c091
Add two-pass speech recognition Android/iOS demo ( #304 )
2023-09-12 15:40:16 +08:00
Fangjun Kuang
a12ebfab22
treat unk as blank ( #299 )
2023-09-07 15:12:29 +08:00
Fangjun Kuang
a0a747a0c0
add endpointing for online websocket server ( #294 )
2023-08-31 14:41:04 +08:00
Wei Kang
2b0152d2a2
Fix context graph ( #292 )
2023-08-28 19:39:22 +08:00
Fangjun Kuang
eb22b4845a
Fix a bug for multilingual ASR ( #281 )
2023-08-17 10:43:26 +08:00
Fangjun Kuang
f709c95c5f
Support multilingual whisper models ( #274 )
2023-08-16 00:28:52 +08:00
Fangjun Kuang
35526e26e1
Support paraformer on Android ( #264 )
2023-08-14 12:26:15 +08:00
Fangjun Kuang
6038e2aa62
Support streaming paraformer ( #263 )
2023-08-14 10:32:14 +08:00
Fangjun Kuang
a4bff28e21
Support TDNN models from the yesno recipe from icefall ( #262 )
2023-08-12 19:50:22 +08:00
frankyoujian
9dcad7e963
Reinitialize context state after Reset stream when using contexts ( #257 )
2023-08-10 14:19:40 +08:00
Fangjun Kuang
865fd1e017
Support pkg-config ( #253 )
2023-08-10 11:22:36 +08:00
Fangjun Kuang
79c2ce5dd4
Refactor online recognizer ( #250 )
...
* Refactor online recognizer.
Make it easier to support other streaming models.
Note that it is a breaking change for the Python API.
`sherpa_onnx.OnlineRecognizer()` used before should be
replaced by `sherpa_onnx.OnlineRecognizer.from_transducer()`.
2023-08-09 20:27:31 +08:00
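A minimal sketch of the Python migration described in #250, assuming a streaming transducer model. The file names (`tokens.txt`, `encoder.onnx`, `decoder.onnx`, `joiner.onnx`) and the helper names `transducer_kwargs` and `create_recognizer` are hypothetical and used only for illustration; only the four model-path keyword arguments are passed, and any other options are left at their defaults.

```python
from pathlib import Path


def transducer_kwargs(model_dir):
    """Collect model paths as keyword arguments (file names are hypothetical)."""
    d = Path(model_dir)
    return {
        "tokens": str(d / "tokens.txt"),
        "encoder": str(d / "encoder.onnx"),
        "decoder": str(d / "decoder.onnx"),
        "joiner": str(d / "joiner.onnx"),
    }


def create_recognizer(model_dir):
    # Imported lazily so this file can be loaded without sherpa_onnx installed.
    import sherpa_onnx

    # Form used before #250: sherpa_onnx.OnlineRecognizer(...)
    # Form to use after #250: the named constructor for transducer models.
    return sherpa_onnx.OnlineRecognizer.from_transducer(
        **transducer_kwargs(model_dir)
    )
```

Calling `create_recognizer("/path/to/model")` requires `sherpa_onnx` to be installed and the model files to exist at those paths; `transducer_kwargs` alone has no such requirement.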
Fangjun Kuang
6061318e3f
fix building on linux with GPU ( #249 )
2023-08-09 20:21:28 +08:00
Fangjun Kuang
92bfee0424
Flush stderr on write ( #248 )
2023-08-09 15:33:01 +08:00
Fangjun Kuang
aa48b76d4b
Fix initial tokens to decoding ( #246 )
2023-08-09 12:33:47 +08:00
Fangjun Kuang
45b9d4ab37
Support whisper models ( #238 )
2023-08-07 12:34:18 +08:00
Fangjun Kuang
c5756734a9
Use parse options to parse arguments from sherpa-onnx-microphone ( #237 )
2023-08-05 18:05:18 +08:00
Jingzhao Ou
daffdab52a
Updated hypothesis key generation to be the same as sherpa ( #226 )
2023-07-28 14:19:49 +08:00
Fangjun Kuang
6125d9e063
Refactor onnxruntime.cmake ( #220 )
2023-07-18 15:44:54 +08:00
Wilson Wongso
5a6b55c5a7
Reduce model initialization time for online speech recognition ( #215 )
...
* Reduce model initialization time for online speech recognition
* Fixed Styling
---------
Co-authored-by: w11wo <wilsowong961@gmail.com>
2023-07-14 21:20:10 +08:00
Fangjun Kuang
f3206c49dc
Reduce model initialization time for offline speech recognition ( #213 )
2023-07-14 18:07:27 +08:00
Fangjun Kuang
bebc1f1398
Use static libraries for MFC examples ( #210 )
2023-07-13 14:52:43 +08:00
Wei Kang
513dfaa552
Support contextual-biasing for streaming model ( #184 )
...
* Support contextual-biasing for streaming model
* The whole pipeline runs normally
* Fix comments
2023-06-30 16:46:24 +08:00
danfu
1c3dac9001
support streaming zipformer2 ( #185 )
...
Co-authored-by: danfu <danfu@tencent.com>
2023-06-26 11:09:43 +08:00