359 Commits

Author SHA1 Message Date
Fangjun Kuang
e215d0c39a Fix Byte BPE string results for Python. (#512)
It ignores invalid UTF8 strings.
2024-01-03 16:03:24 +08:00
Fangjun Kuang
d01142173a Add missing field for two-pass APK. (#511) 2024-01-03 12:51:54 +08:00
Fangjun Kuang
581eceb4d5 Build text-to-speech engine APKs (#509) 2024-01-01 12:44:20 +08:00
Fangjun Kuang
d7e10bb3f8 Replace Android system TTS engine (#508) 2023-12-31 23:02:35 +08:00
Fangjun Kuang
e475e750ac Support streaming zipformer CTC (#496)
* Support streaming zipformer CTC
* test online zipformer2 CTC
* Update doc of sherpa-onnx.cc
* Add Python APIs for streaming zipformer2 ctc
* Add Python API examples for streaming zipformer2 ctc
* Swift API for streaming zipformer2 CTC
* NodeJS API for streaming zipformer2 CTC
* Kotlin API for streaming zipformer2 CTC
* Golang API for streaming zipformer2 CTC
* C# API for streaming zipformer2 CTC
* Release v1.9.6
2023-12-22 13:46:33 +08:00
Fangjun Kuang
7634f5f034 Release Python GIL in C++ class constructor (#493) 2023-12-20 15:54:32 +08:00
Fangjun Kuang
ef8d112aaa Fix whisper test script for the latest onnxruntime. (#494) 2023-12-20 11:12:12 +08:00
Fangjun Kuang
03ff9db56e Keep multiple threads from calling into espeak-ng at the same time (#489) 2023-12-15 17:44:33 +08:00
Fangjun Kuang
ad72e7afc3 Print informative error messages for sherpa-onnx-alsa on errors. (#486) 2023-12-15 11:10:39 +08:00
Fangjun Kuang
33c03f78b2 Fix CI (#485) 2023-12-15 10:25:03 +08:00
Fangjun Kuang
9ff6185b7c fix building linux x86 wheels (#484) 2023-12-14 21:37:40 +08:00
Fangjun Kuang
b18812ceff Play generated audio using alsa for TTS (#482) 2023-12-13 22:28:03 +08:00
Fangjun Kuang
9829d7c4d3 Add two GLaDOS TTS models (#481) 2023-12-13 15:40:07 +08:00
Fangjun Kuang
80d0192325 Fix android tts audio buffer size and fix CI. (#478) 2023-12-10 18:25:50 +08:00
Fangjun Kuang
0f053d8040 Support playing as it is generating for Android (#477) 2023-12-09 16:36:38 +08:00
Fangjun Kuang
cae0231f93 Fix releasing go packages (#476) 2023-12-09 00:07:52 +08:00
Fangjun Kuang
aef74c5125 convert wespeaker models to sherpa-onnx (#475) 2023-12-08 19:32:29 +08:00
Fangjun Kuang
0e23f82691 Give an informative log for whisper on exceptions. (#473) 2023-12-08 14:33:59 +08:00
Fangjun Kuang
868c339e5e Support distil-small.en whisper (#472) 2023-12-08 11:59:20 +08:00
Fangjun Kuang
3ae984f148 Remove the 30-second constraint from whisper. (#471) 2023-12-07 17:47:08 +08:00
Fangjun Kuang
a7d69359c9 Release v1.9.0 (#470) 2023-12-06 19:46:50 +08:00
Fangjun Kuang
d34161413d Support Ukrainian VITS models from coqui-ai/TTS (#469) 2023-12-06 19:37:11 +08:00
Fangjun Kuang
23cf92daf7 Use espeak-ng for coqui-ai/TTS VITS English models. (#466) 2023-12-06 11:00:38 +08:00
Fangjun Kuang
3b90e85ef2 Fix building for .Net (#463) 2023-12-04 19:27:55 +08:00
Fangjun Kuang
73afa0248b Support playing generated audio as it is generating for MFC. (#462)
* Support playing generated audio as it is generating for MFC.
* support espeak-ng-data
2023-12-04 14:23:38 +08:00
Fangjun Kuang
86b4be5260 Break text into sentences for tts. (#460)
This is for models that are not using piper-phonemize as their front-end.
2023-12-03 11:50:25 +08:00
Fangjun Kuang
99ff6a834c Play generated audio as it is generating. (#457) 2023-12-02 15:35:11 +08:00
Fangjun Kuang
539b27e575 Fix CI (#456) 2023-12-01 11:00:16 +08:00
Fangjun Kuang
62dc3c3e46 Use piper-phonemize to convert text to token IDs (#453) 2023-11-30 23:57:43 +08:00
Fangjun Kuang
db41778e99 Support piper-phonemize (#452) 2023-11-28 19:12:58 +08:00
Fangjun Kuang
87a47d7db4 Release GIL to support multithreading in websocket servers. (#451) 2023-11-27 13:44:03 +08:00
Fangjun Kuang
8dc08a9b97 Fix nodejs on Windows (#450) 2023-11-25 21:23:15 +08:00
Fangjun Kuang
66cad9fa93 Fix reading tokens.txt on Windows (#448) 2023-11-25 14:22:26 +08:00
Fangjun Kuang
8444d54c4e Update to onnxruntime 1.16.3 (#446) 2023-11-24 14:39:03 +08:00
HieDean
2a91524dbf Lock before push_back the deque for thread safety (#445)
Co-authored-by: hiedean <hiedean@tju.edu.cn>
2023-11-24 10:23:25 +08:00
Fangjun Kuang
94ef6929bb Text-to-speech for iOS (#443) 2023-11-23 21:38:32 +08:00
Fangjun Kuang
2f22e6ed63 Add Swift API for TTS (#439) 2023-11-22 16:04:26 +08:00
Fangjun Kuang
fe977b8e8e support nodejs (#438) 2023-11-21 23:20:08 +08:00
Fangjun Kuang
38ad05bdf8 Refactor building wheels (#436) 2023-11-20 12:33:06 +08:00
HieDean
e6a2d0da3b Replace Clone() with View() (#432)
Co-authored-by: hiedean <hiedean@tju.edu.cn>
2023-11-20 09:20:50 +08:00
Fangjun Kuang
ac00edab5b Build MFC examples for Windows x86 (Win32) (#434)
Also, strip binaries on Linux before uploading.
2023-11-18 16:13:09 +08:00
HieDean
1a6a41eb2c Judge before UseCachedDecoderOut (#431)
Co-authored-by: hiedean <hiedean@tju.edu.cn>
2023-11-17 12:07:47 +08:00
Fangjun Kuang
eeda1e190e Build building for iOS (#430) 2023-11-16 21:14:25 +08:00
Fangjun Kuang
049fb9f451 Add Python APIs for WeNet CTC models (#428) 2023-11-16 14:20:41 +08:00
Fangjun Kuang
fac4f6bc7c Support streaming conformer CTC models from wenet (#427) 2023-11-16 10:35:23 +08:00
Fangjun Kuang
b83b3e3cd1 Support non-streaming WeNet CTC models. (#426) 2023-11-15 14:23:20 +08:00
Fangjun Kuang
d34640e3a3 Add scripts to export ASR models from wenet to ONNX (#425)
See https://user-images.githubusercontent.com/5284924/282995968-f6d39118-8008-4ce7-9d7c-d1d6387ac183.png
2023-11-15 11:41:15 +08:00
Fangjun Kuang
097d641869 Resize circular buffer on overflow (#422) 2023-11-13 12:07:51 +08:00
Fangjun Kuang
9884cf71e7 Update onnxruntime to v1.16.2 (#421) 2023-11-12 11:29:33 +08:00
Fangjun Kuang
68f0e59688 Add a C++ example to show streaming VAD + non-streaming ASR. (#420) 2023-11-11 22:54:27 +08:00