421 Commits

Author SHA1 Message Date
Fangjun Kuang
7e0ae677c8 Add a Persian and a Slovenian model from Piper for Android TTS. (#531) 2024-01-15 15:00:15 +08:00
Fangjun Kuang
f4e3f45664 Fix setting speaker ID for Android TTS Engine. (#530) 2024-01-15 11:46:57 +08:00
Fangjun Kuang
229853b77e Android TTS APKs for Persian (#529) 2024-01-14 21:44:46 +08:00
Fangjun Kuang
2024e96639 Add C++ runtime for speaker verification models from NeMo (#527) 2024-01-13 21:42:09 +08:00
Fangjun Kuang
68a525a024 Export speaker verification models from NeMo to ONNX (#526) 2024-01-13 19:49:45 +08:00
Fangjun Kuang
afc81ec122 Add C++ runtime for models from 3d-speaker (#523) 2024-01-11 19:10:30 +08:00
Fangjun Kuang
ec728ff7f6 Fix publishing nuget packages. (#525) 2024-01-11 18:54:23 +08:00
Fangjun Kuang
07e2b9a36d Support exporting models to onnx from 3D-Speaker (#522) 2024-01-10 21:09:45 +08:00
Fangjun Kuang
55266918c8 Add runtime support for wespeaker models (#516) 2024-01-09 22:06:08 +08:00
Fangjun Kuang
902b21894b Use NDK 22.1 for android build (#518) 2024-01-05 20:34:01 +08:00
Fangjun Kuang
0be71a31f5 Use high_freq -400 in computing fbank features. (#515)
Fixes #514
2024-01-04 12:39:06 +08:00
Fangjun Kuang
547a22f7d9 Fix #510 (#513) 2024-01-04 12:32:19 +08:00
Fangjun Kuang
e215d0c39a Fix Byte BPE string results for Python. (#512)
It ignores invalid UTF8 strings.
2024-01-03 16:03:24 +08:00
Fangjun Kuang
d01142173a Add missing field for two-pass APK. (#511) 2024-01-03 12:51:54 +08:00
Fangjun Kuang
581eceb4d5 Build text-to-speech engine APKs (#509) 2024-01-01 12:44:20 +08:00
Fangjun Kuang
d7e10bb3f8 Replace Android system TTS engine (#508) 2023-12-31 23:02:35 +08:00
Fangjun Kuang
e475e750ac Support streaming zipformer CTC (#496)
* Support streaming zipformer CTC

* test online zipformer2 CTC

* Update doc of sherpa-onnx.cc

* Add Python APIs for streaming zipformer2 ctc

* Add Python API examples for streaming zipformer2 ctc

* Swift API for streaming zipformer2 CTC

* NodeJS API for streaming zipformer2 CTC

* Kotlin API for streaming zipformer2 CTC

* Golang API for streaming zipformer2 CTC

* C# API for streaming zipformer2 CTC

* Release v1.9.6
2023-12-22 13:46:33 +08:00
Fangjun Kuang
7634f5f034 Release Python GIL in C++ class constructor (#493) 2023-12-20 15:54:32 +08:00
Fangjun Kuang
ef8d112aaa Fix whisper test script for the latest onnxruntime. (#494) 2023-12-20 11:12:12 +08:00
Fangjun Kuang
03ff9db56e Keep multiple threads from calling into espeak-ng at the same time (#489) 2023-12-15 17:44:33 +08:00
Fangjun Kuang
ad72e7afc3 Print informative error messages for sherpa-onnx-alsa on errors. (#486) 2023-12-15 11:10:39 +08:00
Fangjun Kuang
33c03f78b2 Fix CI (#485) 2023-12-15 10:25:03 +08:00
Fangjun Kuang
9ff6185b7c fix building linux x86 wheels (#484) 2023-12-14 21:37:40 +08:00
Fangjun Kuang
b18812ceff Play generated audio using alsa for TTS (#482) 2023-12-13 22:28:03 +08:00
Fangjun Kuang
9829d7c4d3 Add two GLaDOS TTS models (#481) 2023-12-13 15:40:07 +08:00
Fangjun Kuang
80d0192325 Fix android tts audio buffer size and fix CI. (#478) 2023-12-10 18:25:50 +08:00
Fangjun Kuang
0f053d8040 Support playing as it is generating for Android (#477) 2023-12-09 16:36:38 +08:00
Fangjun Kuang
cae0231f93 Fix releasing go packages (#476) 2023-12-09 00:07:52 +08:00
Fangjun Kuang
aef74c5125 convert wespeaker models to sherpa-onnx (#475) 2023-12-08 19:32:29 +08:00
Fangjun Kuang
0e23f82691 Give an informative log for whisper on exceptions. (#473) 2023-12-08 14:33:59 +08:00
Fangjun Kuang
868c339e5e Support distil-small.en whisper (#472) 2023-12-08 11:59:20 +08:00
Fangjun Kuang
3ae984f148 Remove the 30-second constraint from whisper. (#471) 2023-12-07 17:47:08 +08:00
Fangjun Kuang
a7d69359c9 Release v1.9.0 (#470) 2023-12-06 19:46:50 +08:00
Fangjun Kuang
d34161413d Support Ukrainian VITS models from coqui-ai/TTS (#469) 2023-12-06 19:37:11 +08:00
Fangjun Kuang
23cf92daf7 Use espeak-ng for coqui-ai/TTS VITS English models. (#466) 2023-12-06 11:00:38 +08:00
Fangjun Kuang
3b90e85ef2 Fix building for .Net (#463) 2023-12-04 19:27:55 +08:00
Fangjun Kuang
73afa0248b Support playing generated audio as it is generating for MFC. (#462)
* Support playing generated audio as it is generating for MFC.

* support espeak-ng-data
2023-12-04 14:23:38 +08:00
Fangjun Kuang
86b4be5260 Break text into sentences for tts. (#460)
This is for models that are not using piper-phonemize as their front-end.
2023-12-03 11:50:25 +08:00
Fangjun Kuang
99ff6a834c Play generated audio as it is generating. (#457) 2023-12-02 15:35:11 +08:00
Fangjun Kuang
539b27e575 Fix CI (#456) 2023-12-01 11:00:16 +08:00
Fangjun Kuang
62dc3c3e46 Use piper-phonemize to convert text to token IDs (#453) 2023-11-30 23:57:43 +08:00
Fangjun Kuang
db41778e99 Support piper-phonemize (#452) 2023-11-28 19:12:58 +08:00
Fangjun Kuang
87a47d7db4 Release GIL to support multithreading in websocket servers. (#451) 2023-11-27 13:44:03 +08:00
Fangjun Kuang
8dc08a9b97 Fix nodejs on Windows (#450) 2023-11-25 21:23:15 +08:00
Fangjun Kuang
66cad9fa93 Fix reading tokens.txt on Windows (#448) 2023-11-25 14:22:26 +08:00
Fangjun Kuang
8444d54c4e Update to onnxruntime 1.16.3 (#446) 2023-11-24 14:39:03 +08:00
HieDean
2a91524dbf Lock before push_back the deque for thread safety (#445)
Co-authored-by: hiedean <hiedean@tju.edu.cn>
2023-11-24 10:23:25 +08:00
Fangjun Kuang
94ef6929bb Text-to-speech for iOS (#443) 2023-11-23 21:38:32 +08:00
Fangjun Kuang
2f22e6ed63 Add Swift API for TTS (#439) 2023-11-22 16:04:26 +08:00
Fangjun Kuang
fe977b8e8e support nodejs (#438) 2023-11-21 23:20:08 +08:00