46 Commits

Author SHA1 Message Date
Fangjun Kuang
6686c7d3e6 Add dict_dir arg to c api to support Chinese TTS models using jieba (#809) 2024-04-25 12:28:31 +08:00
Fangjun Kuang
c1608b3524 Support CED models (#792) 2024-04-19 15:20:37 +08:00
Fangjun Kuang
13730ecbd8 Add C API for punctuation (#768) 2024-04-14 19:02:34 +08:00
Fangjun Kuang
f204e62b44 Add C API for audio tagging (#754) 2024-04-11 14:18:43 +08:00
Fangjun Kuang
a5f8fbc83f Support heteronyms in Chinese TTS (#738) 2024-04-08 11:01:30 +08:00
Fangjun Kuang
c1c0f5bafd return timestamps for WebAssembly (#737) 2024-04-05 20:24:27 +08:00
Fangjun Kuang
dbff2eaadb Add C API for streaming HLG decoding (#734) 2024-04-05 10:31:20 +08:00
Fangjun Kuang
2e0bccad36 Add C API for speaker embedding extractor. (#711) 2024-03-28 18:05:40 +08:00
Leo Huang
638f48f47a Added progress for callback of tts generator (#712)
Co-authored-by: leohwang <leohwang@360converter.com>
2024-03-28 17:12:20 +08:00
Fangjun Kuang
ab7cff2513 Add C API for spoken language identification. (#695) 2024-03-25 15:16:47 +08:00
Fangjun Kuang
1952772654 Add timestamps and tokens for .Net's online models. (#690) 2024-03-23 18:51:56 +08:00
Fangjun Kuang
acf0975153 Support whisper language/task in various language bindings. (#679) 2024-03-20 16:43:35 +08:00
Viggo
842d04d7ae support whisper language (#678) 2024-03-20 10:16:22 +08:00
xinhecuican
f43139e803 c++ api for keyword spotter (#642) 2024-03-11 10:23:46 +08:00
Fangjun Kuang
3232dff2cf Support user provided data in tts callback. (#653) 2024-03-09 18:15:03 +08:00
Fangjun Kuang
d771762868 Support WebAssembly for text-to-speech (#577) 2024-02-08 23:39:12 +08:00
Fangjun Kuang
e475e750ac Support streaming zipformer CTC (#496)
* Support streaming zipformer CTC

* test online zipformer2 CTC

* Update doc of sherpa-onnx.cc

* Add Python APIs for streaming zipformer2 ctc

* Add Python API examples for streaming zipformer2 ctc

* Swift API for streaming zipformer2 CTC

* NodeJS API for streaming zipformer2 CTC

* Kotlin API for streaming zipformer2 CTC

* Golang API for streaming zipformer2 CTC

* C# API for streaming zipformer2 CTC

* Release v1.9.6
2023-12-22 13:46:33 +08:00
Fangjun Kuang
99ff6a834c Play generated audio as it is generating. (#457) 2023-12-02 15:35:11 +08:00
Fangjun Kuang
62dc3c3e46 Use piper-phonemize to convert text to token IDs (#453) 2023-11-30 23:57:43 +08:00
Fangjun Kuang
fe977b8e8e support nodejs (#438) 2023-11-21 23:20:08 +08:00
Fangjun Kuang
1937717705 Add MFC TTS example on Windows (#378) 2023-10-21 00:13:07 +08:00
Fangjun Kuang
a69d0a950e Add Go API for TTS (#377) 2023-10-20 15:57:52 +08:00
Fangjun Kuang
ea7c45b60c Add C API for offline TTS. (#373) 2023-10-19 17:38:23 +08:00
Fangjun Kuang
eead16e27f Fix CI for pip install (#371) 2023-10-19 10:43:14 +08:00
yujinqiu
d01682d968 Add vad clear api for better performance (#366)
* Add vad clear api for better performance

* rename to make naming consistent and remove macro

* Fix linker error

* Fix Vad.kt
2023-10-16 14:40:47 +08:00
yujinqiu
f6566c8ace Expose VAD isDetected api to Swift (#356) 2023-10-12 15:11:58 +08:00
Fangjun Kuang
cf199ad466 Support onnxruntime 1.16.0 (#330) 2023-09-21 20:39:24 +08:00
Nick Fisher
b3e9986825 Add CreateOnlineStreamWithHotwords to C API (#323)
* add default visibility to SHERPA_ONNX_EXPORT

* expose CreateOnlineStreamWithHotwords method via C API

Co-authored-by: Nick Fisher <nick.fisher@polyvox.app>
2023-09-19 17:32:42 +08:00
Wei Kang
a5d1c90807 Support c-api (#317) 2023-09-18 16:24:57 +08:00
Fangjun Kuang
692a47dd80 Add Swift example for generating subtitles (#318) 2023-09-18 15:16:54 +08:00
Fangjun Kuang
e2be532b32 Add timestamps for offline paraformer (#310) 2023-09-14 19:33:41 +08:00
Fangjun Kuang
e31f9e48c2 Fix various language binding APIs for tdnn and whisper models (#278) 2023-08-16 22:15:10 +08:00
Fangjun Kuang
bc791d4996 Fix C api for Go and MFC to support streaming paraformer (#268) 2023-08-14 17:02:23 +08:00
Fangjun Kuang
a8bdb4b38a Support paraformer on iOS (#265)
* Fix C API to support streaming paraformer

* Fix Swift API

* Support paraformer in iOS
2023-08-14 14:38:41 +08:00
Wilson Wongso
64efbd82af Implement Tokens in Swift and Kotlin (#227)
Co-authored-by: duc <duc@appiphany.com.au>
2023-08-05 18:37:03 +08:00
Fangjun Kuang
6125d9e063 Refactor onnxruntime.cmake (#220) 2023-07-18 15:44:54 +08:00
Fangjun Kuang
de2673680e Fix model_type for jni, c# and iOS. (#216) 2023-07-14 22:24:38 +08:00
Wilson Wongso
5a6b55c5a7 Reduce model initialization time for online speech recognition (#215)
* Reduce model initialization time for online speech recognition

* Fixed Styling

---------

Co-authored-by: w11wo <wilsowong961@gmail.com>
2023-07-14 21:20:10 +08:00
Fangjun Kuang
f3206c49dc Reduce model initialization time for offline speech recognition (#213) 2023-07-14 18:07:27 +08:00
Jingzhao Ou
0ed501b8f1 Added provider option to sherpa-onnx and decode-file-c-api (#162) 2023-06-03 04:57:48 +08:00
Fangjun Kuang
959f13eac8 Fix typos in .Net APIs (#156) 2023-05-14 22:32:01 +08:00
Fangjun Kuang
7969cf44ac Refactor C# code and support building nuget packages for cross-platforms (#144) 2023-05-10 14:53:04 +08:00
Fangjun Kuang
9d8fddef01 Support resampling (#77) 2023-03-03 16:42:33 +08:00
Fangjun Kuang
5f31b22c12 Fix modified beam search for iOS and android (#76)
* Use Int type for sampling rate

* Fix swift

* Fix iOS
2023-03-03 15:18:31 +08:00
Fangjun Kuang
7f72c13d9a Code refactoring (#74)
* Don't reset model state and feature extractor on endpointing

* support passing decoding_method from commandline

* Add modified_beam_search to Python API

* fix C API example

* Fix style issues
2023-03-03 12:10:59 +08:00
Fangjun Kuang
c63c4c3389 C api (#60) 2023-02-24 16:42:46 +08:00