Commit Graph

78 Commits

Author SHA1 Message Date
Fangjun Kuang
4e040c596e Support including TTS conditionally. (#699) 2024-03-26 17:21:35 +08:00
Fangjun Kuang
0d258dd150 Support spoken language identification with whisper (#694) 2024-03-24 22:57:00 +08:00
Wei Kang
734bbd91dc Add Python API for keyword spotting (#576)
* Add alsa & microphone support for keyword spotting

* Add python wrapper
2024-03-01 09:31:11 +08:00
Fangjun Kuang
87a7030c08 Support using alsa to access the microphone with non-streaming ASR models (#517) 2024-02-26 21:17:26 +08:00
Fangjun Kuang
67acd34dcd Use alsa to read microphone in speaker identification demo. (#605) 2024-02-23 19:27:51 +08:00
Fangjun Kuang
099a0ccae3 Link the math lib. (#592) 2024-02-21 15:36:54 +08:00
Fangjun Kuang
d771762868 Support WebAssembly for text-to-speech (#577) 2024-02-08 23:39:12 +08:00
Fangjun Kuang
0b18ccfbb2 C++ API demo for speaker identification with portaudio. (#561) 2024-01-30 11:21:43 +08:00
Wei Kang
b6c020901a decoder for open vocabulary keyword spotting (#505)
* various fixes to ContextGraph to support open vocabulary keywords decoder

* Add keyword spotter runtime

* Add binary

* First version works

* Minor fixes

* update text2token

* default values

* Add jni for kws

* add kws android project

* Minor fixes

* Remove unused interface

* Minor fixes

* Add workflow

* handle extra info in texts

* Minor fixes

* Add more comments

* Fix ci

* fix cpp style

* Add input box in android demo so that users can specify their keywords

* Fix cpp style

* Fix comments

* Minor fixes

* Minor fixes

* minor fixes

* Minor fixes

* Minor fixes

* Add CI

* Fix code style

* cpplint

* Fix comments

* Fix error
2024-01-20 22:52:41 +08:00
Fangjun Kuang
2024e96639 Add C++ runtime for speaker verification models from NeMo (#527) 2024-01-13 21:42:09 +08:00
Fangjun Kuang
afc81ec122 Add C++ runtime for models from 3d-speaker (#523) 2024-01-11 19:10:30 +08:00
Fangjun Kuang
55266918c8 Add runtime support for wespeaker models (#516) 2024-01-09 22:06:08 +08:00
Fangjun Kuang
e475e750ac Support streaming zipformer CTC (#496)
* Support streaming zipformer CTC

* test online zipformer2 CTC

* Update doc of sherpa-onnx.cc

* Add Python APIs for streaming zipformer2 ctc

* Add Python API examples for streaming zipformer2 ctc

* Swift API for streaming zipformer2 CTC

* NodeJS API for streaming zipformer2 CTC

* Kotlin API for streaming zipformer2 CTC

* Golang API for streaming zipformer2 CTC

* C# API for streaming zipformer2 CTC

* Release v1.9.6
2023-12-22 13:46:33 +08:00
Fangjun Kuang
b18812ceff Play generated audio using alsa for TTS (#482) 2023-12-13 22:28:03 +08:00
Fangjun Kuang
d34161413d Support Ukrainian VITS models from coqui-ai/TTS (#469) 2023-12-06 19:37:11 +08:00
Fangjun Kuang
99ff6a834c Play generated audio as it is generating. (#457) 2023-12-02 15:35:11 +08:00
Fangjun Kuang
62dc3c3e46 Use piper-phonemize to convert text to token IDs (#453) 2023-11-30 23:57:43 +08:00
Fangjun Kuang
db41778e99 Support piper-phonemize (#452) 2023-11-28 19:12:58 +08:00
Fangjun Kuang
fac4f6bc7c Support streaming conformer CTC models from wenet (#427) 2023-11-16 10:35:23 +08:00
Fangjun Kuang
b83b3e3cd1 Support non-streaming WeNet CTC models. (#426) 2023-11-15 14:23:20 +08:00
Fangjun Kuang
68f0e59688 Add a C++ example to show streaming VAD + non-streaming ASR. (#420) 2023-11-11 22:54:27 +08:00
Fangjun Kuang
b80b7e5144 Support linking onnxruntime statically for macOS (#403) 2023-10-31 20:24:43 +08:00
Fangjun Kuang
fabbc70633 Support static linking onnxruntime for 64-bit ARM (#402) 2023-10-31 16:51:04 +08:00
Fangjun Kuang
2f2d3bbd82 Support static linking onnxruntime lib for 32-bit arm (#401) 2023-10-31 11:19:01 +08:00
Fangjun Kuang
1ee79e3ff5 Support Chinese vits models (#368) 2023-10-18 10:19:10 +08:00
Fangjun Kuang
1ac2232e14 Support writing generated audio samples to wave files (#363) 2023-10-13 23:36:03 +08:00
Fangjun Kuang
536d5804ba Add TTS with VITS (#360) 2023-10-13 19:30:38 +08:00
Fangjun Kuang
407602445d Add CTC HLG decoding using OpenFst (#349) 2023-10-08 11:32:39 +08:00
poor1017
c2518a5826 Supports cmake compilation compatible with v3.13. (#340)
Co-authored-by: chenyu <cheny65@chinatelecom.cn>
2023-09-25 11:48:55 +08:00
Fangjun Kuang
532ed142d2 Support linking onnxruntime lib statically on Linux (#326) 2023-09-21 10:15:42 +08:00
keanu
bd173b27cc Offline decode support multi threads (#306)
Co-authored-by: cuidongcai1035 <cuidongcai1035@wezhuiyi.com>
2023-09-19 21:04:13 +08:00
Fangjun Kuang
c471423125 Add Silero VAD (#313) 2023-09-17 14:54:38 +08:00
Wei Kang
47184f9db7 Refactor hotwords,support loading hotwords from file (#296) 2023-09-14 19:33:17 +08:00
Fangjun Kuang
6038e2aa62 Support streaming paraformer (#263) 2023-08-14 10:32:14 +08:00
Fangjun Kuang
a4bff28e21 Support TDNN models from the yesno recipe from icefall (#262) 2023-08-12 19:50:22 +08:00
Fangjun Kuang
79c2ce5dd4 Refactor online recognizer (#250)
* Refactor online recognizer.

Make it easier to support other streaming models.

Note that it is a breaking change for the Python API.
`sherpa_onnx.OnlineRecognizer()` used before should be
replaced by `sherpa_onnx.OnlineRecognizer.from_transducer()`.
2023-08-09 20:27:31 +08:00
Fangjun Kuang
6061318e3f fix building on linux with GPU (#249) 2023-08-09 20:21:28 +08:00
Fangjun Kuang
45b9d4ab37 Support whisper models (#238) 2023-08-07 12:34:18 +08:00
Fangjun Kuang
6125d9e063 Refactor onnxruntime.cmake (#220) 2023-07-18 15:44:54 +08:00
Fangjun Kuang
bebc1f1398 Use static libraries for MFC examples (#210) 2023-07-13 14:52:43 +08:00
danfu
1c3dac9001 support streaming zipformer2 (#185)
Co-authored-by: danfu <danfu@tencent.com>
2023-06-26 11:09:43 +08:00
Wei Kang
8562711252 Implement context biasing with a Aho Corasick automata (#145)
* Implement context graph

* Modify the interface to support context biasing

* Support context biasing in modified beam search; add python wrapper

* Support context biasing in python api example

* Minor fixes

* Fix context graph

* Minor fixes

* Fix tests

* Fix style

* Fix style

* Fix comments

* Minor fixes

* Add missing header

* Replace std::shared_ptr with std::unique_ptr for effciency

* Build graph in constructor

* Fix comments

* Minor fixes

* Fix docs
2023-06-16 14:26:36 +08:00
Yuekai Zhang
b8fbf8e5ce Add onnxruntime gpu for cmake (#153)
* add onnxruntime gpu for cmake

* fix clang

* fix typo

* cpplint
2023-05-12 22:30:47 +08:00
Fangjun Kuang
cea718e3d8 Support CoreML for macOS (#151) 2023-05-12 15:57:44 +08:00
Jingzhao Ou
0992063de8 Stack and streaming conformer support (#141)
* added csrc/stack.cc

* stack: added checks

* added copyright info

* passed cpp style checks

* formatted code

* added some support for streaming conformer model support (not verified)

* code lint

* made more progress with streaming conformer support (not working yet)

* passed style check

* changes as suggested by @csukuangfj

* added some debug info

* fixed style check

* Use Cat to replace Stack

* remove debug statements

---------

Co-authored-by: Jingzhao Ou (jou2019) <jou2019@cisco.com>
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-05-11 14:30:39 +08:00
PF Luo
8c6a6768d5 Add lm rescore to online-modified-beam-search (#133) 2023-05-05 21:23:54 +08:00
Fangjun Kuang
86017f9833 Add RNN LM rescore for offline ASR with modified_beam_search (#125) 2023-04-23 17:15:18 +08:00
Fangjun Kuang
80060c276d Begin to support CTC models (#119)
Please see https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/nemo/index.html for a list of pre-trained CTC models from NeMo.
2023-04-07 23:11:34 +08:00
Fangjun Kuang
726680c5e0 Install binaries via pip install (#112)
When pepole use pip install sherpa-onnx, they also get the following binaries:

(py38) fangjuns-MacBook-Pro:bin fangjun$ ls -lh  sherpa-onnx*
-rwxr-xr-x  1 fangjun  staff    36K Apr  4 13:48 sherpa-onnx
-rwxr-xr-x  1 fangjun  staff    52K Apr  4 13:48 sherpa-onnx-microphone
-rwxr-xr-x  1 fangjun  staff    54K Apr  4 13:48 sherpa-onnx-microphone-offline
-rwxr-xr-x  1 fangjun  staff    37K Apr  4 13:48 sherpa-onnx-offline
-rwxr-xr-x  1 fangjun  staff   634K Apr  4 13:48 sherpa-onnx-offline-websocket-server
-rwxr-xr-x  1 fangjun  staff   710K Apr  4 13:48 sherpa-onnx-online-websocket-client
-rwxr-xr-x  1 fangjun  staff   651K Apr  4 13:48 sherpa-onnx-online-websocket-server
(py38) fangjuns-MacBook-Pro:bin fangjun$ pwd
/Users/fangjun/py38/bin
2023-04-04 15:45:59 +08:00
Fangjun Kuang
b911915a32 Add microphone support for offline recognizer (#104) 2023-03-30 19:43:05 +08:00