Fangjun Kuang
329fe1aa8b
Support adding punctuation to the speech recognition result ( #761 )
2024-04-13 12:15:57 +08:00
Fangjun Kuang
042976ea6e
Add C++ microphone examples for audio tagging ( #749 )
2024-04-10 21:00:35 +08:00
Fangjun Kuang
f20291cadc
Support audio tagging using zipformer ( #747 )
2024-04-10 14:47:06 +08:00
Fangjun Kuang
6fb8ceda57
Add VAD examples using ALSA for recording ( #739 )
2024-04-08 16:41:01 +08:00
Fangjun Kuang
a5f8fbc83f
Support heteronyms in Chinese TTS ( #738 )
2024-04-08 11:01:30 +08:00
Fangjun Kuang
db67e00c77
Add HLG decoding for streaming CTC models ( #731 )
2024-04-03 21:31:42 +08:00
Fangjun Kuang
4e040c596e
Support including TTS conditionally. ( #699 )
2024-03-26 17:21:35 +08:00
Fangjun Kuang
0d258dd150
Support spoken language identification with whisper ( #694 )
2024-03-24 22:57:00 +08:00
Wei Kang
734bbd91dc
Add Python API for keyword spotting ( #576 )
...
* Add alsa & microphone support for keyword spotting
* Add python wrapper
2024-03-01 09:31:11 +08:00
Fangjun Kuang
87a7030c08
Support using alsa to access the microphone with non-streaming ASR models ( #517 )
2024-02-26 21:17:26 +08:00
Fangjun Kuang
67acd34dcd
Use alsa to read microphone in speaker identification demo. ( #605 )
2024-02-23 19:27:51 +08:00
Fangjun Kuang
099a0ccae3
Link the math lib. ( #592 )
2024-02-21 15:36:54 +08:00
Fangjun Kuang
d771762868
Support WebAssembly for text-to-speech ( #577 )
2024-02-08 23:39:12 +08:00
Fangjun Kuang
0b18ccfbb2
C++ API demo for speaker identification with portaudio. ( #561 )
2024-01-30 11:21:43 +08:00
Wei Kang
b6c020901a
decoder for open vocabulary keyword spotting ( #505 )
...
* various fixes to ContextGraph to support open vocabulary keywords decoder
* Add keyword spotter runtime
* Add binary
* First version works
* Minor fixes
* update text2token
* default values
* Add jni for kws
* add kws android project
* Minor fixes
* Remove unused interface
* Minor fixes
* Add workflow
* handle extra info in texts
* Minor fixes
* Add more comments
* Fix ci
* fix cpp style
* Add input box in android demo so that users can specify their keywords
* Fix cpp style
* Fix comments
* Minor fixes
* Minor fixes
* minor fixes
* Minor fixes
* Minor fixes
* Add CI
* Fix code style
* cpplint
* Fix comments
* Fix error
2024-01-20 22:52:41 +08:00
Fangjun Kuang
2024e96639
Add C++ runtime for speaker verification models from NeMo ( #527 )
2024-01-13 21:42:09 +08:00
Fangjun Kuang
afc81ec122
Add C++ runtime for models from 3d-speaker ( #523 )
2024-01-11 19:10:30 +08:00
Fangjun Kuang
55266918c8
Add runtime support for wespeaker models ( #516 )
2024-01-09 22:06:08 +08:00
Fangjun Kuang
e475e750ac
Support streaming zipformer CTC ( #496 )
...
* Support streaming zipformer CTC
* test online zipformer2 CTC
* Update doc of sherpa-onnx.cc
* Add Python APIs for streaming zipformer2 ctc
* Add Python API examples for streaming zipformer2 ctc
* Swift API for streaming zipformer2 CTC
* NodeJS API for streaming zipformer2 CTC
* Kotlin API for streaming zipformer2 CTC
* Golang API for streaming zipformer2 CTC
* C# API for streaming zipformer2 CTC
* Release v1.9.6
2023-12-22 13:46:33 +08:00
Fangjun Kuang
b18812ceff
Play generated audio using alsa for TTS ( #482 )
2023-12-13 22:28:03 +08:00
Fangjun Kuang
d34161413d
Support Ukrainian VITS models from coqui-ai/TTS ( #469 )
2023-12-06 19:37:11 +08:00
Fangjun Kuang
99ff6a834c
Play generated audio as it is being generated. ( #457 )
2023-12-02 15:35:11 +08:00
Fangjun Kuang
62dc3c3e46
Use piper-phonemize to convert text to token IDs ( #453 )
2023-11-30 23:57:43 +08:00
Fangjun Kuang
db41778e99
Support piper-phonemize ( #452 )
2023-11-28 19:12:58 +08:00
Fangjun Kuang
fac4f6bc7c
Support streaming conformer CTC models from wenet ( #427 )
2023-11-16 10:35:23 +08:00
Fangjun Kuang
b83b3e3cd1
Support non-streaming WeNet CTC models. ( #426 )
2023-11-15 14:23:20 +08:00
Fangjun Kuang
68f0e59688
Add a C++ example to show streaming VAD + non-streaming ASR. ( #420 )
2023-11-11 22:54:27 +08:00
Fangjun Kuang
b80b7e5144
Support linking onnxruntime statically for macOS ( #403 )
2023-10-31 20:24:43 +08:00
Fangjun Kuang
fabbc70633
Support static linking onnxruntime for 64-bit ARM ( #402 )
2023-10-31 16:51:04 +08:00
Fangjun Kuang
2f2d3bbd82
Support static linking onnxruntime lib for 32-bit arm ( #401 )
2023-10-31 11:19:01 +08:00
Fangjun Kuang
1ee79e3ff5
Support Chinese vits models ( #368 )
2023-10-18 10:19:10 +08:00
Fangjun Kuang
1ac2232e14
Support writing generated audio samples to wave files ( #363 )
2023-10-13 23:36:03 +08:00
Fangjun Kuang
536d5804ba
Add TTS with VITS ( #360 )
2023-10-13 19:30:38 +08:00
Fangjun Kuang
407602445d
Add CTC HLG decoding using OpenFst ( #349 )
2023-10-08 11:32:39 +08:00
poor1017
c2518a5826
Supports cmake compilation compatible with v3.13. ( #340 )
...
Co-authored-by: chenyu <cheny65@chinatelecom.cn>
2023-09-25 11:48:55 +08:00
Fangjun Kuang
532ed142d2
Support linking onnxruntime lib statically on Linux ( #326 )
2023-09-21 10:15:42 +08:00
keanu
bd173b27cc
Offline decoding supports multiple threads ( #306 )
...
Co-authored-by: cuidongcai1035 <cuidongcai1035@wezhuiyi.com>
2023-09-19 21:04:13 +08:00
Fangjun Kuang
c471423125
Add Silero VAD ( #313 )
2023-09-17 14:54:38 +08:00
Wei Kang
47184f9db7
Refactor hotwords, support loading hotwords from file ( #296 )
2023-09-14 19:33:17 +08:00
Fangjun Kuang
6038e2aa62
Support streaming paraformer ( #263 )
2023-08-14 10:32:14 +08:00
Fangjun Kuang
a4bff28e21
Support TDNN models from the yesno recipe from icefall ( #262 )
2023-08-12 19:50:22 +08:00
Fangjun Kuang
79c2ce5dd4
Refactor online recognizer ( #250 )
...
* Refactor online recognizer.
Make it easier to support other streaming models.
Note that it is a breaking change for the Python API.
`sherpa_onnx.OnlineRecognizer()` used before should be
replaced by `sherpa_onnx.OnlineRecognizer.from_transducer()`.
2023-08-09 20:27:31 +08:00
Fangjun Kuang
6061318e3f
fix building on linux with GPU ( #249 )
2023-08-09 20:21:28 +08:00
Fangjun Kuang
45b9d4ab37
Support whisper models ( #238 )
2023-08-07 12:34:18 +08:00
Fangjun Kuang
6125d9e063
Refactor onnxruntime.cmake ( #220 )
2023-07-18 15:44:54 +08:00
Fangjun Kuang
bebc1f1398
Use static libraries for MFC examples ( #210 )
2023-07-13 14:52:43 +08:00
danfu
1c3dac9001
support streaming zipformer2 ( #185 )
...
Co-authored-by: danfu <danfu@tencent.com>
2023-06-26 11:09:43 +08:00
Wei Kang
8562711252
Implement context biasing with an Aho-Corasick automaton ( #145 )
...
* Implement context graph
* Modify the interface to support context biasing
* Support context biasing in modified beam search; add python wrapper
* Support context biasing in python api example
* Minor fixes
* Fix context graph
* Minor fixes
* Fix tests
* Fix style
* Fix style
* Fix comments
* Minor fixes
* Add missing header
* Replace std::shared_ptr with std::unique_ptr for efficiency
* Build graph in constructor
* Fix comments
* Minor fixes
* Fix docs
2023-06-16 14:26:36 +08:00
Yuekai Zhang
b8fbf8e5ce
Add onnxruntime gpu for cmake ( #153 )
...
* add onnxruntime gpu for cmake
* fix clang
* fix typo
* cpplint
2023-05-12 22:30:47 +08:00
Fangjun Kuang
cea718e3d8
Support CoreML for macOS ( #151 )
2023-05-12 15:57:44 +08:00