Fangjun Kuang
31d6206fde
HarmonyOS support for VAD. ( #1561 )
2024-11-24 16:29:24 +08:00
Fangjun Kuang
f97daed408
Fixes #1512 ( #1522 )
2024-11-08 21:07:36 +08:00
Fangjun Kuang
669f5ef441
Add C++ runtime and Python APIs for Moonshine models ( #1473 )
2024-10-26 14:34:07 +08:00
Fangjun Kuang
b3e05f6dc4
Fix style issues ( #1458 )
2024-10-24 11:15:08 +08:00
Fangjun Kuang
59407edcad
C++ API for speaker diarization ( #1396 )
2024-10-09 12:01:20 +08:00
Fangjun Kuang
70568c2df7
Support Agglomerative clustering. ( #1384 )
...
We use the open-source implementation from
https://github.com/cdalitz/hclust-cpp
2024-09-29 23:44:29 +08:00
jianyou
1414e4dc61
Add online punctuation and casing prediction model for English language ( #1224 )
2024-08-06 17:33:38 +08:00
Fangjun Kuang
d5f486878d
Remove libonnxruntime_providers_cuda.so as a dependency. ( #1210 )
2024-08-03 16:25:23 +08:00
Fangjun Kuang
25f0a10468
Add C++ runtime for SenseVoice models ( #1148 )
2024-07-18 22:54:18 +08:00
Fangjun Kuang
960eb7529e
Add C++ runtime for MeloTTS ( #1138 )
2024-07-16 15:55:02 +08:00
Fangjun Kuang
a25075101c
Build sherpa-onnx as a single shared library ( #1078 )
...
When `-D BUILD_SHARED_LIBS=ON` is passed to `cmake`, it builds a single shared library.
Specifically,
- For C APIs, it builds `libsherpa-onnx-c-api.so`
- For Python APIs, it builds `_sherpa_onnx.cpython-xx-xx.so`
- For Kotlin and Java APIs, it builds `libsherpa-onnx-jni.so`
There is no `libsherpa-onnx-core.so` any longer.
Note it affects only shared libraries.
2024-07-06 16:41:54 +08:00
Manix
55decb7bee
Add config for TensorRT and CUDA execution provider ( #992 )
...
Signed-off-by: manickavela1998@gmail.com <manickavela1998@gmail.com>
Signed-off-by: manickavela1998@gmail.com <manickavela.arumugam@uniphore.com>
2024-07-05 15:18:37 +08:00
Fangjun Kuang
598c12c4e5
Fix CI tests ( #1061 )
2024-06-27 18:05:18 +08:00
Fangjun Kuang
a11c859971
Support clang-tidy ( #1034 )
2024-06-19 20:51:57 +08:00
Fangjun Kuang
6789c909d2
Inverse text normalization API of streaming ASR for various programming languages ( #1022 )
2024-06-18 13:42:17 +08:00
Fangjun Kuang
fd5a0d1e00
Add C++ runtime for Tele-AI/TeleSpeech-ASR ( #970 )
2024-06-05 00:26:40 +08:00
Sangeet Sagar
3f472a9993
Add C++ runtime for *streaming* faster conformer transducer from NeMo. ( #889 )
...
Co-authored-by: sangeet2020 <15uec053@gmail.com>
2024-05-30 13:55:03 +08:00
Wei Kang
b012b78ceb
Encode hotwords in C++ side ( #828 )
...
* Encode hotwords in C++ side
2024-05-20 19:41:36 +08:00
Fangjun Kuang
46e4e5b7ac
Add C++ support for streaming NeMo CTC models. ( #857 )
2024-05-10 16:26:43 +08:00
Fangjun Kuang
17cd3a5f01
Add C++ runtime for non-streaming faster conformer transducer from NeMo. ( #854 )
2024-05-10 12:15:39 +08:00
Fangjun Kuang
6b353bfb42
Add jieba for Chinese TTS models ( #797 )
2024-04-21 14:47:13 +08:00
Fangjun Kuang
c1608b3524
Support CED models ( #792 )
2024-04-19 15:20:37 +08:00
Fangjun Kuang
329fe1aa8b
Support adding punctuations to the speech recognition result ( #761 )
2024-04-13 12:15:57 +08:00
Fangjun Kuang
042976ea6e
Add C++ microphone examples for audio tagging ( #749 )
2024-04-10 21:00:35 +08:00
Fangjun Kuang
f20291cadc
Support audio tagging using zipformer ( #747 )
2024-04-10 14:47:06 +08:00
Fangjun Kuang
6fb8ceda57
Add VAD examples using ALSA for recording ( #739 )
2024-04-08 16:41:01 +08:00
Fangjun Kuang
a5f8fbc83f
Support heteronyms in Chinese TTS ( #738 )
2024-04-08 11:01:30 +08:00
Fangjun Kuang
db67e00c77
Add HLG decoding for streaming CTC models ( #731 )
2024-04-03 21:31:42 +08:00
Fangjun Kuang
4e040c596e
Support including TTS conditionally. ( #699 )
2024-03-26 17:21:35 +08:00
Fangjun Kuang
0d258dd150
Support spoken language identification with whisper ( #694 )
2024-03-24 22:57:00 +08:00
Wei Kang
734bbd91dc
Add Python API for keyword spotting ( #576 )
...
* Add alsa & microphone support for keyword spotting
* Add python wrapper
2024-03-01 09:31:11 +08:00
Fangjun Kuang
87a7030c08
Support using alsa to access the microphone with non-streaming ASR models ( #517 )
2024-02-26 21:17:26 +08:00
Fangjun Kuang
67acd34dcd
Use alsa to read microphone in speaker identification demo. ( #605 )
2024-02-23 19:27:51 +08:00
Fangjun Kuang
099a0ccae3
Link the math lib. ( #592 )
2024-02-21 15:36:54 +08:00
Fangjun Kuang
d771762868
Support WebAssembly for text-to-speech ( #577 )
2024-02-08 23:39:12 +08:00
Fangjun Kuang
0b18ccfbb2
C++ API demo for speaker identification with portaudio. ( #561 )
2024-01-30 11:21:43 +08:00
Wei Kang
b6c020901a
decoder for open vocabulary keyword spotting ( #505 )
...
* various fixes to ContextGraph to support open vocabulary keywords decoder
* Add keyword spotter runtime
* Add binary
* First version works
* Minor fixes
* update text2token
* default values
* Add jni for kws
* add kws android project
* Minor fixes
* Remove unused interface
* Minor fixes
* Add workflow
* handle extra info in texts
* Minor fixes
* Add more comments
* Fix ci
* fix cpp style
* Add input box in android demo so that users can specify their keywords
* Fix cpp style
* Fix comments
* Minor fixes
* Minor fixes
* minor fixes
* Minor fixes
* Minor fixes
* Add CI
* Fix code style
* cpplint
* Fix comments
* Fix error
2024-01-20 22:52:41 +08:00
Fangjun Kuang
2024e96639
Add C++ runtime for speaker verification models from NeMo ( #527 )
2024-01-13 21:42:09 +08:00
Fangjun Kuang
afc81ec122
Add C++ runtime for models from 3d-speaker ( #523 )
2024-01-11 19:10:30 +08:00
Fangjun Kuang
55266918c8
Add runtime support for wespeaker models ( #516 )
2024-01-09 22:06:08 +08:00
Fangjun Kuang
e475e750ac
Support streaming zipformer CTC ( #496 )
...
* Support streaming zipformer CTC
* test online zipformer2 CTC
* Update doc of sherpa-onnx.cc
* Add Python APIs for streaming zipformer2 ctc
* Add Python API examples for streaming zipformer2 ctc
* Swift API for streaming zipformer2 CTC
* NodeJS API for streaming zipformer2 CTC
* Kotlin API for streaming zipformer2 CTC
* Golang API for streaming zipformer2 CTC
* C# API for streaming zipformer2 CTC
* Release v1.9.6
2023-12-22 13:46:33 +08:00
Fangjun Kuang
b18812ceff
Play generated audio using alsa for TTS ( #482 )
2023-12-13 22:28:03 +08:00
Fangjun Kuang
d34161413d
Support Ukrainian VITS models from coqui-ai/TTS ( #469 )
2023-12-06 19:37:11 +08:00
Fangjun Kuang
99ff6a834c
Play generated audio as it is generating. ( #457 )
2023-12-02 15:35:11 +08:00
Fangjun Kuang
62dc3c3e46
Use piper-phonemize to convert text to token IDs ( #453 )
2023-11-30 23:57:43 +08:00
Fangjun Kuang
db41778e99
Support piper-phonemize ( #452 )
2023-11-28 19:12:58 +08:00
Fangjun Kuang
fac4f6bc7c
Support streaming conformer CTC models from wenet ( #427 )
2023-11-16 10:35:23 +08:00
Fangjun Kuang
b83b3e3cd1
Support non-streaming WeNet CTC models. ( #426 )
2023-11-15 14:23:20 +08:00
Fangjun Kuang
68f0e59688
Add a C++ example to show streaming VAD + non-streaming ASR. ( #420 )
2023-11-11 22:54:27 +08:00
Fangjun Kuang
b80b7e5144
Support linking onnxruntime statically for macOS ( #403 )
2023-10-31 20:24:43 +08:00