Fangjun Kuang
|
7f3b9ffe5d
|
Refactor TTS Android code to support jieba for Chinese TTS models (#800)
|
2024-04-22 17:21:05 +08:00 |
|
Fangjun Kuang
|
494cb5c733
|
Fix the last character not being recognized for streaming paraformer models. (#799)
|
2024-04-22 15:10:39 +08:00 |
|
Fangjun Kuang
|
6b353bfb42
|
Add jieba for Chinese TTS models (#797)
|
2024-04-21 14:47:13 +08:00 |
|
Fangjun Kuang
|
54bc504065
|
Add Python API example for CED audio tagging. (#793)
|
2024-04-19 18:33:18 +08:00 |
|
Fangjun Kuang
|
c1608b3524
|
Support CED models (#792)
|
2024-04-19 15:20:37 +08:00 |
|
Fangjun Kuang
|
d97a283dbb
|
Add Android demo for spoken language identification using Whisper multilingual models (#783)
|
2024-04-18 14:33:59 +08:00 |
|
Fangjun Kuang
|
3a43049ba1
|
Add JNI support for spoken language identification (#782)
|
2024-04-17 19:27:15 +08:00 |
|
Fangjun Kuang
|
bcd9e48150
|
Add Android demo for audio tagging (#776)
See https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk.html
|
2024-04-16 20:47:16 +08:00 |
|
chiiyeh
|
aa2d695fd2
|
Add score function to speaker identification (#775)
|
2024-04-16 17:29:46 +08:00 |
|
Fangjun Kuang
|
6bf2099781
|
Fix code style issues (#774)
|
2024-04-16 09:46:15 +08:00 |
|
Fangjun Kuang
|
81b7f1d529
|
Fix display for sherpa-onnx-microphone (#773)
|
2024-04-16 09:17:23 +08:00 |
|
Manix
|
fb4aee83ac
|
Adding warm up for Zipformer2 (#766)
Signed-off-by: manickavela1998@gmail.com <manickavela1998@gmail.com>
|
2024-04-16 09:16:55 +08:00 |
|
Fangjun Kuang
|
5981adf454
|
Add Kotlin API for audio tagging (#770)
|
2024-04-15 13:49:35 +08:00 |
|
Fangjun Kuang
|
13730ecbd8
|
Add C API for punctuation (#768)
|
2024-04-14 19:02:34 +08:00 |
|
Fangjun Kuang
|
983df28a83
|
Fix a punctuation bug (#764)
|
2024-04-13 19:08:46 +08:00 |
|
Fangjun Kuang
|
68b8b88b5a
|
Add Python API for punctuation models. (#762)
|
2024-04-13 13:28:17 +08:00 |
|
Fangjun Kuang
|
329fe1aa8b
|
Support adding punctuations to the speech recognition result (#761)
|
2024-04-13 12:15:57 +08:00 |
|
Manix
|
399d920b47
|
[feature] Configurable padding length in online websocket server (#755)
Signed-off-by: manickavela29 <manickavela1998@gmail.com>
|
2024-04-11 14:57:11 +08:00 |
|
Fangjun Kuang
|
f204e62b44
|
Add C API for audio tagging (#754)
|
2024-04-11 14:18:43 +08:00 |
|
Fangjun Kuang
|
34d70a259f
|
Add Python API and Python examples for audio tagging (#753)
|
2024-04-11 11:12:48 +08:00 |
|
AHN Sung Hwan
|
904a3cc8a9
|
Fix a bug in mean calculation of 'ys_probs' (#748)
|
2024-04-11 10:34:44 +08:00 |
|
Fangjun Kuang
|
042976ea6e
|
Add C++ microphone examples for audio tagging (#749)
|
2024-04-10 21:00:35 +08:00 |
|
Fangjun Kuang
|
f20291cadc
|
Support audio tagging using zipformer (#747)
|
2024-04-10 14:47:06 +08:00 |
|
Fangjun Kuang
|
0d90b34e4a
|
Support Chinese heteronyms on Android for TTS. (#742)
|
2024-04-08 21:36:47 +08:00 |
|
Fangjun Kuang
|
6b3d2b87f9
|
Fix releasing GIL (#741)
|
2024-04-08 17:22:48 +08:00 |
|
Fangjun Kuang
|
6fb8ceda57
|
Add VAD examples using ALSA for recording (#739)
|
2024-04-08 16:41:01 +08:00 |
|
Fangjun Kuang
|
a5f8fbc83f
|
Support heteronyms in Chinese TTS (#738)
|
2024-04-08 11:01:30 +08:00 |
|
Fangjun Kuang
|
c1c0f5bafd
|
return timestamps for WebAssembly (#737)
|
2024-04-05 20:24:27 +08:00 |
|
Fangjun Kuang
|
dbff2eaadb
|
Add C API for streaming HLG decoding (#734)
|
2024-04-05 10:31:20 +08:00 |
|
Fangjun Kuang
|
db67e00c77
|
Add HLG decoding for streaming CTC models (#731)
|
2024-04-03 21:31:42 +08:00 |
|
Fangjun Kuang
|
2e0bccad36
|
Add C API for speaker embedding extractor. (#711)
|
2024-03-28 18:05:40 +08:00 |
|
Leo Huang
|
638f48f47a
|
Added progress for callback of tts generator (#712)
Co-authored-by: leohwang <leohwang@360converter.com>
|
2024-03-28 17:12:20 +08:00 |
|
longshiming
|
de655e838e
|
delete incorrect logs (#714)
Co-authored-by: longshiming <longshiming@greesoft.com>
|
2024-03-28 10:49:45 +08:00 |
|
Fangjun Kuang
|
a042f44076
|
Add Golang API for spoken language identification. (#709)
|
2024-03-27 19:40:25 +08:00 |
|
Fangjun Kuang
|
69c7880c4d
|
Add Golang API for VAD (#708)
|
2024-03-27 12:09:39 +08:00 |
|
Fangjun Kuang
|
4e040c596e
|
Support including TTS conditionally. (#699)
|
2024-03-26 17:21:35 +08:00 |
|
Fangjun Kuang
|
d364610605
|
Use a single thread when loading models (#703)
|
2024-03-26 13:35:33 +08:00 |
|
Fangjun Kuang
|
ab7cff2513
|
Add C API for spoken language identification. (#695)
|
2024-03-25 15:16:47 +08:00 |
|
Fangjun Kuang
|
0d258dd150
|
Support spoken language identification with whisper (#694)
|
2024-03-24 22:57:00 +08:00 |
|
Fangjun Kuang
|
1952772654
|
Add timestamps and tokens for .Net's online models. (#690)
|
2024-03-23 18:51:56 +08:00 |
|
Karel Vesely
|
eaec4c83c2
|
Configurable low_freq high_freq, dithering (#664)
|
2024-03-22 21:41:44 +08:00 |
|
Fangjun Kuang
|
c8770aec20
|
Add nuget package for Windows x86 (#683)
|
2024-03-21 14:57:01 +08:00 |
|
Fangjun Kuang
|
acf0975153
|
Support whisper language/task in various language bindings. (#679)
|
2024-03-20 16:43:35 +08:00 |
|
Viggo
|
842d04d7ae
|
support whisper language (#678)
|
2024-03-20 10:16:22 +08:00 |
|
Bhaswati Saha
|
fda614d0d1
|
beam search value as parameter in offline_recognizer.py (#673)
Co-authored-by: bhascns <bhaswati@mihup.com>
|
2024-03-18 18:43:05 +08:00 |
|
Lovemefan
|
009ed2cd30
|
add WebAssembly for Kws (#648)
|
2024-03-11 21:02:31 +08:00 |
|
xinhecuican
|
f43139e803
|
c++ api for keyword spotter (#642)
|
2024-03-11 10:23:46 +08:00 |
|
Fangjun Kuang
|
3232dff2cf
|
Support user provided data in tts callback. (#653)
|
2024-03-09 18:15:03 +08:00 |
|
GaryLaurenceauAva
|
ac43c2d7b6
|
Expose 'language' 'task' 'tailPaddings' in OfflineWhisperModelConfig (#643)
Co-authored-by: Gary <gary.laurenceau@gmail.com>
|
2024-03-08 19:52:30 +08:00 |
|
Fangjun Kuang
|
d3287f9494
|
Add Python ASR examples with alsa (#646)
|
2024-03-08 11:34:48 +08:00 |
|