Commit Graph

525 Commits

Author SHA1 Message Date
Fangjun Kuang
83cd533f67 Add Java API for non-streaming ASR (#807) 2024-04-24 21:03:26 +08:00
Fangjun Kuang
c3a2e8a67c Refactor Java API (#806) 2024-04-24 18:41:48 +08:00
Fangjun Kuang
c7691650d7 Fix CI tests (#804) 2024-04-24 13:01:06 +08:00
Fangjun Kuang
9b67a476e6 Refactor the JNI interface to make it more modular and maintainable (#802) 2024-04-24 09:48:42 +08:00
布宝
dc5af04830 Resume interrupted wget downloads (#801) 2024-04-22 20:19:08 +08:00
Fangjun Kuang
7f3b9ffe5d Refactor TTS Android code to support jieba for Chinese TTS models (#800) 2024-04-22 17:21:05 +08:00
Fangjun Kuang
494cb5c733 Fix the last character not being recognized for streaming paraformer models. (#799) 2024-04-22 15:10:39 +08:00
Fangjun Kuang
9a68b92ce6 Increase CED's max frame length to 3000 (#798) so that it can process waves for up to 30 seconds. 2024-04-22 10:18:47 +08:00
Fangjun Kuang
6b353bfb42 Add jieba for Chinese TTS models (#797) 2024-04-21 14:47:13 +08:00
Fangjun Kuang
2e0ee0e8c8 fix a typo in building language ID apk (#795) 2024-04-19 20:16:48 +08:00
Fangjun Kuang
37831fe89c Release v1.9.22 (#794) 2024-04-19 18:37:47 +08:00
Fangjun Kuang
54bc504065 Add Python API example for CED audio tagging. (#793) 2024-04-19 18:33:18 +08:00
Fangjun Kuang
c1608b3524 Support CED models (#792) 2024-04-19 15:20:37 +08:00
Fangjun Kuang
d97a283dbb Add Android demo for spoken language identification using Whisper multilingual models (#783) 2024-04-18 14:33:59 +08:00
Fangjun Kuang
3a43049ba1 Add JNI support for spoken language identification (#782) 2024-04-17 19:27:15 +08:00
Fangjun Kuang
69440e481f Add WearOS demo for audio tagging (#777) 2024-04-17 12:22:17 +08:00
Fangjun Kuang
bcd9e48150 Add Android demo for audio tagging (#776). See https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk.html 2024-04-16 20:47:16 +08:00
chiiyeh
aa2d695fd2 Add score function to speaker identification (#775) 2024-04-16 17:29:46 +08:00
Fangjun Kuang
6bf2099781 Fix code style issues (#774) 2024-04-16 09:46:15 +08:00
Fangjun Kuang
81b7f1d529 Fix display for sherpa-onnx-microphone (#773) 2024-04-16 09:17:23 +08:00
Manix
fb4aee83ac Adding warm up for Zipformer2 (#766). Signed-off-by: manickavela1998@gmail.com <manickavela1998@gmail.com> 2024-04-16 09:16:55 +08:00
Fangjun Kuang
5981adf454 Add Kotlin API for audio tagging (#770) 2024-04-15 13:49:35 +08:00
Fangjun Kuang
13730ecbd8 Add C API for punctuation (#768) 2024-04-14 19:02:34 +08:00
gtf35
b0265b258d Replace torchaudio with soundfile in python-api-examples (#765) 2024-04-13 23:39:07 +08:00
Fangjun Kuang
983df28a83 Fix a punctuation bug (#764) 2024-04-13 19:08:46 +08:00
Fangjun Kuang
b6ad0436fa Release v1.9.18 (#763) 2024-04-13 16:34:15 +08:00
Fangjun Kuang
68b8b88b5a Add Python API for punctuation models. (#762) 2024-04-13 13:28:17 +08:00
Fangjun Kuang
329fe1aa8b Support adding punctuations to the speech recognition result (#761) 2024-04-13 12:15:57 +08:00
Fangjun Kuang
0f4705f775 Fix WASM for kws (#758) 2024-04-12 18:57:21 +08:00
Fangjun Kuang
be4a2488a8 Use batch size 1 in generating subtitles. (#756) 2024-04-11 15:58:11 +08:00
Manix
399d920b47 [feature] Configurable padding length in online websocket server (#755). Signed-off-by: manickavela29 <manickavela1998@gmail.com> 2024-04-11 14:57:11 +08:00
Fangjun Kuang
f204e62b44 Add C API for audio tagging (#754) 2024-04-11 14:18:43 +08:00
Fangjun Kuang
34d70a259f Add Python API and Python examples for audio tagging (#753) 2024-04-11 11:12:48 +08:00
AHN Sung Hwan
904a3cc8a9 Fix a bug in mean calculation of 'ys_probs' (#748) 2024-04-11 10:34:44 +08:00
布宝
d21c45d0ea Add --continue to wget (#750). Also, switch to github mirror 2024-04-11 09:07:31 +08:00
Fangjun Kuang
042976ea6e Add C++ microphone examples for audio tagging (#749) 2024-04-10 21:00:35 +08:00
Fangjun Kuang
f20291cadc Support audio tagging using zipformer (#747) 2024-04-10 14:47:06 +08:00
Fangjun Kuang
c9ae7595d5 Fix go API examples with portaudio on Windows. (#746) 2024-04-10 09:56:35 +08:00
Fangjun Kuang
db1b3ab1f3 Fix building OpenFst on Windows. (#744) 2024-04-09 11:17:46 +08:00
Fangjun Kuang
0d90b34e4a Support Chinese heteronyms on Android for TTS. (#742) 2024-04-08 21:36:47 +08:00
Fangjun Kuang
6b3d2b87f9 Fix releasing GIL (#741) 2024-04-08 17:22:48 +08:00
Fangjun Kuang
6fb8ceda57 Add VAD examples using ALSA for recording (#739) 2024-04-08 16:41:01 +08:00
Fangjun Kuang
a5f8fbc83f Support heteronyms in Chinese TTS (#738) 2024-04-08 11:01:30 +08:00
Fangjun Kuang
c1c0f5bafd return timestamps for WebAssembly (#737) 2024-04-05 20:24:27 +08:00
Fangjun Kuang
dbff2eaadb Add C API for streaming HLG decoding (#734) 2024-04-05 10:31:20 +08:00
Fangjun Kuang
db67e00c77 Add HLG decoding for streaming CTC models (#731) 2024-04-03 21:31:42 +08:00
yujinqiu
f8832cb5f2 Add language identification swiftui demo (#729) 2024-04-01 20:34:14 +08:00
yujinqiu
fabd30e3bb Fix microphone privacy config (#727) 2024-04-01 14:59:40 +08:00
Fangjun Kuang
3acf373b07 add more piper models (#725) 2024-04-01 11:39:52 +08:00
Fangjun Kuang
2ededa7e98 Fix building wasm in CI (#720) 2024-03-31 20:50:56 +08:00