247 Commits

Author SHA1 Message Date
Fangjun Kuang
3d3edabb5f Add Go API for Moonshine models (#1479) 2024-10-27 09:39:09 +08:00
Fangjun Kuang
052b8645ba Add Go API examples for adding punctuations to text. (#1478) 2024-10-27 09:04:05 +08:00
Fangjun Kuang
bd4b223920 Add Kotlin and Java API for Moonshine models (#1474) 2024-10-26 22:30:29 +08:00
Fangjun Kuang
b06b460851 Begin to support https://github.com/usefulsensors/moonshine (#1470) 2024-10-26 09:51:16 +08:00
Fangjun Kuang
3d6344ead3 Fix building node-addon for Windows x86. (#1469) 2024-10-25 18:49:33 +08:00
Fangjun Kuang
707cf792c5 Add GigaAM NeMo transducer model for Russian ASR (#1467) 2024-10-25 15:20:13 +08:00
Fangjun Kuang
b41f6d2c94 Support GigaAM CTC models for Russian ASR (#1464)
See also https://github.com/salute-developers/GigaAM
2024-10-25 10:55:16 +08:00
Fangjun Kuang
a5295aad10 Handle NaN embeddings in speaker diarization. (#1461)
See also https://github.com/thewh1teagle/sherpa-rs/issues/33
2024-10-24 14:03:09 +08:00
Fangjun Kuang
b3e05f6dc4 Fix style issues (#1458) 2024-10-24 11:15:08 +08:00
Fangjun Kuang
ceb69ebd94 Add C++ API for non-streaming ASR (#1456) 2024-10-23 16:40:12 +08:00
Fangjun Kuang
effd5ef2be Add C++ API for streaming ASR. (#1455)
It is a wrapper around the C API.
2024-10-23 12:07:43 +08:00
Fangjun Kuang
e0586f1876 add more models for speaker diarization (#1440) 2024-10-17 20:03:09 +08:00
Fangjun Kuang
620597f501 Support https://huggingface.co/Revai/reverb-diarization-v1 (#1437) 2024-10-17 11:58:14 +08:00
Fangjun Kuang
593b96758b Add Go API for offline punctuation models (#1434)
It is contributed by a community user from [our QQ group](https://k2-fsa.github.io/sherpa/social-groups.html#qq).
2024-10-16 17:16:47 +08:00
Fangjun Kuang
df4150dc5d Upload speaker embedding models to huggingface (#1428)
See also https://huggingface.co/spaces/k2-fsa/speaker-diarization
2024-10-14 16:20:00 +08:00
Fangjun Kuang
5a22f74b2b Android demo for speaker diarization (#1423) 2024-10-13 14:02:57 +08:00
Fangjun Kuang
1ed803adc1 Dart API for speaker diarization (#1418) 2024-10-11 21:17:41 +08:00
Fangjun Kuang
eefc172095 JavaScript API with WebAssembly for speaker diarization (#1414)
#1408 uses [node-addon-api](https://github.com/nodejs/node-addon-api) to call C API from JavaScript, whereas this pull request uses WebAssembly to call C API from JavaScript.
2024-10-11 11:40:10 +08:00
Fangjun Kuang
1d061df355 WebAssembly example for speaker diarization (#1411) 2024-10-10 22:14:45 +08:00
Fangjun Kuang
67349b52f2 JavaScript API (node-addon) for speaker diarization (#1408) 2024-10-10 15:51:31 +08:00
Fangjun Kuang
a45e5dba99 C# API for speaker diarization (#1407) 2024-10-10 14:29:05 +08:00
Fangjun Kuang
df681e9807 Go API for speaker diarization (#1403) 2024-10-09 20:10:44 +08:00
Yongzeng Liu
97654122fa docs(nodejs-addon-examples): add guide for pnpm user (#1401) 2024-10-09 18:12:41 +08:00
Fangjun Kuang
59407edcad C++ API for speaker diarization (#1396) 2024-10-09 12:01:20 +08:00
Fangjun Kuang
70165cb42d Speaker diarization example with onnxruntime Python API (#1395) 2024-10-06 16:37:29 +08:00
Fangjun Kuang
66feecb2b5 support whisper turbo (#1390) 2024-10-02 18:13:34 +08:00
Fangjun Kuang
b965f14cf0 Add Python API for clustering (#1385) 2024-09-30 11:33:15 +08:00
Fangjun Kuang
bc08160820 Export Pyannote speaker segmentation models to onnx (#1382) 2024-09-29 14:23:56 +08:00
Fangjun Kuang
11f0cb7e1c Support Parakeet models from NeMo (#1381) 2024-09-27 17:12:00 +08:00
Fangjun Kuang
12d04ce8ed Fix running MeloTTS models on GPU. (#1379)
We need to use opset 18 to export the model to onnx.
2024-09-26 16:51:43 +08:00
Fangjun Kuang
d8809b520e Fix CI errors introduced by supporting loading keywords from buffers (#1366) 2024-09-20 19:04:21 +08:00
Fangjun Kuang
576a3aa90d Add non-streaming ONNX models for Russian ASR (#1358) 2024-09-18 13:43:49 +08:00
Fangjun Kuang
cddac52780 Support passing utf-8 strings from JavaScript to C++. (#1355)
We first convert UTF-16 strings to a Uint8Array and then pass the array to C++.
2024-09-18 11:03:42 +08:00
lllwan
bf06b268d0 Fix sherpa_onnx.go (#1353) 2024-09-17 13:39:56 +08:00
Fangjun Kuang
e7ffcbd677 Add APIs about max speech duration in VAD for various programming languages (#1349) 2024-09-14 12:30:13 +08:00
Fangjun Kuang
544857b097 Fix building (#1343) 2024-09-13 13:33:52 +08:00
Fangjun Kuang
6b8877f185 Downgrade flutter sdk versions. (#1305) 2024-08-30 11:47:27 +08:00
Fangjun Kuang
c38634dfcf two-pass Android APK for SenseVoice (#1302) 2024-08-29 12:08:49 +08:00
Fangjun Kuang
9064430c3e Fix releasing wasm app for vad+asr (#1300) 2024-08-29 08:47:38 +08:00
Emmanuel Schmidbauer
a8556e31ba add Tokens []string, Timestamps []float32, Lang string, Emotion string, Event string (#1277) 2024-08-27 06:35:59 +08:00
Fangjun Kuang
17c8237ee4 Fix releasing npm package and fix building Android VAD+ASR example (#1288) 2024-08-26 10:18:48 +08:00
Fangjun Kuang
5ed8e31868 Add VAD and keyword spotting for the Node package with WebAssembly (#1286) 2024-08-24 23:05:54 +08:00
Fangjun Kuang
537e163dd0 WebAssembly example for VAD + Non-streaming ASR (#1284) 2024-08-24 13:24:52 +08:00
Fangjun Kuang
1ef8a7a202 Add WebAssembly for VAD (#1281) 2024-08-23 17:08:37 +08:00
Fangjun Kuang
fb09f8fae3 Set batch size to 1 for more streaming ASR models (#1280) 2024-08-23 11:06:55 +08:00
Fangjun Kuang
0e0d04a97a Provide models for mobile-only platforms by fixing batch size to 1 (#1276) 2024-08-22 19:36:24 +08:00
Fangjun Kuang
5a2aa110b8 Text to speech API for Object Pascal. (#1273) 2024-08-20 20:52:16 +08:00
Fangjun Kuang
63713ecbf0 Build generating subtitles APPs for more models (#1265) 2024-08-16 20:11:24 +08:00
Fangjun Kuang
fbe35ba736 Add Lazarus example for generating subtitles using Silero VAD with non-streaming ASR (#1251) 2024-08-15 22:19:45 +08:00
Fangjun Kuang
94e256244d Add blank penalty for various language bindings. (#1234) 2024-08-08 10:43:31 +08:00