874 Commits

Author SHA1 Message Date
Fangjun Kuang
3d3edabb5f Add Go API for Moonshine models (#1479) 2024-10-27 09:39:09 +08:00
Fangjun Kuang
052b8645ba Add Go API examples for adding punctuations to text. (#1478) 2024-10-27 09:04:05 +08:00
Fangjun Kuang
4a4659aa4f Add Swift API for Moonshine models. (#1477) 2024-10-27 08:19:01 +08:00
Fangjun Kuang
2ca2985d04 Add C and C++ API for Moonshine models (#1476) 2024-10-26 23:24:46 +08:00
Fangjun Kuang
bd4b223920 Add Kotlin and Java API for Moonshine models (#1474) 2024-10-26 22:30:29 +08:00
Fangjun Kuang
669f5ef441 Add C++ runtime and Python APIs for Moonshine models (#1473) 2024-10-26 14:34:07 +08:00
Fangjun Kuang
0f2732e4e8 Publish pre-built JNI libs for Linux aarch64 (#1472) 2024-10-26 09:59:18 +08:00
Fangjun Kuang
b06b460851 Begin to support https://github.com/usefulsensors/moonshine (#1470) 2024-10-26 09:51:16 +08:00
Fangjun Kuang
3d6344ead3 Fix building node-addon for Windows x86. (#1469) 2024-10-25 18:49:33 +08:00
Fangjun Kuang
d5a2f52413 Release v1.10.29 (#1468) 2024-10-25 15:50:42 +08:00
Fangjun Kuang
707cf792c5 Add GigaAM NeMo transducer model for Russian ASR (#1467) 2024-10-25 15:20:13 +08:00
Fangjun Kuang
b41f6d2c94 Support GigaAM CTC models for Russian ASR (#1464)
See also https://github.com/salute-developers/GigaAM
2024-10-25 10:55:16 +08:00
Peakyxh
2b40079faf Add speaker identification with VAD and non-streaming ASR using ALSA (#1463) 2024-10-24 22:04:51 +08:00
Fangjun Kuang
a5295aad10 Handle NaN embeddings in speaker diarization. (#1461)
See also https://github.com/thewh1teagle/sherpa-rs/issues/33
2024-10-24 14:03:09 +08:00
Fangjun Kuang
b3e05f6dc4 Fix style issues (#1458) 2024-10-24 11:15:08 +08:00
Fangjun Kuang
ceb69ebd94 Add C++ API for non-streaming ASR (#1456) 2024-10-23 16:40:12 +08:00
Fangjun Kuang
effd5ef2be Add C++ API for streaming ASR. (#1455)
It is a wrapper around the C API.
2024-10-23 12:07:43 +08:00
JameWade
3edd8d7cf6 add java android demo (#1454) 2024-10-23 11:38:26 +08:00
YeyuchenBa
bcaa91ed36 update java for hotword jar (#1444)
Co-authored-by: root <1552138571@qq.com>
2024-10-18 18:07:51 +08:00
Fangjun Kuang
1af8ad89e6 Add Java API example for hotwords. (#1442) 2024-10-18 16:35:31 +08:00
Fangjun Kuang
e0586f1876 add more models for speaker diarization (#1440) 2024-10-17 20:03:09 +08:00
Zazzle516
4783c8f590 Fix "log10" compile error by importing the cmath library (#1438) 2024-10-17 14:50:04 +08:00
Fangjun Kuang
620597f501 Support https://huggingface.co/Revai/reverb-diarization-v1 (#1437) 2024-10-17 11:58:14 +08:00
lxiao336
471cbd83c6 updated onnxruntime-linux-aarch64.cmake so that libonnxruntime.so can be found for specific aarch64 cross-compilation environments (#1436)
Co-authored-by: xiao <shawl336@163.com>
2024-10-16 22:42:42 +08:00
Fangjun Kuang
593b96758b Add Go API for offline punctuation models (#1434)
It is contributed by a community user 
from [our QQ group](https://k2-fsa.github.io/sherpa/social-groups.html#qq).
2024-10-16 17:16:47 +08:00
semxum
77dd5f73fc Update README.md (#1431) 2024-10-14 18:25:27 +08:00
Fangjun Kuang
df4150dc5d Upload speaker embedding models to huggingface (#1428)
See also
https://huggingface.co/spaces/k2-fsa/speaker-diarization
2024-10-14 16:20:00 +08:00
Fangjun Kuang
99f320b893 Release v1.10.28 (#1424) 2024-10-13 15:27:38 +08:00
Fangjun Kuang
5a22f74b2b Android demo for speaker diarization (#1423) 2024-10-13 14:02:57 +08:00
Fangjun Kuang
94b26ff07c Android JNI support for speaker diarization (#1421) 2024-10-12 13:03:48 +08:00
Fangjun Kuang
5e273c5be4 Pascal API for speaker diarization (#1420) 2024-10-12 12:28:38 +08:00
Fangjun Kuang
1ed803adc1 Dart API for speaker diarization (#1418) 2024-10-11 21:17:41 +08:00
Fangjun Kuang
1851ff6337 Java API for speaker diarization (#1416) 2024-10-11 16:51:40 +08:00
Fangjun Kuang
2d412b1190 Kotlin API for speaker diarization (#1415) 2024-10-11 14:41:53 +08:00
Fangjun Kuang
eefc172095 JavaScript API with WebAssembly for speaker diarization (#1414)
#1408 uses [node-addon-api](https://github.com/nodejs/node-addon-api) to call C API from JavaScript, whereas this pull request uses WebAssembly to call C API from JavaScript.
2024-10-11 11:40:10 +08:00
Fangjun Kuang
f1b311ee4f Handle audio files less than 10s long for speaker diarization. (#1412)
If the input audio file is less than 10 seconds long, there is only 
one chunk, and there is no need to compute embeddings or 
do clustering.

We can use the segmentation result from the speaker segmentation 
model directly.
2024-10-11 10:27:16 +08:00
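The short-audio shortcut described in #1412 can be sketched roughly as below. This is a minimal Python illustration under stated assumptions, not the sherpa-onnx implementation: `diarize`, `Segment`, and the two callbacks are hypothetical names, and the 10-second chunk length is taken from the commit message.

```python
from typing import Callable, List, Tuple

# (start_sec, end_sec, speaker_index); a hypothetical representation.
Segment = Tuple[float, float, int]

CHUNK_SECONDS = 10.0  # chunk length stated in the commit message


def diarize(
    duration_sec: float,
    segment_single_chunk: Callable[[], List[Segment]],
    embed_and_cluster: Callable[[], List[Segment]],
) -> List[Segment]:
    """Return speaker segments, skipping embeddings and clustering for short audio."""
    if duration_sec < CHUNK_SECONDS:
        # Only one chunk exists, so the segmentation model's speaker labels
        # need no reconciliation across chunks: use them directly.
        return segment_single_chunk()
    # Several chunks: per-chunk labels are reconciled with speaker
    # embeddings and clustering (hypothetical callback).
    return embed_and_cluster()
```

With stub callbacks, an 8-second input would take the first branch and a 25-second input the second; the expensive embedding and clustering path is never invoked for the short file.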
Fangjun Kuang
1d061df355 WebAssembly example for speaker diarization (#1411) 2024-10-10 22:14:45 +08:00
Fangjun Kuang
67349b52f2 JavaScript API (node-addon) for speaker diarization (#1408) 2024-10-10 15:51:31 +08:00
Fangjun Kuang
a45e5dba99 C# API for speaker diarization (#1407) 2024-10-10 14:29:05 +08:00
Fangjun Kuang
bd50e79590 Update readme to include more external projects using sherpa-onnx (#1405) 2024-10-10 10:27:14 +08:00
Fangjun Kuang
1571344509 Swift API for speaker diarization (#1404) 2024-10-09 23:25:39 +08:00
Fangjun Kuang
df681e9807 Go API for speaker diarization (#1403) 2024-10-09 20:10:44 +08:00
Yongzeng Liu
97654122fa docs(nodejs-addon-examples): add guide for pnpm user (#1401) 2024-10-09 18:12:41 +08:00
Fangjun Kuang
d468527f62 C API for speaker diarization (#1402) 2024-10-09 17:10:03 +08:00
Fangjun Kuang
8535b1d3bb Python API for speaker diarization. (#1400) 2024-10-09 14:13:26 +08:00
Fangjun Kuang
59407edcad C++ API for speaker diarization (#1396) 2024-10-09 12:01:20 +08:00
Fangjun Kuang
70165cb42d Speaker diarization example with onnxruntime Python API (#1395) 2024-10-06 16:37:29 +08:00
Askars
5f50cbf65a context_state is not set correctly when previous context is passed after reset (#1393)
Co-authored-by: vsd-vector <askars.salimbajevs@tilde.lv>
2024-10-03 16:42:09 +08:00
Fangjun Kuang
66feecb2b5 support whisper turbo (#1390) 2024-10-02 18:13:34 +08:00
Fangjun Kuang
b965f14cf0 Add Python API for clustering (#1385) 2024-09-30 11:33:15 +08:00