Fangjun Kuang
2d0869c709
Fix style issues ( #1718 )
2025-01-16 15:43:51 +08:00
Fangjun Kuang
ffc6b480a0
Add C++ and Python API for Kokoro TTS models. ( #1715 )
2025-01-16 14:24:51 +08:00
Fangjun Kuang
cbe07ac1b6
Release v1.10.39 ( #1702 )
2025-01-13 10:28:05 +08:00
Fangjun Kuang
1fe5fe495f
Add Android demo for MatchaTTS models. ( #1683 )
2025-01-06 06:44:09 +08:00
Fangjun Kuang
bf3330c906
Add HarmonyOS examples for MatchaTTS. ( #1678 )
2025-01-03 17:09:29 +08:00
Fangjun Kuang
9aa4897a9e
Add C API for MatchaTTS models ( #1675 )
2025-01-03 12:17:26 +08:00
Fangjun Kuang
a00d3b4821
Add Java API for Matcha-TTS models. ( #1673 )
2025-01-02 15:15:30 +08:00
Fangjun Kuang
f457baea42
Support Matcha-TTS models using espeak-ng ( #1672 )
2025-01-02 13:46:43 +08:00
Fangjun Kuang
3422b9388d
Add Kotlin API for Matcha-TTS models. ( #1668 )
2024-12-31 19:20:52 +08:00
Fangjun Kuang
ebe92e523d
Remove spaces after punctuation for TTS ( #1666 )
2024-12-31 16:06:27 +08:00
Fangjun Kuang
2c2926af7d
Add C++ runtime for Matcha-TTS ( #1627 )
2024-12-31 12:44:14 +08:00
Fangjun Kuang
b6f0f5fc2e
Support removing invalid utf-8 sequences. ( #1648 )
2024-12-25 19:32:13 +08:00
Fangjun Kuang
d00d1c6298
Fix GitHub actions. ( #1642 )
2024-12-24 11:34:35 +08:00
Fangjun Kuang
b76cd9033a
Support decoding with byte-level BPE (bbpe) models. ( #1633 )
2024-12-20 19:21:32 +08:00
Fangjun Kuang
1bae4085ca
Add speaker diarization API for HarmonyOS. ( #1609 )
2024-12-10 16:03:03 +08:00
Fangjun Kuang
314545f938
Add speaker identification APIs for HarmonyOS ( #1607 )
* Add speaker embedding extractor API for HarmonyOS
* Add ArkTS API for speaker identification
2024-12-09 19:23:18 +08:00
Fangjun Kuang
a743a4400f
Add on-device real-time ASR demo for HarmonyOS ( #1606 )
2024-12-09 16:40:15 +08:00
Fangjun Kuang
74a8735f7a
Add on-device text-to-speech (TTS) demo for HarmonyOS ( #1590 )
2024-12-04 14:27:12 +08:00
Fangjun Kuang
dc3287f3a8
Add HarmonyOS support for text-to-speech. ( #1584 )
2024-12-01 21:43:34 +08:00
Fangjun Kuang
109fb799ca
Fix building for Android ( #1568 )
2024-11-27 10:36:16 +08:00
Fangjun Kuang
2101227269
Add streaming ASR support for HarmonyOS. ( #1565 )
2024-11-26 18:36:56 +08:00
Fangjun Kuang
298b6b6fda
Add non-streaming ASR support for HarmonyOS. ( #1564 )
2024-11-26 16:38:35 +08:00
Fangjun Kuang
31d6206fde
HarmonyOS support for VAD. ( #1561 )
2024-11-24 16:29:24 +08:00
Fangjun Kuang
f97daed408
Fixes #1512 ( #1522 )
2024-11-08 21:07:36 +08:00
Fangjun Kuang
4eeb336f59
Export the English TTS model from MeloTTS ( #1509 )
2024-11-04 07:54:19 +08:00
Fangjun Kuang
6ee8c99c5d
Fix building ( #1508 )
2024-11-03 19:47:04 +08:00
Fangjun Kuang
9ab89c33bc
Support building GPU-capable sherpa-onnx on Linux aarch64. ( #1500 )
Thanks to @Peakyxh for providing pre-built onnxruntime libraries
with CUDA support for Linux aarch64.
Tested on Jetson Nano B01.
2024-11-01 11:16:28 +08:00
Fangjun Kuang
9fa3bc40d7
Fix reading tokens.txt on Windows. ( #1497 )
2024-10-30 12:13:11 +08:00
Fangjun Kuang
669f5ef441
Add C++ runtime and Python APIs for Moonshine models ( #1473 )
2024-10-26 14:34:07 +08:00
Fangjun Kuang
707cf792c5
Add GigaAM NeMo transducer model for Russian ASR ( #1467 )
2024-10-25 15:20:13 +08:00
Fangjun Kuang
b41f6d2c94
Support GigaAM CTC models for Russian ASR ( #1464 )
See also https://github.com/salute-developers/GigaAM
2024-10-25 10:55:16 +08:00
Fangjun Kuang
a5295aad10
Handle NaN embeddings in speaker diarization. ( #1461 )
See also https://github.com/thewh1teagle/sherpa-rs/issues/33
2024-10-24 14:03:09 +08:00
Fangjun Kuang
b3e05f6dc4
Fix style issues ( #1458 )
2024-10-24 11:15:08 +08:00
Fangjun Kuang
ceb69ebd94
Add C++ API for non-streaming ASR ( #1456 )
2024-10-23 16:40:12 +08:00
Zazzle516
4783c8f590
Fix "log10" compile error by including the cmath header ( #1438 )
2024-10-17 14:50:04 +08:00
Fangjun Kuang
94b26ff07c
Android JNI support for speaker diarization ( #1421 )
2024-10-12 13:03:48 +08:00
Fangjun Kuang
1ed803adc1
Dart API for speaker diarization ( #1418 )
2024-10-11 21:17:41 +08:00
Fangjun Kuang
2d412b1190
Kotlin API for speaker diarization ( #1415 )
2024-10-11 14:41:53 +08:00
Fangjun Kuang
f1b311ee4f
Handle audio files less than 10s long for speaker diarization. ( #1412 )
If the input audio file is less than 10 seconds long, there is only
one chunk, and there is no need to compute embeddings or
do clustering.
We can use the segmentation result from the speaker segmentation
model directly.
2024-10-11 10:27:16 +08:00
Fangjun Kuang
1d061df355
WebAssembly example for speaker diarization ( #1411 )
2024-10-10 22:14:45 +08:00
Fangjun Kuang
d468527f62
C API for speaker diarization ( #1402 )
2024-10-09 17:10:03 +08:00
Fangjun Kuang
8535b1d3bb
Python API for speaker diarization. ( #1400 )
2024-10-09 14:13:26 +08:00
Fangjun Kuang
59407edcad
C++ API for speaker diarization ( #1396 )
2024-10-09 12:01:20 +08:00
Fangjun Kuang
70165cb42d
Speaker diarization example with onnxruntime Python API ( #1395 )
2024-10-06 16:37:29 +08:00
Askars
5f50cbf65a
context_state is not set correctly when previous context is passed after reset ( #1393 )
Co-authored-by: vsd-vector <askars.salimbajevs@tilde.lv>
2024-10-03 16:42:09 +08:00
Fangjun Kuang
b965f14cf0
Add Python API for clustering ( #1385 )
2024-09-30 11:33:15 +08:00
Fangjun Kuang
70568c2df7
Support Agglomerative clustering. ( #1384 )
We use the open-source implementation from
https://github.com/cdalitz/hclust-cpp
2024-09-29 23:44:29 +08:00
Fangjun Kuang
11f0cb7e1c
Support Parakeet models from NeMo ( #1381 )
2024-09-27 17:12:00 +08:00
lxiao336
06b61ccad8
Allow more online models to load tokens file from the memory ( #1352 )
Co-authored-by: xiao <shawl336@6163.com>
2024-09-20 16:38:41 +08:00
Fangjun Kuang
1423ddb1f0
Support specifying max speech duration for VAD. ( #1348 )
2024-09-14 10:57:46 +08:00