Fangjun Kuang
028b8f2718
Add C++ example for streaming ASR with SenseVoice. ( #2199 )
2025-05-11 00:23:32 +08:00
Fangjun Kuang
53518efd2f
Add real-time speech recognition example for SenseVoice. ( #2197 )
2025-05-10 00:50:40 +08:00
Fangjun Kuang
4a833a7547
Fix displaying streaming speech recognition results for Python. ( #2196 )
2025-05-09 21:48:49 +08:00
Fangjun Kuang
51f8824219
Add homophone replacer example for Python API. ( #2161 )
2025-04-29 15:59:34 +08:00
Fangjun Kuang
f64c58342b
Support replacing homophonic phrases ( #2153 )
2025-04-27 15:31:11 +08:00
Fangjun Kuang
95ba6b4039
Generate subtitles with FireRedAsr models ( #2112 )
2025-04-10 10:35:24 +08:00
Fangjun Kuang
0de7e1b9f0
Add C++ and Python API for Dolphin CTC models ( #2085 )
2025-04-02 19:09:00 +08:00
Fangjun Kuang
0aacf02dd8
Add C++ runtime for vocos ( #2014 )
2025-03-17 17:05:15 +08:00
Fangjun Kuang
5d2d792b1d
Add Python API for speech enhancement GTCRN models ( #1978 )
2025-03-10 19:02:17 +08:00
luffy
4e83b3473b
speaker-identification-with-vad-non-streaming-asr.py lacks support for sense_voice. ( #1884 )
2025-02-18 12:34:47 +08:00
Fangjun Kuang
316424b382
Add C++ and Python API for FireRedASR AED models ( #1867 )
2025-02-16 22:45:24 +08:00
JV_X
ce7c03b086
Modify the model used ( #1855 )
non_streaming_server.py cannot use streaming models
2025-02-13 15:08:04 +08:00
Fangjun Kuang
c84a833863
Add C++ and Python API for Kokoro 1.0 multilingual TTS model ( #1795 )
2025-02-06 22:57:13 +08:00
Fangjun Kuang
8b989a851c
Fix keyword spotting. ( #1689 )
Reset the stream right after detecting a keyword
2025-01-20 16:41:10 +08:00
Fangjun Kuang
ffc6b480a0
Add C++ and Python API for Kokoro TTS models. ( #1715 )
2025-01-16 14:24:51 +08:00
Fangjun Kuang
a4365dad82
Avoid adding tail padding for VAD in generate-subtitles.py ( #1674 )
2025-01-03 10:37:39 +08:00
Fangjun Kuang
f457baea42
Support Matcha-TTS models using espeak-ng ( #1672 )
2025-01-02 13:46:43 +08:00
Fangjun Kuang
2c2926af7d
Add C++ runtime for Matcha-TTS ( #1627 )
2024-12-31 12:44:14 +08:00
Fangjun Kuang
268d562135
Add TeleSpeech CTC to non_streaming_server.py ( #1649 )
2024-12-26 11:11:03 +08:00
goddamnVincent
47a2dd4cf8
'update20241203' ( #1589 )
Add '--modeling-unit' and '--bpe-vocab' options to /sherpa-onnx/python-api-examples/streaming_server.py so that they can be specified.
2024-12-04 09:22:24 +08:00
JiayuXu
0d6bf52844
fix: support both old and new websockets request headers format ( #1588 )
Co-authored-by: xujiayu <xujiayu@kaihong.com>
2024-12-03 17:22:12 +08:00
VEP
4fab3f2e2f
Revert: [ #1521 ] No need to reset sample-buffer ( #1524 )
Co-authored-by: VEP <517138883@qq.com>
2024-11-08 21:28:04 +08:00
VEP
f94cca71cf
Fix: Reset sample-buffer after processing ( #1521 )
Co-authored-by: VEP <517138883@qq.com>
2024-11-08 19:04:34 +08:00
彭震东
72dc68c8fa
fix typo ( #1488 )
2024-10-28 21:30:18 +08:00
Fangjun Kuang
669f5ef441
Add C++ runtime and Python APIs for Moonshine models ( #1473 )
2024-10-26 14:34:07 +08:00
Peakyxh
2b40079faf
Add speaker identification with VAD and non-streaming ASR using ALSA ( #1463 )
2024-10-24 22:04:51 +08:00
Fangjun Kuang
8535b1d3bb
Python API for speaker diarization. ( #1400 )
2024-10-09 14:13:26 +08:00
Fangjun Kuang
e7ffcbd677
Add APIs about max speech duration in VAD for various programming languages ( #1349 )
2024-09-14 12:30:13 +08:00
Fangjun Kuang
1423ddb1f0
Support specifying max speech duration for VAD. ( #1348 )
2024-09-14 10:57:46 +08:00
Lim Yao Chong
3bffc24d64
Add Python binding for online punctuation models ( #1312 )
2024-09-09 10:26:53 +08:00
Fangjun Kuang
857cb5075c
Fix typos ( #1330 )
2024-09-09 10:22:42 +08:00
Fangjun Kuang
8a5f5c1999
Fix Python two-pass ASR examples ( #1230 )
2024-08-07 18:35:38 +08:00
Fangjun Kuang
d279c8d20e
Add more Python examples for SenseVoice ( #1179 )
2024-07-28 21:54:38 +08:00
Fangjun Kuang
25f0a10468
Add C++ runtime for SenseVoice models ( #1148 )
2024-07-18 22:54:18 +08:00
Fangjun Kuang
b5093e27f9
Fix publishing apks to huggingface ( #1121 )
Save APKs for each release in a separate directory.
Huggingface requires that no directory contain more than 1000 files.
Since we have so many TTS models, and each model needs APKs built for 4 different ABIs,
we work around this constraint by placing the APKs for different releases into separate directories.
2024-07-13 16:14:00 +08:00
Fangjun Kuang
dd0ff2ca06
Support onnxruntime 1.18.0 ( #906 )
2024-07-10 17:05:26 +08:00
Fangjun Kuang
c2cc9dec58
Add Flush to VAD so that the last segment can be detected. ( #1099 )
2024-07-09 16:15:56 +08:00
Fangjun Kuang
9dd0e03568
Support stopping TTS generation ( #1041 )
2024-06-22 18:18:36 +08:00
彭震东
96ab843173
fix typo ( #1038 )
2024-06-21 11:15:59 +08:00
愚者自愚
167bc76db0
fix generate-subtitles.py bug ( #1029 )
* Fix generate-subtitles.py: if the audio file does not end with more than 1 second of silence, the last segment is lost
2024-06-18 18:29:39 +08:00
Fangjun Kuang
349d957da2
Add inverse text normalization for online ASR ( #1020 )
2024-06-17 18:39:23 +08:00
Fangjun Kuang
b0f7ed3ee3
Add inverse text normalization for non-streaming ASR ( #1017 )
2024-06-17 14:28:53 +08:00
Fangjun Kuang
fc09227cd1
Add Python example to show how to register speakers dynamically for speaker ID. ( #986 )
2024-06-10 21:01:48 +08:00
Fangjun Kuang
fd5a0d1e00
Add C++ runtime for Tele-AI/TeleSpeech-ASR ( #970 )
2024-06-05 00:26:40 +08:00
Fangjun Kuang
b31b9f3a2d
Add a VAD Python example to remove silences from a file. ( #963 )
2024-06-03 16:30:28 +08:00
Fangjun Kuang
b445956675
Fix CI tests. ( #898 )
2024-05-21 20:37:29 +08:00
Wei Kang
b012b78ceb
Encode hotwords in C++ side ( #828 )
* Encode hotwords on the C++ side
2024-05-20 19:41:36 +08:00
Fangjun Kuang
eee5d8a15c
Add node-addon-api for VAD ( #864 )
2024-05-11 20:58:23 +08:00
Fangjun Kuang
a88b3bac21
Fix Python TTS examples for models using jieba. ( #861 )
2024-05-11 09:21:51 +08:00
Fangjun Kuang
46e4e5b7ac
Add C++ support for streaming NeMo CTC models. ( #857 )
2024-05-10 16:26:43 +08:00