78 Commits

Author SHA1 Message Date
Fangjun Kuang
9d25c90a59 Add JavaScript API (node-addon) for homophone replacer (#2158) 2025-04-28 20:52:42 +08:00
Fangjun Kuang
eee5575836 Add Kotlin and Java API for Dolphin CTC models (#2086) 2025-04-02 21:16:14 +08:00
Fangjun Kuang
3420c89883 Export silero_vad v4 to RKNN (#2067) 2025-03-30 12:00:52 +08:00
cjsdurj
b87fce9a7f c-api add wave write to buffer. (#1962)
Co-authored-by: jian.chen03 <jian.chen03@transwarp.io>
2025-03-10 17:21:23 +08:00
ivan provalov
94728bfbee Fixing Whisper Model Token Normalization (#1904) 2025-02-21 12:58:01 +08:00
Fangjun Kuang
316424b382 Add C++ and Python API for FireRedASR AED models (#1867) 2025-02-16 22:45:24 +08:00
Fangjun Kuang
c84a833863 Add C++ and Python API for Kokoro 1.0 multilingual TTS model (#1795) 2025-02-06 22:57:13 +08:00
Fangjun Kuang
08cefe8488 Export Kokoro 1.0 to sherpa-onnx (#1788) 2025-02-05 08:24:43 +08:00
Fangjun Kuang
af671e2b63 Add C API for Kokoro TTS models (#1717) 2025-01-16 15:07:26 +08:00
Fangjun Kuang
a00d3b4821 Add Java API for Matcha-TTS models. (#1673) 2025-01-02 15:15:30 +08:00
Fangjun Kuang
3422b9388d Add Kotlin API for Matcha-TTS models. (#1668) 2024-12-31 19:20:52 +08:00
Fangjun Kuang
314545f938 Add speaker identification APIs for HarmonyOS (#1607)
* Add speaker embedding extractor API for HarmonyOS

* Add ArkTS API for speaker identification
2024-12-09 19:23:18 +08:00
Fangjun Kuang
bd4b223920 Add Kotlin and Java API for Moonshine models (#1474) 2024-10-26 22:30:29 +08:00
Fangjun Kuang
d468527f62 C API for speaker diarization (#1402) 2024-10-09 17:10:03 +08:00
Fangjun Kuang
70165cb42d Speaker diarization example with onnxruntime Python API (#1395) 2024-10-06 16:37:29 +08:00
Lim Yao Chong
3bffc24d64 Add Python binding for online punctuation models (#1312) 2024-09-09 10:26:53 +08:00
Fangjun Kuang
6b8877f185 Downgrade flutter sdk versions. (#1305) 2024-08-30 11:47:27 +08:00
Fangjun Kuang
65f1c0fab2 Add Pascal API for reading wave files (#1243) 2024-08-11 22:43:42 +08:00
Fangjun Kuang
94e256244d Add blank penalty for various language bindings. (#1234) 2024-08-08 10:43:31 +08:00
Fangjun Kuang
994c3e7c96 Add VAD + Non-streaming ASR example for JavaScript API. (#1170) 2024-07-26 12:42:08 +08:00
Fangjun Kuang
25f0a10468 Add C++ runtime for SenseVoice models (#1148) 2024-07-18 22:54:18 +08:00
Fangjun Kuang
dd0ff2ca06 Support onnxruntime 1.18.0 (#906) 2024-07-10 17:05:26 +08:00
Fangjun Kuang
1fe12c5107 Support the platform iOS for Flutter (#1079) 2024-07-06 19:43:37 +08:00
Fangjun Kuang
f5e9a162d1 Publish flutter packages for Android (#1074) 2024-07-04 20:07:07 +08:00
Fangjun Kuang
6e09933d99 Inverse text normalization API for other programming languages (#1019) 2024-06-17 17:02:39 +08:00
Fangjun Kuang
fd5a0d1e00 Add C++ runtime for Tele-AI/TeleSpeech-ASR (#970) 2024-06-05 00:26:40 +08:00
Fangjun Kuang
031134b4d4 Add TTS for node-addon-api (#871) 2024-05-13 19:24:09 +08:00
Fangjun Kuang
17cd3a5f01 Add C++ runtime for non-streaming faster conformer transducer from NeMo. (#854) 2024-05-10 12:15:39 +08:00
Fangjun Kuang
2f9553d838 Begin to add node-addon-api for sherpa-onnx (#826) 2024-05-03 14:47:40 +08:00
Fangjun Kuang
88202f05bb Add Java API for audio tagging (#820) 2024-04-28 22:26:04 +08:00
Fangjun Kuang
f2d074aea9 Fix a bug for offline paraformer (#816) 2024-04-26 16:40:42 +08:00
Fangjun Kuang
9b67a476e6 Refactor the JNI interface to make it more modular and maintainable (#802) 2024-04-24 09:48:42 +08:00
Fangjun Kuang
7f3b9ffe5d Refactor TTS Android code to support jieba for Chinese TTS models (#800) 2024-04-22 17:21:05 +08:00
Fangjun Kuang
3a43049ba1 Add JNI support for spoken language identification (#782) 2024-04-17 19:27:15 +08:00
Fangjun Kuang
68b8b88b5a Add Python API for punctuation models. (#762) 2024-04-13 13:28:17 +08:00
Fangjun Kuang
a5f8fbc83f Support heteronyms in Chinese TTS (#738) 2024-04-08 11:01:30 +08:00
Fangjun Kuang
6da4a1c12f Add Go API for speaker identification (#718) 2024-03-29 19:25:55 +08:00
Fangjun Kuang
2e0bccad36 Add C API for speaker embedding extractor. (#711) 2024-03-28 18:05:40 +08:00
Fangjun Kuang
69c7880c4d Add Golang API for VAD (#708) 2024-03-27 12:09:39 +08:00
Fangjun Kuang
ab7cff2513 Add C API for spoken language identification. (#695) 2024-03-25 15:16:47 +08:00
Karel Vesely
38c072dcb2 Track token scores (#571)
* add export of per-token scores (ys, lm, context)

- for best path of the modified-beam-search decoding of transducer

* refactoring JSON export of OnlineRecognitionResult, extending pybind11 API of OnlineRecognitionResult

* export per-token scores also for greedy-search (online-transducer)

- export un-scaled lm_probs (modified-beam search, online-transducer)
- polishing

* fill lm_probs/context_scores only if LM/ContextGraph is present (make Result smaller)
2024-02-29 06:28:45 +08:00
Fangjun Kuang
16ba7e274a Add WebAssembly for ASR (#604) 2024-02-23 17:39:11 +08:00
Fangjun Kuang
73afa0248b Support playing generated audio as it is generating for MFC. (#462)
* Support playing generated audio as it is generating for MFC.

* support espeak-ng-data
2023-12-04 14:23:38 +08:00
Fangjun Kuang
62dc3c3e46 Use piper-phonemize to convert text to token IDs (#453) 2023-11-30 23:57:43 +08:00
Fangjun Kuang
94ef6929bb Text-to-speech for iOS (#443) 2023-11-23 21:38:32 +08:00
Fangjun Kuang
fe977b8e8e support nodejs (#438) 2023-11-21 23:20:08 +08:00
Fangjun Kuang
ea7c45b60c Add C API for offline TTS. (#373) 2023-10-19 17:38:23 +08:00
Fangjun Kuang
7649bd862c Fix building APKs (#337) 2023-09-24 14:16:14 +08:00
Fangjun Kuang
debab7c091 Add two-pass speech recognition Android/iOS demo (#304) 2023-09-12 15:40:16 +08:00
Fangjun Kuang
eb5ae18015 Fix C# API to support streaming Paraformer (#266) 2023-08-14 15:27:54 +08:00