Commit Graph

122 Commits

Author SHA1 Message Date
Fangjun Kuang
03c956a317 Add keyword spotting API for node-addon-api (#877) 2024-05-14 20:26:48 +08:00
Fangjun Kuang
75630b986b Support adding punctuations to text for node-addon-api (#876) 2024-05-14 19:28:56 +08:00
Fangjun Kuang
d19f50b799 Add audio tagging APIs for node-addon-api (#875) 2024-05-14 17:32:30 +08:00
Fangjun Kuang
388e6a98fc Add speaker identification APIs for node-addon-api (#874) 2024-05-14 13:28:50 +08:00
Fangjun Kuang
0895b64850 Refactor node-addon-api to remove duplicate. (#873) 2024-05-14 10:08:11 +08:00
Fangjun Kuang
939fdd942c Add spoken language identification for node-addon-api (#872) 2024-05-13 20:26:11 +08:00
Fangjun Kuang
031134b4d4 Add TTS for node-addon-api (#871) 2024-05-13 19:24:09 +08:00
Fangjun Kuang
697b960768 Add non-streaming ASR APIs for node-addon-api (#868) 2024-05-13 16:03:34 +08:00
Fangjun Kuang
384f96c40f Add streaming CTC ASR APIs for node-addon-api (#867) 2024-05-13 11:58:25 +08:00
Fangjun Kuang
db85b2c1d8 Add Android APKs for NeMo CTC models. (#866) 2024-05-12 14:58:36 +08:00
Fangjun Kuang
7322f4e0a3 Fix node addon tests (#865)
* Install naudiodon2 manually.

It is needed only when using a microphone. The CI tests don't need it.
2024-05-12 12:03:43 +08:00
Fangjun Kuang
eee5d8a15c Add node-addon-api for VAD (#864) 2024-05-11 20:58:23 +08:00
Fangjun Kuang
677bc1da3e Add Speaker ID demo for C# (#862) 2024-05-11 13:27:33 +08:00
Fangjun Kuang
65f5161456 Add more streaming ASR methods for node-addon-api (#860) 2024-05-10 18:21:05 +08:00
Fangjun Kuang
46e4e5b7ac Add C++ support for streaming NeMo CTC models. (#857) 2024-05-10 16:26:43 +08:00
Fangjun Kuang
5ed3ec1c04 Export non-streaming NeMo faster conformer hybrid transducer and ctc to sherpa-onnx (#847) 2024-05-09 13:59:47 +08:00
Fangjun Kuang
68b25abf27 Export NeMo FastConformer Hybrid Transducer Large Streaming to ONNX (#844) 2024-05-08 19:07:49 +08:00
Fangjun Kuang
a9f936e92b Export NeMo FastConformer Hybrid Transducer-CTC Large Streaming to ONNX. (#843) 2024-05-08 12:33:46 +08:00
Fangjun Kuang
d2e86b0415 Add links to pre-built APKs and pre-trained models to README. (#840) 2024-05-07 12:28:42 +08:00
Fangjun Kuang
37a4135dd7 Publish npm package with node-addon-api for Windows (#838) 2024-05-06 16:21:29 +08:00
Fangjun Kuang
e1bb928805 Upload two more 3d-speaker models (#837) 2024-05-06 12:23:49 +08:00
chiiyeh
9c8255fdb2 Update 3dspeaker/export-onnx.py (#836)
Update to match the changes in infer_sv.py at 3D-speaker.

Added 2 more supported models and "zh_en" language.
2024-05-06 12:10:35 +08:00
Fangjun Kuang
4f758e6cd3 Publish node-addon-api wrapper for sherpa-onnx as npm packages (#829) 2024-05-04 13:27:39 +08:00
Fangjun Kuang
2f9553d838 Begin to add node-addon-api for sherpa-onnx (#826) 2024-05-03 14:47:40 +08:00
Fangjun Kuang
612002da57 Fix C# to support Chinese tts models using jieba (#815) 2024-04-26 11:50:07 +08:00
Fangjun Kuang
6686c7d3e6 Add dict_dir arg to c api to support Chinese TTS models using jieba (#809) 2024-04-25 12:28:31 +08:00
Fangjun Kuang
9b67a476e6 Refactor the JNI interface to make it more modular and maintainable (#802) 2024-04-24 09:48:42 +08:00
Fangjun Kuang
7f3b9ffe5d Refactor TTS Android code to support jieba for Chinese TTS models (#800) 2024-04-22 17:21:05 +08:00
Fangjun Kuang
9a68b92ce6 Increase CED's max frame length to 3000 (#798)
so that it can process waves for up to 30 seconds.
2024-04-22 10:18:47 +08:00
Fangjun Kuang
2e0ee0e8c8 fix a typo in building language ID apk (#795) 2024-04-19 20:16:48 +08:00
Fangjun Kuang
c1608b3524 Support CED models (#792) 2024-04-19 15:20:37 +08:00
Fangjun Kuang
d97a283dbb Add Android demo for spoken language identification using Whisper multilingual models (#783) 2024-04-18 14:33:59 +08:00
Fangjun Kuang
69440e481f Add WearOS demo for audio tagging (#777) 2024-04-17 12:22:17 +08:00
Fangjun Kuang
bcd9e48150 Add Android demo for audio tagging (#776)
See https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk.html
2024-04-16 20:47:16 +08:00
Fangjun Kuang
0d90b34e4a Support Chinese heteronyms on Android for TTS. (#742) 2024-04-08 21:36:47 +08:00
Fangjun Kuang
a5f8fbc83f Support heteronyms in Chinese TTS (#738) 2024-04-08 11:01:30 +08:00
Fangjun Kuang
dbff2eaadb Add C API for streaming HLG decoding (#734) 2024-04-05 10:31:20 +08:00
Fangjun Kuang
3acf373b07 add more piper models (#725) 2024-04-01 11:39:52 +08:00
Fangjun Kuang
6da4a1c12f Add Go API for speaker identification (#718) 2024-03-29 19:25:55 +08:00
Fangjun Kuang
a042f44076 Add Golang API for spoken language identification. (#709) 2024-03-27 19:40:25 +08:00
Fangjun Kuang
12efbf7397 Sign released TTS APKs (#710) 2024-03-27 19:34:37 +08:00
Fangjun Kuang
69c7880c4d Add Golang API for VAD (#708) 2024-03-27 12:09:39 +08:00
Fangjun Kuang
bd66f7a7d0 Build Android TTS APKs for coqui-ai/TTS models (#704) 2024-03-26 14:05:26 +08:00
Fangjun Kuang
305c373107 Add C# API for spoken language identification (#697) 2024-03-25 18:45:09 +08:00
Fangjun Kuang
1952772654 Add timestamps and tokens for .Net's online models. (#690) 2024-03-23 18:51:56 +08:00
Fangjun Kuang
24f437a6f1 Refactor github actions tests (#688) 2024-03-22 21:22:42 +08:00
Fangjun Kuang
c8770aec20 Add nuget package for Windows x86 (#683) 2024-03-21 14:57:01 +08:00
Fangjun Kuang
acf0975153 Support whisper language/task in various language bindings. (#679) 2024-03-20 16:43:35 +08:00
Fangjun Kuang
6571fc9552 Add tts play example for .Net. (#676)
It plays the generated audio via a speaker as it is generating.
2024-03-19 17:33:15 +08:00
foreversimon
ce60100f68 Add HotwordsFile and HotwordsScore fields to OnlineRecognizerConfig in C# API (#675) 2024-03-19 15:04:08 +08:00