9728Lin
9edb78e21b
Update c-api.h to hotwords ( #962 )
2024-06-03 16:26:12 +08:00
Fangjun Kuang
a02e43d83c
Wrap offline ASR APIs to dart ( #961 )
2024-06-02 19:11:27 +08:00
Fangjun Kuang
f1cff83ef9
Add address sanitizer and undefined behavior sanitizer ( #951 )
2024-05-31 13:17:01 +08:00
Wei Kang
a38881817c
Support customized scores for hotwords ( #926 )
...
* Support customized scores for hotwords
* Skip blank lines
2024-05-31 12:34:30 +08:00
Fangjun Kuang
a689249f88
Fix building for Android ( #949 )
2024-05-31 10:27:29 +08:00
Fangjun Kuang
082f230dfb
Fix nemo streaming transducer greedy search ( #944 )
2024-05-30 15:31:10 +08:00
Sangeet Sagar
3f472a9993
Add C++ runtime for *streaming* faster conformer transducer from NeMo. ( #889 )
...
Co-authored-by: sangeet2020 <15uec053@gmail.com>
2024-05-30 13:55:03 +08:00
Fangjun Kuang
49d66ec358
Add Dart API for streaming ASR ( #933 )
2024-05-30 12:21:09 +08:00
Leo Huang
d45223034c
Added tokens, tokens_arr and json for offline recognizer result ( #936 )
...
Co-authored-by: leo <webmaster@360converter.com>
2024-05-29 12:53:28 +08:00
FakeEnd
a6c9b7986f
Changed the comment describing the input parameter of the GetKeywordResult API ( #937 )
2024-05-29 12:45:58 +08:00
Fangjun Kuang
50a2eaa41f
Reset encoder states on endpointing for streaming transducer. ( #924 )
2024-05-28 17:06:17 +08:00
Fangjun Kuang
5860e45b4c
Add KWS examples for Java API ( #930 )
2024-05-28 15:49:54 +08:00
Fangjun Kuang
bcaa6df389
Add VAD demo for Java API ( #928 )
2024-05-28 14:59:47 +08:00
hantengc
1371c6b3f0
Provide an API for setting keywords, so that keywords can be adjusted dynamically for recognition ( #923 )
2024-05-27 19:07:26 +08:00
Fangjun Kuang
49ea59d4ff
Add Flutter GUI example for VAD with a microphone. ( #905 )
2024-05-24 23:48:12 +08:00
Dadoou
4fc0a1dc64
Update offline-ctc-greedy-search-decoder.cc ( #917 )
...
Bug fixes.
Z_O_O will be decoded as ZO instead of ZOO.
To fix this, prev_id should update every time.
2024-05-24 22:31:56 +08:00
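The fix described in the commit above can be illustrated with a minimal Python sketch of CTC greedy decoding. This is not the code from offline-ctc-greedy-search-decoder.cc; the function names, the token ids (blank = 0, Z = 1, O = 2), and the shape of the buggy variant are assumptions made only to reproduce the Z_O_O case from the commit message.

```python
def ctc_greedy_decode(frame_ids, blank_id=0):
    """Collapse repeated ids and drop blanks from per-frame best token ids."""
    tokens = []
    prev_id = -1  # assumption: -1 never occurs as a real token id
    for token_id in frame_ids:
        if token_id != blank_id and token_id != prev_id:
            tokens.append(token_id)
        prev_id = token_id  # updated on every frame, blanks included
    return tokens


def ctc_greedy_decode_buggy(frame_ids, blank_id=0):
    """Hypothetical buggy variant: prev_id changes only when a token is emitted."""
    tokens = []
    prev_id = -1
    for token_id in frame_ids:
        if token_id != blank_id and token_id != prev_id:
            tokens.append(token_id)
            prev_id = token_id  # a blank between two equal tokens is forgotten
    return tokens


# Z _ O _ O with the assumed ids blank=0, Z=1, O=2
frames = [1, 0, 2, 0, 2]
print(ctc_greedy_decode(frames))        # [1, 2, 2], i.e. ZOO
print(ctc_greedy_decode_buggy(frames))  # [1, 2], i.e. ZO
```

Because the blank frame resets prev_id in the first version, the second O is treated as a new token rather than a repeat of the first.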
Fangjun Kuang
cf83412d0a
Support reading waves from NAudio. ( #914 )
2024-05-24 11:07:44 +08:00
Fangjun Kuang
2db777587e
Fix CI tests. ( #907 )
2024-05-23 14:49:37 +08:00
Fangjun Kuang
49ee458bfb
Add Dart API for VAD ( #904 )
2024-05-22 21:56:21 +08:00
Fangjun Kuang
81346d1172
Fix reading wave files generated by NAudio. ( #903 )
2024-05-22 19:56:06 +08:00
Fangjun Kuang
4f21aabd3c
Fix CI for JavaScript and Python APIs. ( #901 )
2024-05-22 13:57:00 +08:00
Fangjun Kuang
4593ab49d1
Add Flutter example for speaker identification ( #894 )
2024-05-21 20:38:52 +08:00
Wei Kang
b012b78ceb
Encode hotwords in C++ side ( #828 )
...
* Encode hotwords in C++ side
2024-05-20 19:41:36 +08:00
Fangjun Kuang
8af2af8466
Add tail_paddings to Whisper C API. ( #886 )
2024-05-17 09:20:07 +08:00
Fangjun Kuang
65635b09d8
Fix a typo in jni ( #885 )
2024-05-16 14:31:45 +08:00
Fangjun Kuang
a421f8c1df
Fix Java API examples ( #883 )
2024-05-16 12:16:17 +08:00
linziguan
d2745698c5
Support building JNI on Windows ( #881 )
2024-05-16 06:25:53 +08:00
Fangjun Kuang
03c956a317
Add keyword spotting API for node-addon-api ( #877 )
2024-05-14 20:26:48 +08:00
Fangjun Kuang
031134b4d4
Add TTS for node-addon-api ( #871 )
2024-05-13 19:24:09 +08:00
Manix
740d7ae9d6
fixing bug and compiler error ( #870 )
...
Signed-off-by: manickavela1998@gmail.com <manickavela1998@gmail.com >
2024-05-13 17:44:03 +08:00
Fangjun Kuang
384f96c40f
Add streaming CTC ASR APIs for node-addon-api ( #867 )
2024-05-13 11:58:25 +08:00
Fangjun Kuang
db85b2c1d8
Add Android APKs for NeMo CTC models. ( #866 )
2024-05-12 14:58:36 +08:00
Fangjun Kuang
7322f4e0a3
Fix node addon tests ( #865 )
...
* Install naudiodon2 manually.
It is needed only when using a microphone. The CI tests don't need it.
2024-05-12 12:03:43 +08:00
Fangjun Kuang
46e4e5b7ac
Add C++ support for streaming NeMo CTC models. ( #857 )
2024-05-10 16:26:43 +08:00
yh646492956
1eb60e8711
Solve the issue of missing the last sentence with punctuation ( #856 )
...
Co-authored-by: Hao You <13182720519@sina.cn>
2024-05-10 15:41:42 +08:00
Fangjun Kuang
17cd3a5f01
Add C++ runtime for non-streaming faster conformer transducer from NeMo. ( #854 )
2024-05-10 12:15:39 +08:00
Fangjun Kuang
5d8c35e44e
Add C++ support for non-streaming NeMo fast conformer hybrid transducer ctc (the ctc branch) ( #848 )
2024-05-09 15:32:22 +08:00
Fangjun Kuang
fcd6024200
Fix typos in JNI TTS ( #824 )
2024-05-01 14:14:24 +08:00
Fangjun Kuang
cff207623e
Add Java API for speaker identification ( #822 )
2024-04-29 21:23:56 +08:00
Fangjun Kuang
88202f05bb
Add Java API for audio tagging ( #820 )
2024-04-28 22:26:04 +08:00
Fangjun Kuang
5407f880c0
Add Java and Kotlin API for punctuation models ( #818 )
2024-04-26 22:06:48 +08:00
Fangjun Kuang
db25986240
Add Java API for spoken language identification with whisper multilingual models ( #817 )
2024-04-26 19:05:39 +08:00
Fangjun Kuang
f2d074aea9
Fix a bug for offline paraformer ( #816 )
2024-04-26 16:40:42 +08:00
Fangjun Kuang
612002da57
Fix C# to support Chinese tts models using jieba ( #815 )
2024-04-26 11:50:07 +08:00
Karel Vesely
2e45d327a5
Adding temperature scaling on Joiner logits: ( #789 )
...
* Adding temperature scaling on Joiner logits:
- T hard-coded to 2.0
- so far best result NCE 0.122 (still not so high)
- the BPE scores were rescaled with 0.2 (but then also incorrect words
  get high confidence, visually reasonable histograms are for 0.5 scale)
- BPE->WORD score merging done by min(.) function
  (tried also prob-product, and also arithmetic, geometric, harmonic mean)
- without temperature scaling (i.e. scale 1.0), the best NCE was 0.032 (here product merging was best)
Results seem consistent with: https://arxiv.org/abs/2110.15222
Everything tuned on a very-small set of 100 sentences with 813 words and 10.2% WER, a Czech model.
I also experimented with blank posteriors mixed into the BPE confidences,
but no NCE improvement found, so not pushing that.
Temperature scaling added also to the Greedy search confidences.
* making `temperature_scale` configurable from outside
2024-04-26 09:44:26 +08:00
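The two ideas named in the commit above, temperature scaling of logits and merging BPE token scores into a word score with min(.), can be sketched in Python as follows. This is a rough illustration under assumptions, not the repository's implementation: it assumes temperature scaling means dividing the logits by T before the softmax, and the function names and example numbers are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature=2.0):
    """Softmax over logits divided by a temperature (assumed form of the scaling)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def word_confidence(token_probs):
    """Merge the confidences of a word's BPE tokens by taking the minimum."""
    return min(token_probs)


logits = [2.0, 0.0]  # made-up joiner logits for two candidate tokens
sharp = softmax_with_temperature(logits, temperature=1.0)
soft = softmax_with_temperature(logits, temperature=2.0)
print(sharp[0] > soft[0])  # a higher temperature flattens the distribution

print(word_confidence([0.9, 0.4, 0.7]))  # the weakest token decides the word score
```

With T greater than 1 the top probability shrinks, which is the usual motivation for temperature scaling when raw confidences are overconfident. Taking the minimum makes a word only as confident as its least certain token.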
Fangjun Kuang
15772d2150
Add Java API for text-to-speech ( #811 )
2024-04-26 09:26:39 +08:00
Daniel Doña
fa2429920f
Add function 'tolowerUnicode' in sherpa-onnx-microphone ( fix #791 ) ( #812 )
2024-04-26 09:19:32 +08:00
Fangjun Kuang
f7b3735621
Add CTC HLG decoding for JNI ( #810 )
2024-04-25 17:20:02 +08:00
Fangjun Kuang
6686c7d3e6
Add dict_dir arg to c api to support Chinese TTS models using jieba ( #809 )
2024-04-25 12:28:31 +08:00
Fangjun Kuang
83cd533f67
Add Java API for non-streaming ASR ( #807 )
2024-04-24 21:03:26 +08:00