enginex-mr_series-sherpa-onnx

EngineX-Iluvatar/enginex-mr_series-sherpa-onnx

Archived

Author	SHA1	Message	Date
Fangjun Kuang	49ea59d4ff	Add Flutter GUI example for VAD with a microphone. (#905 )	2024-05-24 23:48:12 +08:00
Dadoou	4fc0a1dc64	Update offline-ctc-greedy-search-decoder.cc (#917 ) Bug fixes. Z_O_O will be decoded as ZO instead of ZOO. To fix this, prev_id should update every time.	2024-05-24 22:31:56 +08:00
Fangjun Kuang	cf83412d0a	Support reading waves from NAudio. (#914 )	2024-05-24 11:07:44 +08:00
Fangjun Kuang	2db777587e	Fix CI tests. (#907 )	2024-05-23 14:49:37 +08:00
Fangjun Kuang	81346d1172	Fix reading wave files generated by NAudio. (#903 )	2024-05-22 19:56:06 +08:00
Wei Kang	b012b78ceb	Encode hotwords in C++ side (#828 ) * Encode hotwords in C++ side	2024-05-20 19:41:36 +08:00
Manix	740d7ae9d6	fixing bug and compiler error (#870 ) Signed-off-by: manickavela1998@gmail.com <manickavela1998@gmail.com>	2024-05-13 17:44:03 +08:00
Fangjun Kuang	384f96c40f	Add streaming CTC ASR APIs for node-addon-api (#867 )	2024-05-13 11:58:25 +08:00
Fangjun Kuang	db85b2c1d8	Add Android APKs for NeMo CTC models. (#866 )	2024-05-12 14:58:36 +08:00
Fangjun Kuang	7322f4e0a3	Fix node addon tests (#865 ) * Install naudiodon2 manually. It is needed only when using a microphone. The CI tests don't need it.	2024-05-12 12:03:43 +08:00
Fangjun Kuang	46e4e5b7ac	Add C++ support for streaming NeMo CTC models. (#857 )	2024-05-10 16:26:43 +08:00
yh646492956	1eb60e8711	Solve the issue of missing the last sentence with punctuation (#856 ) Co-authored-by: Hao You <13182720519@sina.cn>	2024-05-10 15:41:42 +08:00
Fangjun Kuang	17cd3a5f01	Add C++ runtime for non-streaming faster conformer transducer from NeMo. (#854 )	2024-05-10 12:15:39 +08:00
Fangjun Kuang	5d8c35e44e	Add C++ support for non-streaming NeMo fast conformer hybrid transducer ctc (the ctc branch) (#848 )	2024-05-09 15:32:22 +08:00
Fangjun Kuang	5407f880c0	Add Java and Kotlin API for punctuation models (#818 )	2024-04-26 22:06:48 +08:00
Fangjun Kuang	f2d074aea9	Fix a bug for offline paraformer (#816 )	2024-04-26 16:40:42 +08:00
Fangjun Kuang	612002da57	Fix C# to support Chinese tts models using jieba (#815 )	2024-04-26 11:50:07 +08:00
Karel Vesely	2e45d327a5	Adding temperature scaling on Joiner logits: (#789 ) * Adding temperature scaling on Joiner logits: - T hard-coded to 2.0 - so far best result NCE 0.122 (still not so high) - the BPE scores were rescaled with 0.2 (but then also incorrect words get high confidence, visually reasonable histograms are for 0.5 scale) - BPE->WORD score merging done by min(.) function (tried also prob-product, and also arithmetic, geometric, harmonic mean) - without temperature scaling (i.e. scale 1.0), the best NCE was 0.032 (here product merging was best) Results seem consistent with: https://arxiv.org/abs/2110.15222 Everything tuned on a very-small set of 100 sentences with 813 words and 10.2% WER, a Czech model. I also experimented with blank posteriors mixed into the BPE confidences, but no NCE improvement found, so not pushing that. Temperature scaling added also to the Greedy search confidences. * making `temperature_scale` configurable from outside	2024-04-26 09:44:26 +08:00
Daniel Doña	fa2429920f	Add function 'tolowerUnicode' in sherpa-onnx-microphone (fix #791 ) (#812 )	2024-04-26 09:19:32 +08:00
Fangjun Kuang	c3a2e8a67c	Refactor Java API (#806 )	2024-04-24 18:41:48 +08:00
Fangjun Kuang	9b67a476e6	Refactor the JNI interface to make it more modular and maintainable (#802 )	2024-04-24 09:48:42 +08:00
Fangjun Kuang	7f3b9ffe5d	Refactor TTS Android code to support jieba for Chinese TTS models (#800 )	2024-04-22 17:21:05 +08:00
Fangjun Kuang	494cb5c733	Fix the last character not being recognized for streaming paraformer models. (#799 )	2024-04-22 15:10:39 +08:00
Fangjun Kuang	6b353bfb42	Add jieba for Chinese TTS models (#797 )	2024-04-21 14:47:13 +08:00
Fangjun Kuang	c1608b3524	Support CED models (#792 )	2024-04-19 15:20:37 +08:00
Fangjun Kuang	d97a283dbb	Add Android demo for spoken language identification using Whisper multilingual models (#783 )	2024-04-18 14:33:59 +08:00
chiiyeh	aa2d695fd2	Add score function to speaker identification (#775 )	2024-04-16 17:29:46 +08:00
Fangjun Kuang	6bf2099781	Fix code style issues (#774 )	2024-04-16 09:46:15 +08:00
Fangjun Kuang	81b7f1d529	Fix display for sherpa-onnx-microphone (#773 )	2024-04-16 09:17:23 +08:00
Manix	fb4aee83ac	Adding warm up for Zipformer2 (#766 ) Signed-off-by: manickavela1998@gmail.com <manickavela1998@gmail.com>	2024-04-16 09:16:55 +08:00
Fangjun Kuang	5981adf454	Add Kotlin API for audio tagging (#770 )	2024-04-15 13:49:35 +08:00
Fangjun Kuang	13730ecbd8	Add C API for punctuation (#768 )	2024-04-14 19:02:34 +08:00
Fangjun Kuang	983df28a83	Fix a punctuation bug (#764 )	2024-04-13 19:08:46 +08:00
Fangjun Kuang	329fe1aa8b	Support adding punctuations to the speech recognition result (#761 )	2024-04-13 12:15:57 +08:00
Manix	399d920b47	[feature] Configurable padding length in online websocket server (#755 ) Signed-off-by: manickavela29 <manickavela1998@gmail.com>	2024-04-11 14:57:11 +08:00
AHN Sung Hwan	904a3cc8a9	Fix a bug in mean calculation of 'ys_probs' (#748 )	2024-04-11 10:34:44 +08:00
Fangjun Kuang	042976ea6e	Add C++ microphone examples for audio tagging (#749 )	2024-04-10 21:00:35 +08:00
Fangjun Kuang	f20291cadc	Support audio tagging using zipformer (#747 )	2024-04-10 14:47:06 +08:00
Fangjun Kuang	0d90b34e4a	Support Chinese heteronyms on Android for TTS. (#742 )	2024-04-08 21:36:47 +08:00
Fangjun Kuang	6fb8ceda57	Add VAD examples using ALSA for recording (#739 )	2024-04-08 16:41:01 +08:00
Fangjun Kuang	a5f8fbc83f	Support heteronyms in Chinese TTS (#738 )	2024-04-08 11:01:30 +08:00
Fangjun Kuang	db67e00c77	Add HLG decoding for streaming CTC models (#731 )	2024-04-03 21:31:42 +08:00
Fangjun Kuang	2e0bccad36	Add C API for speaker embedding extractor. (#711 )	2024-03-28 18:05:40 +08:00
Leo Huang	638f48f47a	Added progress for callback of tts generator (#712 ) Co-authored-by: leohwang <leohwang@360converter.com>	2024-03-28 17:12:20 +08:00
Fangjun Kuang	a042f44076	Add Golang API for spoken language identification. (#709 )	2024-03-27 19:40:25 +08:00
Fangjun Kuang	4e040c596e	Support including TTS conditionally. (#699 )	2024-03-26 17:21:35 +08:00
Fangjun Kuang	d364610605	Use a single thread when loading models (#703 )	2024-03-26 13:35:33 +08:00
Fangjun Kuang	0d258dd150	Support spoken language identification with whisper (#694 )	2024-03-24 22:57:00 +08:00
Fangjun Kuang	1952772654	Add timestamps and tokens for .Net's online models. (#690 )	2024-03-23 18:51:56 +08:00
Karel Vesely	eaec4c83c2	Configurable low_freq high_freq, dithering (#664 )	2024-03-22 21:41:44 +08:00
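
The CTC greedy search fix in commit 4fc0a1dc64 (#917) says that `Z_O_O` was decoded as `ZO` instead of `ZOO` and that `prev_id` should update every time. The sketch below is not the code from `offline-ctc-greedy-search-decoder.cc`; it is a minimal Python illustration of how that class of bug can arise, assuming blank id 0 and hypothetical token ids 1 for `Z` and 2 for `O`. The function names are made up for this example.

```python
from typing import List

BLANK_ID = 0  # assumed blank id for this illustration


def ctc_greedy_collapse(frame_ids: List[int], blank_id: int = BLANK_ID) -> List[int]:
    """Collapse per-frame argmax ids: merge repeats, then drop blanks."""
    out: List[int] = []
    prev_id = -1
    for token_id in frame_ids:
        if token_id != blank_id and token_id != prev_id:
            out.append(token_id)
        # Update on every frame, blanks included, so that a blank
        # separates two occurrences of the same token.
        prev_id = token_id
    return out


def ctc_greedy_collapse_stale_prev(frame_ids: List[int], blank_id: int = BLANK_ID) -> List[int]:
    """Variant that only updates prev_id when a token is emitted."""
    out: List[int] = []
    prev_id = -1
    for token_id in frame_ids:
        if token_id != blank_id and token_id != prev_id:
            out.append(token_id)
            prev_id = token_id  # stale across blanks: O _ O collapses to O
    return out


# Hypothetical ids: Z = 1, O = 2, blank = 0, so Z_O_O is [1, 0, 2, 0, 2].
frames = [1, 0, 2, 0, 2]
fixed = ctc_greedy_collapse(frames)
stale = ctc_greedy_collapse_stale_prev(frames)
```

With these inputs, the first function keeps both `O` tokens, while the second drops the repeated `O` because `prev_id` still holds the last emitted token after the blank frame.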


251 Commits
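
Commit 2e45d327a5 (#789) describes dividing the joiner logits by a temperature (T = 2.0 in the first revision, later configurable as `temperature_scale`) before computing confidences, and merging BPE token scores into a word score with `min(.)`. The following is a rough Python sketch of those two ideas, not the project's C++ implementation; the function names and example values are assumptions made for illustration.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float = 2.0) -> List[float]:
    """Softmax over logits / temperature; T > 1 flattens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def word_confidence(token_probs: List[float]) -> float:
    """Merge per-token (BPE) confidences into one word score using min."""
    return min(token_probs)


# Made-up logits for a three-symbol vocabulary.
logits = [4.0, 1.0, 0.0]
plain = softmax_with_temperature(logits, temperature=1.0)
tempered = softmax_with_temperature(logits, temperature=2.0)
```

A temperature above 1 lowers the top probability relative to the unscaled softmax, which is the mechanism the commit relies on to make confidences less overconfident.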