EngineX-Iluvatar/enginex-mr_series-sherpa-onnx

Archived

Author	SHA1	Message	Date
Fangjun Kuang	b2f0249420	Add recording permission for iOS App. (#900 )	2024-05-22 09:39:55 +08:00
Fangjun Kuang	4593ab49d1	Add Flutter example for speaker identification (#894 )	2024-05-21 20:38:52 +08:00
Fangjun Kuang	b445956675	Fix CI tests. (#898 )	2024-05-21 20:37:29 +08:00
Fangjun Kuang	fdcae56a14	Fix Go tests (#897 )	2024-05-21 11:50:13 +08:00
Wei Kang	b012b78ceb	Encode hotwords in C++ side (#828 )	2024-05-20 19:41:36 +08:00
Fangjun Kuang	8af2af8466	Add tail_paddings to Whisper C API. (#886 )	2024-05-17 09:20:07 +08:00
Fangjun Kuang	65635b09d8	Fix a typo in jni (#885 )	2024-05-16 14:31:45 +08:00
Fangjun Kuang	a421f8c1df	Fix Java API examples (#883 )	2024-05-16 12:16:17 +08:00
linziguan	d2745698c5	Support building JNI on Windows (#881 )	2024-05-16 06:25:53 +08:00
Fangjun Kuang	c2dcdabab1	Fix sherpa-onnx-node-version in node examples (#879 )	2024-05-15 14:32:30 +08:00
Fangjun Kuang	03c956a317	Add keyword spotting API for node-addon-api (#877 )	2024-05-14 20:26:48 +08:00
Fangjun Kuang	75630b986b	Support adding punctuation to text for node-addon-api (#876 )	2024-05-14 19:28:56 +08:00
Fangjun Kuang	d19f50b799	Add audio tagging APIs for node-addon-api (#875 )	2024-05-14 17:32:30 +08:00
Fangjun Kuang	388e6a98fc	Add speaker identification APIs for node-addon-api (#874 )	2024-05-14 13:28:50 +08:00
Fangjun Kuang	0895b64850	Refactor node-addon-api to remove duplicate. (#873 )	2024-05-14 10:08:11 +08:00
Fangjun Kuang	939fdd942c	Add spoken language identification for node-addon-api (#872 )	2024-05-13 20:26:11 +08:00
Fangjun Kuang	031134b4d4	Add TTS for node-addon-api (#871 )	2024-05-13 19:24:09 +08:00
Manix	740d7ae9d6	fixing bug and compiler error (#870 ) Signed-off-by: manickavela1998@gmail.com <manickavela1998@gmail.com>	2024-05-13 17:44:03 +08:00
Fangjun Kuang	697b960768	Add non-streaming ASR APIs for node-addon-api (#868 )	2024-05-13 16:03:34 +08:00
Fangjun Kuang	384f96c40f	Add streaming CTC ASR APIs for node-addon-api (#867 )	2024-05-13 11:58:25 +08:00
Fangjun Kuang	db85b2c1d8	Add Android APKs for NeMo CTC models. (#866 )	2024-05-12 14:58:36 +08:00
Fangjun Kuang	7322f4e0a3	Fix node addon tests (#865 ) * Install naudiodon2 manually. It is needed only when using a microphone. The CI tests don't need it.	2024-05-12 12:03:43 +08:00
Fangjun Kuang	eee5d8a15c	Add node-addon-api for VAD (#864 )	2024-05-11 20:58:23 +08:00
Fangjun Kuang	677bc1da3e	Add Speaker ID demo for C# (#862 )	2024-05-11 13:27:33 +08:00
Fangjun Kuang	a88b3bac21	Fix Python TTS examples for models using jieba. (#861 )	2024-05-11 09:21:51 +08:00
Fangjun Kuang	65f5161456	Add more streaming ASR methods for node-addon-api (#860 )	2024-05-10 18:21:05 +08:00
Fangjun Kuang	46e4e5b7ac	Add C++ support for streaming NeMo CTC models. (#857 )	2024-05-10 16:26:43 +08:00
yh646492956	1eb60e8711	Solve the issue of missing the last sentence with punctuation (#856 ) Co-authored-by: Hao You <13182720519@sina.cn>	2024-05-10 15:41:42 +08:00
Fangjun Kuang	17cd3a5f01	Add C++ runtime for non-streaming faster conformer transducer from NeMo. (#854 )	2024-05-10 12:15:39 +08:00
Fangjun Kuang	5d8c35e44e	Add C++ support for non-streaming NeMo fast conformer hybrid transducer ctc (the ctc branch) (#848 )	2024-05-09 15:32:22 +08:00
Fangjun Kuang	5ed3ec1c04	Export non-streaming NeMo faster conformer hybrid transducer and ctc to sherpa-onnx (#847 )	2024-05-09 13:59:47 +08:00
Fangjun Kuang	68b25abf27	Export NeMo FastConformer Hybrid Transducer Large Streaming to ONNX (#844 )	2024-05-08 19:07:49 +08:00
Fangjun Kuang	a9f936e92b	Export NeMo FastConformer Hybrid Transducer-CTC Large Streaming to ONNX. (#843 )	2024-05-08 12:33:46 +08:00
Fangjun Kuang	dbaa26ff4b	Publish node-addon-api npm package for linux arm64 (#841 )	2024-05-07 23:05:40 +08:00
Fangjun Kuang	d2e86b0415	Add links to pre-built APKs and pre-trained models to README. (#840 )	2024-05-07 12:28:42 +08:00
Fangjun Kuang	37a4135dd7	Publish npm package with node-addon-api for Windows (#838 )	2024-05-06 16:21:29 +08:00
Fangjun Kuang	e1bb928805	Upload two more 3d-speaker models (#837 )	2024-05-06 12:23:49 +08:00
chiiyeh	9c8255fdb2	Update 3dspeaker/export-onnx.py (#836 ) Update to match the changes in infer_sv.py at 3D-speaker. Added 2 more supported models and "zh_en" language.	2024-05-06 12:10:35 +08:00
Fangjun Kuang	4f758e6cd3	Publish node-addon-api wrapper for sherpa-onnx as npm packages (#829 )	2024-05-04 13:27:39 +08:00
Fangjun Kuang	2f9553d838	Begin to add node-addon-api for sherpa-onnx (#826 )	2024-05-03 14:47:40 +08:00
Fangjun Kuang	fcd6024200	Fix typos in JNI TTS (#824 )	2024-05-01 14:14:24 +08:00
Fangjun Kuang	cff207623e	Add Java API for speaker identification (#822 )	2024-04-29 21:23:56 +08:00
Fangjun Kuang	88202f05bb	Add Java API for audio tagging (#820 )	2024-04-28 22:26:04 +08:00
Fangjun Kuang	5407f880c0	Add Java and Kotlin API for punctuation models (#818 )	2024-04-26 22:06:48 +08:00
Fangjun Kuang	db25986240	Add Java API for spoken language identification with whisper multilingual models (#817 )	2024-04-26 19:05:39 +08:00
Fangjun Kuang	f2d074aea9	Fix a bug for offline paraformer (#816 )	2024-04-26 16:40:42 +08:00
Fangjun Kuang	612002da57	Fix C# to support Chinese tts models using jieba (#815 )	2024-04-26 11:50:07 +08:00
Fangjun Kuang	c693676d20	Fix building wheels for macOS (#814 )	2024-04-26 10:05:39 +08:00
Karel Vesely	2e45d327a5	Adding temperature scaling on Joiner logits: (#789 ) * Adding temperature scaling on Joiner logits: - T hard-coded to 2.0 - so far best result NCE 0.122 (still not so high) - the BPE scores were rescaled with 0.2 (but then also incorrect words get high confidence, visually reasonable histograms are for 0.5 scale) - BPE->WORD score merging done by min(.) function (tried also prob-product, and also arithmetic, geometric, harmonic mean) - without temperature scaling (i.e. scale 1.0), the best NCE was 0.032 (here product merging was best) Results seem consistent with: https://arxiv.org/abs/2110.15222 Everything tuned on a very small set of 100 sentences with 813 words and 10.2% WER, a Czech model. I also experimented with blank posteriors mixed into the BPE confidences, but no NCE improvement found, so not pushing that. Temperature scaling added also to the Greedy search confidences. * making `temperature_scale` configurable from outside	2024-04-26 09:44:26 +08:00
Fangjun Kuang	15772d2150	Add Java API for text-to-speech (#811 )	2024-04-26 09:26:39 +08:00

578 Commits