enginex-mr_series-sherpa-onnx

EngineX-Iluvatar/enginex-mr_series-sherpa-onnx

Archived

Author	SHA1	Message	Date
Fangjun Kuang	eee5d8a15c	Add node-addon-api for VAD (#864)	2024-05-11 20:58:23 +08:00
Fangjun Kuang	677bc1da3e	Add Speaker ID demo for C# (#862)	2024-05-11 13:27:33 +08:00
Fangjun Kuang	a88b3bac21	Fix Python TTS examples for models using jieba. (#861)	2024-05-11 09:21:51 +08:00
Fangjun Kuang	65f5161456	Add more streaming ASR methods for node-addon-api (#860)	2024-05-10 18:21:05 +08:00
Fangjun Kuang	46e4e5b7ac	Add C++ support for streaming NeMo CTC models. (#857)	2024-05-10 16:26:43 +08:00
yh646492956	1eb60e8711	Solve the issue of missing the last sentence with punctuation (#856) Co-authored-by: Hao You <13182720519@sina.cn>	2024-05-10 15:41:42 +08:00
Fangjun Kuang	17cd3a5f01	Add C++ runtime for non-streaming faster conformer transducer from NeMo. (#854)	2024-05-10 12:15:39 +08:00
Fangjun Kuang	5d8c35e44e	Add C++ support for non-streaming NeMo fast conformer hybrid transducer ctc (the ctc branch) (#848)	2024-05-09 15:32:22 +08:00
Fangjun Kuang	5ed3ec1c04	Export non-streaming NeMo faster conformer hybrid transducer and ctc to sherpa-onnx (#847)	2024-05-09 13:59:47 +08:00
Fangjun Kuang	68b25abf27	Export NeMo FastConformer Hybrid Transducer Large Streaming to ONNX (#844)	2024-05-08 19:07:49 +08:00
Fangjun Kuang	a9f936e92b	Export NeMo FastConformer Hybrid Transducer-CTC Large Streaming to ONNX. (#843)	2024-05-08 12:33:46 +08:00
Fangjun Kuang	dbaa26ff4b	Publish node-addon-api npm package for linux arm64 (#841)	2024-05-07 23:05:40 +08:00
Fangjun Kuang	d2e86b0415	Add links to pre-built APKs and pre-trained models to README. (#840)	2024-05-07 12:28:42 +08:00
Fangjun Kuang	37a4135dd7	Publish npm package with node-addon-api for Windows (#838)	2024-05-06 16:21:29 +08:00
Fangjun Kuang	e1bb928805	Upload two more 3d-speaker models (#837)	2024-05-06 12:23:49 +08:00
chiiyeh	9c8255fdb2	Update 3dspeaker/export-onnx.py (#836) Update to match the changes in infer_sv.py at 3D-speaker. Added 2 more supported models and "zh_en" language.	2024-05-06 12:10:35 +08:00
Fangjun Kuang	4f758e6cd3	Publish node-addon-api wrapper for sherpa-onnx as npm packages (#829)	2024-05-04 13:27:39 +08:00
Fangjun Kuang	2f9553d838	Begin to add node-addon-api for sherpa-onnx (#826)	2024-05-03 14:47:40 +08:00
Fangjun Kuang	fcd6024200	Fix typos in JNI TTS (#824)	2024-05-01 14:14:24 +08:00
Fangjun Kuang	cff207623e	Add Java API for speaker identification (#822)	2024-04-29 21:23:56 +08:00
Fangjun Kuang	88202f05bb	Add Java API for audio tagging (#820)	2024-04-28 22:26:04 +08:00
Fangjun Kuang	5407f880c0	Add Java and Kotlin API for punctuation models (#818)	2024-04-26 22:06:48 +08:00
Fangjun Kuang	db25986240	Add Java API for spoken language identification with whisper multilingual models (#817)	2024-04-26 19:05:39 +08:00
Fangjun Kuang	f2d074aea9	Fix a bug for offline paraformer (#816)	2024-04-26 16:40:42 +08:00
Fangjun Kuang	612002da57	Fix C# to support Chinese tts models using jieba (#815)	2024-04-26 11:50:07 +08:00
Fangjun Kuang	c693676d20	Fix building wheels for macOS (#814)	2024-04-26 10:05:39 +08:00
Karel Vesely	2e45d327a5	Adding temperature scaling on Joiner logits (#789). T is hard-coded to 2.0; the best result so far is NCE 0.122 (still not very high). The BPE scores were rescaled with 0.2 (but then incorrect words also get high confidence; visually reasonable histograms are obtained with a 0.5 scale). BPE->WORD score merging is done with the min(.) function (probability product and arithmetic, geometric, and harmonic means were also tried). Without temperature scaling (i.e. scale 1.0), the best NCE was 0.032 (product merging was best there). Results seem consistent with https://arxiv.org/abs/2110.15222. Everything was tuned on a very small set of 100 sentences with 813 words and 10.2% WER, using a Czech model. Mixing blank posteriors into the BPE confidences was also tried, but no NCE improvement was found, so that is not pushed. Temperature scaling is also added to the greedy search confidences. Also makes `temperature_scale` configurable from outside.	2024-04-26 09:44:26 +08:00
Fangjun Kuang	15772d2150	Add Java API for text-to-speech (#811)	2024-04-26 09:26:39 +08:00
Daniel Doña	fa2429920f	Add function 'tolowerUnicode' in sherpa-onnx-microphone (fix #791) (#812)	2024-04-26 09:19:32 +08:00
Fangjun Kuang	f7b3735621	Add CTC HLG decoding for JNI (#810)	2024-04-25 17:20:02 +08:00
Fangjun Kuang	6686c7d3e6	Add dict_dir arg to c api to support Chinese TTS models using jieba (#809)	2024-04-25 12:28:31 +08:00
Fangjun Kuang	83cd533f67	Add Java API for non-streaming ASR (#807)	2024-04-24 21:03:26 +08:00
Fangjun Kuang	c3a2e8a67c	Refactor Java API (#806)	2024-04-24 18:41:48 +08:00
Fangjun Kuang	c7691650d7	Fix CI tests (#804)	2024-04-24 13:01:06 +08:00
Fangjun Kuang	9b67a476e6	Refactor the JNI interface to make it more modular and maintainable (#802)	2024-04-24 09:48:42 +08:00
布宝	dc5af04830	Resume interrupted wget downloads (#801)	2024-04-22 20:19:08 +08:00
Fangjun Kuang	7f3b9ffe5d	Refactor TTS Android code to support jieba for Chinese TTS models (#800)	2024-04-22 17:21:05 +08:00
Fangjun Kuang	494cb5c733	Fix the last character not being recognized for streaming paraformer models. (#799)	2024-04-22 15:10:39 +08:00
Fangjun Kuang	9a68b92ce6	Increase CED's max frame length to 3000 (#798) so that it can process waves for up to 30 seconds.	2024-04-22 10:18:47 +08:00
Fangjun Kuang	6b353bfb42	Add jieba for Chinese TTS models (#797)	2024-04-21 14:47:13 +08:00
Fangjun Kuang	2e0ee0e8c8	fix a typo in building language ID apk (#795)	2024-04-19 20:16:48 +08:00
Fangjun Kuang	37831fe89c	Release v1.9.22 (#794)	2024-04-19 18:37:47 +08:00
Fangjun Kuang	54bc504065	Add Python API example for CED audio tagging. (#793)	2024-04-19 18:33:18 +08:00
Fangjun Kuang	c1608b3524	Support CED models (#792)	2024-04-19 15:20:37 +08:00
Fangjun Kuang	d97a283dbb	Add Android demo for spoken language identification using Whisper multilingual models (#783)	2024-04-18 14:33:59 +08:00
Fangjun Kuang	3a43049ba1	Add JNI support for spoken language identification (#782)	2024-04-17 19:27:15 +08:00
Fangjun Kuang	69440e481f	Add WearOS demo for audio tagging (#777)	2024-04-17 12:22:17 +08:00
Fangjun Kuang	bcd9e48150	Add Android demo for audio tagging (#776) See https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk.html	2024-04-16 20:47:16 +08:00
chiiyeh	aa2d695fd2	Add score function to speaker identification (#775)	2024-04-16 17:29:46 +08:00
Fangjun Kuang	6bf2099781	Fix code style issues (#774)	2024-04-16 09:46:15 +08:00
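
The message of commit 2e45d327a5 (#789) describes temperature scaling of joiner logits and merging BPE-token confidences into word confidences with min(.). The following is only a rough Python sketch of that idea, not the sherpa-onnx implementation. The function names are hypothetical, and it assumes that the logits are divided by the temperature before the softmax, which the commit message does not spell out.

```python
import math


def softmax_with_temperature(logits, temperature=2.0):
    """Softmax over logits / temperature (assumed form of the scaling).

    A temperature above 1.0 flattens the distribution, which lowers
    the confidence assigned to the top-scoring token.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def word_confidence(token_confidences):
    """Merge per-BPE-token confidences into one word score using min(.)."""
    return min(token_confidences)


# Hypothetical joiner logits for one decoding step.
logits = [2.0, 1.0, 0.1]
plain = softmax_with_temperature(logits, temperature=1.0)
scaled = softmax_with_temperature(logits, temperature=2.0)

# Hypothetical confidences of the BPE tokens that make up one word.
word_score = word_confidence([0.9, 0.4, 0.7])
```

With min(.) merging, a word is only as confident as its least confident BPE token; the commit message notes that product and mean-based merging were also tried.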

556 Commits