enginex-mr_series-sherpa-onnx

EngineX-Iluvatar/enginex-mr_series-sherpa-onnx

Archived

Author	SHA1	Message	Date
Fangjun Kuang	2c2926af7d	Add C++ runtime for Matcha-TTS (#1627 )	2024-12-31 12:44:14 +08:00
Fangjun Kuang	669f5ef441	Add C++ runtime and Python APIs for Moonshine models (#1473 )	2024-10-26 14:34:07 +08:00
Fangjun Kuang	8535b1d3bb	Python API for speaker diarization. (#1400 )	2024-10-09 14:13:26 +08:00
Fangjun Kuang	b965f14cf0	Add Python API for clustering (#1385 )	2024-09-30 11:33:15 +08:00
Lim Yao Chong	3bffc24d64	Add Python binding for online punctuation models (#1312 )	2024-09-09 10:26:53 +08:00
SilverSulfide	888f74bf3c	Re-implement LM rescore for online transducer (#1231 ) Co-authored-by: Martins Kronis <martins.kuznecovs@tilde.lv>	2024-09-06 10:01:25 +08:00
xsjk	1da75ee3c0	Fix typo in offline-lm-config.cc (#1229 )	2024-08-07 15:38:34 +08:00
Fangjun Kuang	25f0a10468	Add C++ runtime for SenseVoice models (#1148 )	2024-07-18 22:54:18 +08:00
Wei Kang	5b1fa8750f	Fix hotwords OOV log (#1139 )	2024-07-16 19:41:31 +08:00
Manix	55decb7bee	Add config for TensorRT and CUDA execution provider (#992 ) Signed-off-by: manickavela1998@gmail.com <manickavela1998@gmail.com> Signed-off-by: manickavela1998@gmail.com <manickavela.arumugam@uniphore.com>	2024-07-05 15:18:37 +08:00
SilverSulfide	656b9fa1c8	Add Python API support for Offline LM rescoring (#1033 )	2024-06-19 16:29:37 +08:00
Fangjun Kuang	349d957da2	Add inverse text normalization for online ASR (#1020 )	2024-06-17 18:39:23 +08:00
Fangjun Kuang	b0f7ed3ee3	Add inverse text normalization for non-streaming ASR (#1017 )	2024-06-17 14:28:53 +08:00
Fangjun Kuang	fd5a0d1e00	Add C++ runtime for Tele-AI/TeleSpeech-ASR (#970 )	2024-06-05 00:26:40 +08:00
Wei Kang	b012b78ceb	Encode hotwords in C++ side (#828 ) * Encode hotwords in C++ side	2024-05-20 19:41:36 +08:00
Fangjun Kuang	46e4e5b7ac	Add C++ support for streaming NeMo CTC models. (#857 )	2024-05-10 16:26:43 +08:00
Fangjun Kuang	17cd3a5f01	Add C++ runtime for non-streaming faster conformer transducer from NeMo. (#854 )	2024-05-10 12:15:39 +08:00
Karel Vesely	2e45d327a5	Adding temperature scaling on Joiner logits: (#789 ) * Adding temperature scaling on Joiner logits: - T hard-coded to 2.0 - so far best result NCE 0.122 (still not so high) - the BPE scores were rescaled with 0.2 (but then also incorrect words get high confidence; visually reasonable histograms are for 0.5 scale) - BPE->WORD score merging done by min(.) function (also tried prob-product, and arithmetic, geometric, and harmonic means) - without temperature scaling (i.e. scale 1.0), the best NCE was 0.032 (here product merging was best) Results seem consistent with: https://arxiv.org/abs/2110.15222 Everything tuned on a very small set of 100 sentences with 813 words and 10.2% WER, a Czech model. I also experimented with blank posteriors mixed into the BPE confidences, but found no NCE improvement, so not pushing that. Temperature scaling also added to the greedy search confidences. * making `temperature_scale` configurable from outside	2024-04-26 09:44:26 +08:00
Fangjun Kuang	68b8b88b5a	Add Python API for punctuation models. (#762 )	2024-04-13 13:28:17 +08:00
Fangjun Kuang	34d70a259f	Add Python API and Python examples for audio tagging (#753 )	2024-04-11 11:12:48 +08:00
Fangjun Kuang	6fb8ceda57	Add VAD examples using ALSA for recording (#739 )	2024-04-08 16:41:01 +08:00
Fangjun Kuang	db67e00c77	Add HLG decoding for streaming CTC models (#731 )	2024-04-03 21:31:42 +08:00
Fangjun Kuang	0d258dd150	Support spoken language identification with whisper (#694 )	2024-03-24 22:57:00 +08:00
Karel Vesely	eaec4c83c2	Configurable low_freq high_freq, dithering (#664 )	2024-03-22 21:41:44 +08:00
Bhaswati Saha	fda614d0d1	beam search value as parameter in offline_recognizer.py (#673 ) Co-authored-by: bhascns <bhaswati@mihup.com>	2024-03-18 18:43:05 +08:00
Fangjun Kuang	d3287f9494	Add Python ASR examples with alsa (#646 )	2024-03-08 11:34:48 +08:00
Wei Kang	734bbd91dc	Add Python API for keyword spotting (#576 ) * Add alsa & microphone support for keyword spotting * Add python wrapper	2024-03-01 09:31:11 +08:00
Karel Vesely	38c072dcb2	Track token scores (#571 ) * add export of per-token scores (ys, lm, context) - for best path of the modified-beam-search decoding of transducer * refactoring JSON export of OnlineRecognitionResult, extending pybind11 API of OnlineRecognitionResult * export per-token scores also for greedy-search (online-transducer) - export un-scaled lm_probs (modified-beam search, online-transducer) - polishing * fill lm_probs/context_scores only if LM/ContextGraph is present (make Result smaller)	2024-02-29 06:28:45 +08:00
Askars	763a51486e	Add missing start_time to python API (#591 ) Co-authored-by: vsd-vector <askars.salimbajevs@tilde.lv>	2024-02-20 20:47:53 +08:00
Fangjun Kuang	44efff4e47	Fix CI tests for Python and JNI. (#554 )	2024-01-27 13:01:54 +08:00
chiiyeh	e7b18a2139	add blank_penalty for online transducer (#548 )	2024-01-26 12:12:13 +08:00
chiiyeh	466a6855c8	add hotwords docstring to offline_recognizer and online_recognizer (#546 )	2024-01-25 16:54:20 +08:00
chiiyeh	3bb3849ec5	add blank_penalty for offline transducer (#542 )	2024-01-25 15:00:09 +08:00
Wei Kang	b6c020901a	decoder for open vocabulary keyword spotting (#505 ) * various fixes to ContextGraph to support open vocabulary keywords decoder * Add keyword spotter runtime * Add binary * First version works * Minor fixes * update text2token * default values * Add jni for kws * add kws android project * Minor fixes * Remove unused interface * Minor fixes * Add workflow * handle extra info in texts * Minor fixes * Add more comments * Fix ci * fix cpp style * Add input box in android demo so that users can specify their keywords * Fix cpp style * Fix comments * Minor fixes * Minor fixes * minor fixes * Minor fixes * Minor fixes * Add CI * Fix code style * cpplint * Fix comments * Fix error	2024-01-20 22:52:41 +08:00
Fangjun Kuang	55266918c8	Add runtime support for wespeaker models (#516 )	2024-01-09 22:06:08 +08:00
Fangjun Kuang	e475e750ac	Support streaming zipformer CTC (#496 ) * Support streaming zipformer CTC * test online zipformer2 CTC * Update doc of sherpa-onnx.cc * Add Python APIs for streaming zipformer2 ctc * Add Python API examples for streaming zipformer2 ctc * Swift API for streaming zipformer2 CTC * NodeJS API for streaming zipformer2 CTC * Kotlin API for streaming zipformer2 CTC * Golang API for streaming zipformer2 CTC * C# API for streaming zipformer2 CTC * Release v1.9.6	2023-12-22 13:46:33 +08:00
Fangjun Kuang	0e23f82691	Give an informative log for whisper on exceptions. (#473 )	2023-12-08 14:33:59 +08:00
Fangjun Kuang	049fb9f451	Add Python APIs for WeNet CTC models (#428 )	2023-11-16 14:20:41 +08:00
Fangjun Kuang	655e0fa836	add python API and examples for TTS (#364 )	2023-10-14 14:21:53 +08:00
Peng He	4771c9275c	Add lm decode for the Python API. (#353 ) * Add lm decode for the Python API. * fix style. * Fix LogAdd, Shouldn't double lm_log_prob when merge same prefix path * sort the import alphabetically	2023-10-13 11:15:16 +08:00
Fangjun Kuang	407602445d	Add CTC HLG decoding using OpenFst (#349 )	2023-10-08 11:32:39 +08:00
Fangjun Kuang	33a5765169	Print a more user-friendly error message when using --hotwords-file. (#344 )	2023-09-26 11:04:20 +08:00
Fangjun Kuang	c471423125	Add Silero VAD (#313 )	2023-09-17 14:54:38 +08:00
Wei Kang	47184f9db7	Refactor hotwords, support loading hotwords from file (#296 )	2023-09-14 19:33:17 +08:00
Fangjun Kuang	f709c95c5f	Support multilingual whisper models (#274 )	2023-08-16 00:28:52 +08:00
Fangjun Kuang	6038e2aa62	Support streaming paraformer (#263 )	2023-08-14 10:32:14 +08:00
Fangjun Kuang	a4bff28e21	Support TDNN models from the yesno recipe from icefall (#262 )	2023-08-12 19:50:22 +08:00
Fangjun Kuang	b094868fb8	Add non-streaming websocket server for python (#259 )	2023-08-11 15:56:24 +08:00
Fangjun Kuang	79c2ce5dd4	Refactor online recognizer (#250 ) * Refactor online recognizer. Make it easier to support other streaming models. Note that it is a breaking change for the Python API. `sherpa_onnx.OnlineRecognizer()` used before should be replaced by `sherpa_onnx.OnlineRecognizer.from_transducer()`.	2023-08-09 20:27:31 +08:00
Fangjun Kuang	45b9d4ab37	Support whisper models (#238 )	2023-08-07 12:34:18 +08:00
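
The message of commit 2e45d327a5 (#789) describes two ideas: scaling the joiner logits by a temperature before they are turned into confidences, and merging BPE-level scores into word-level scores with min(.). The following is a minimal Python sketch of those two ideas, not the code from the commit. It assumes the conventional form of temperature scaling (logits divided by T before the softmax) and assumes SentencePiece-style tokens in which a leading "▁" marks the start of a word. The function names `softmax_with_temperature` and `merge_bpe_confidences` are hypothetical and are not part of the sherpa-onnx API.

```python
import math


def softmax_with_temperature(logits, temperature=2.0):
    """Softmax over logits divided by a temperature.

    Assumption: conventional temperature scaling, where T > 1 flattens
    the distribution and so lowers over-confident probabilities.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def merge_bpe_confidences(tokens, confidences, word_start="\u2581"):
    """Merge per-token confidences into per-word confidences with min().

    Assumption: a token starting with "\u2581" begins a new word
    (SentencePiece convention); other tokens continue the current word.
    """
    words = []
    for token, conf in zip(tokens, confidences):
        if token.startswith(word_start) or not words:
            # Start a new word and drop the word-start marker from its text.
            words.append([token.lstrip(word_start), conf])
        else:
            # Continue the current word; keep its least confident token.
            words[-1][0] += token
            words[-1][1] = min(words[-1][1], conf)
    return [(text, conf) for text, conf in words]


if __name__ == "__main__":
    logits = [4.0, 1.0, 0.0]  # made-up values for illustration
    print(softmax_with_temperature(logits, temperature=1.0))
    print(softmax_with_temperature(logits, temperature=2.0))
    print(merge_bpe_confidences(["\u2581hel", "lo", "\u2581world"], [0.9, 0.6, 0.8]))
```

Taking the minimum makes a word only as confident as its least certain token, which matches the merging rule the commit message reports as working best once temperature scaling is applied.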

69 Commits