enginex-mr_series-sherpa-onnx

EngineX-Iluvatar/enginex-mr_series-sherpa-onnx

Archived

Author	SHA1	Message	Date
Fangjun Kuang	f1b311ee4f	Handle audio files less than 10s long for speaker diarization. (#1412 ) If the input audio file is less than 10 seconds long, there is only one chunk, and there is no need to compute embeddings or do clustering. We can use the segmentation result from the speaker segmentation model directly.	2024-10-11 10:27:16 +08:00
Fangjun Kuang	1d061df355	WebAssembly example for speaker diarization (#1411 )	2024-10-10 22:14:45 +08:00
Fangjun Kuang	d468527f62	C API for speaker diarization (#1402 )	2024-10-09 17:10:03 +08:00
Fangjun Kuang	8535b1d3bb	Python API for speaker diarization. (#1400 )	2024-10-09 14:13:26 +08:00
Fangjun Kuang	59407edcad	C++ API for speaker diarization (#1396 )	2024-10-09 12:01:20 +08:00
Fangjun Kuang	70165cb42d	Speaker diarization example with onnxruntime Python API (#1395 )	2024-10-06 16:37:29 +08:00
Askars	5f50cbf65a	context_state is not set correctly when previous context is passed after reset (#1393 ) Co-authored-by: vsd-vector <askars.salimbajevs@tilde.lv>	2024-10-03 16:42:09 +08:00
Fangjun Kuang	b965f14cf0	Add Python API for clustering (#1385 )	2024-09-30 11:33:15 +08:00
Fangjun Kuang	70568c2df7	Support Agglomerative clustering. (#1384 ) We use the open-source implementation from https://github.com/cdalitz/hclust-cpp	2024-09-29 23:44:29 +08:00
Fangjun Kuang	11f0cb7e1c	Support Parakeet models from NeMo (#1381 )	2024-09-27 17:12:00 +08:00
lxiao336	06b61ccad8	Allow more online models to load tokens file from the memory (#1352 ) Co-authored-by: xiao <shawl336@6163.com>	2024-09-20 16:38:41 +08:00
Fangjun Kuang	576a3aa90d	Add non-streaming ONNX models for Russian ASR (#1358 )	2024-09-18 13:43:49 +08:00
Fangjun Kuang	e7ffcbd677	Add APIs about max speech duration in VAD for various programming languages (#1349 )	2024-09-14 12:30:13 +08:00
Fangjun Kuang	1423ddb1f0	Support specifying max speech duration for VAD. (#1348 )	2024-09-14 10:57:46 +08:00
Fangjun Kuang	544857b097	Fix building (#1343 )	2024-09-13 13:33:52 +08:00
lxiao336	65cfa7548a	re-pull-request allow tokens and hotwords to be loaded from buffered string directly (#1339 ) Co-authored-by: xiao <shawl336@163.com>	2024-09-13 09:58:17 +08:00
Fangjun Kuang	6b6e7635ed	Fix computing features for CED audio tagging models. (#1341 ) See also https://github.com/RicherMans/CED/blob/main/onnx_inference_with_kaldi.py	2024-09-12 19:38:18 +08:00
Askars	fa20ae1552	Preserve previous result as context for next segment (#1335 ) Co-authored-by: vsd-vector <askars.salimbajevs@tilde.lv>	2024-09-11 10:44:13 +08:00
Fangjun Kuang	ba7f1a7439	Fix building (#1331 )	2024-09-09 10:29:31 +08:00
Lim Yao Chong	3bffc24d64	Add Python binding for online punctuation models (#1312 )	2024-09-09 10:26:53 +08:00
Fangjun Kuang	363b8e4c1e	Fix vad.Flush(). (#1329 ) Fixes #1314	2024-09-08 17:52:53 +08:00
SilverSulfide	888f74bf3c	Re-implement LM rescore for online transducer (#1231 ) Co-authored-by: Martins Kronis <martins.kuznecovs@tilde.lv>	2024-09-06 10:01:25 +08:00
RGdevz	1f29e4a1a9	throw error instead of exit (#1323 )	2024-09-06 09:59:21 +08:00
Fangjun Kuang	3687c9f60a	Reduce onnxruntime log output. (#1306 ) Change the logging level from WARNING to ERROR.	2024-08-30 12:50:34 +08:00
Fangjun Kuang	ca30d83915	Avoid SherpaOnnxSpeakerEmbeddingManagerFreeBestMatches freeing null. (#1296 ) Fixes #1295	2024-08-28 10:42:36 +08:00
Fangjun Kuang	537e163dd0	WebAssembly example for VAD + Non-streaming ASR (#1284 )	2024-08-24 13:24:52 +08:00
Malcolm Ke Win	c61423ec5a	Update wave-reader.cc (#1278 ) * Update wave-reader.cc missing "#include <cstdint>"	2024-08-22 23:22:45 +08:00
Robin Zhong	d8001d6edc	update kotlin api to better release native objects and add user-friendly apis. (#1275 )	2024-08-22 19:18:11 +08:00
Fangjun Kuang	5a2aa110b8	Text to speech API for Object Pascal. (#1273 )	2024-08-20 20:52:16 +08:00
Fangjun Kuang	e34a1a2aa3	Object pascal examples for recording and playing audio with portaudio. (#1271 ) The recording example can be used for speech recognition while the playing example can be used for text to speech. The portaudio wrapper for object pascal is copied from https://github.com/UltraStar-Deluxe/USDX/blob/master/src/lib/portaudio/portaudio.pas	2024-08-18 19:51:08 +08:00
Fangjun Kuang	f93f0ca94d	Use a separate thread to initialize models for lazarus examples. (#1270 ) So that the main thread is not blocked and the user interface is responsive.	2024-08-18 14:59:48 +08:00
Fangjun Kuang	88809753ab	Release v1.10.22 (#1267 )	2024-08-16 22:40:49 +08:00
Fangjun Kuang	9dcea49dba	Fix looking up OOVs in lexicon.txt for MeloTTS models. (#1266 ) If an English word does not exist in the lexicon, we split it into characters. For instance, if the word TTS does not exist in lexicon.txt, we split it into 3 characters T, T, and S.	2024-08-16 22:10:03 +08:00
Ikko Eltociear Ashimine	a3e98750e9	chore: update online-stream.h (#1264 ) Fix typos.	2024-08-16 15:17:15 +08:00
Fangjun Kuang	fbe35ba736	Add Lazarus example for generating subtitles using Silero VAD with non-streaming ASR (#1251 )	2024-08-15 22:19:45 +08:00
Fangjun Kuang	ca729faebf	Support reading multi-channel wave files with 8/16/32-bit encoded samples (#1258 )	2024-08-15 14:54:43 +08:00
Robin Zhong	62c4d4ab62	Add emotion, event of SenseVoice. (#1257 ) * Add emotion, event of SenseVoice. * Fix tokens size check and update java api. https://github.com/k2-fsa/sherpa-onnx/pull/1257	2024-08-14 15:50:13 +08:00
ivan provalov	9f06b059d7	Update offline-recognizer.cc (#1253 ) Adding setConfig method to JNI to support setting a config on the previously initialized offline-recognizer.	2024-08-13 23:04:51 +08:00
Fangjun Kuang	619279b162	Pascal API for VAD (#1249 )	2024-08-13 16:16:51 +08:00
Fangjun Kuang	a7dc6c2c16	Pascal API for non-streaming ASR (#1247 )	2024-08-12 23:33:35 +08:00
Fangjun Kuang	5791b695ea	Pascal API for streaming ASR (#1246 )	2024-08-12 19:55:51 +08:00
Fangjun Kuang	65f1c0fab2	Add Pascal API for reading wave files (#1243 )	2024-08-11 22:43:42 +08:00
Fangjun Kuang	94e256244d	Add blank penalty for various language bindings. (#1234 )	2024-08-08 10:43:31 +08:00
Parth Khiera	ba4cb6169f	feat: addition of blank_penalty config in online_recognizer (#1232 )	2024-08-08 09:10:17 +08:00
Fangjun Kuang	8a5f5c1999	Fix python two pass ASR examples (#1230 )	2024-08-07 18:35:38 +08:00
xsjk	1da75ee3c0	Fix typo in offline-lm-config.cc (#1229 )	2024-08-07 15:38:34 +08:00
Fangjun Kuang	375c055ff8	Fix style issues for online punctuation source files (#1225 )	2024-08-06 17:43:24 +08:00
jianyou	1414e4dc61	Add online punctuation and casing prediction model for English language (#1224 )	2024-08-06 17:33:38 +08:00
Fangjun Kuang	9caa488019	Fix setting SenseVoice language. (#1214 )	2024-08-04 19:02:23 +08:00
Fangjun Kuang	d5f486878d	Remove libonnxruntime_providers_cuda.so as a dependency. (#1210 )	2024-08-03 16:25:23 +08:00

602 Commits