EngineX-Iluvatar/enginex-mr_series-sherpa-onnx

Archived

Author	SHA1	Message	Date
Fangjun Kuang	0d44df9b67	Release v1.12.5 (#2368 )	2025-07-10 15:31:26 +08:00
Fangjun Kuang	fd9a687ec2	Add Pascal/Go/C#/Dart API for NeMo Canary ASR models (#2367 ) Add support for the new NeMo Canary ASR model across multiple language bindings by introducing a Canary model configuration and setter method on the offline recognizer. - Define Canary model config in Pascal, Go, C#, Dart and update converter functions - Add SetConfig API for offline recognizer (Pascal, Go, C#, Dart) - Extend CI/workflows and example scripts to test non-streaming Canary decoding	2025-07-10 14:53:33 +08:00
Askars Salimbajevs	f0960342ad	Add LODR support to online and offline recognizers (#2026 ) This PR integrates LODR (Low-Order Density Ratio) support from Icefall into both online and offline recognizers, enabling LODR for LM shallow fusion and LM rescore. - Extended OnlineLMConfig and OfflineLMConfig to include lodr_fst, lodr_scale, and lodr_backoff_id. - Implemented LodrFst and LodrStateCost classes and wired them into RNN LM scoring in both online and offline code paths. - Updated Python bindings, CLI entry points, examples, and CI test scripts to accept and exercise the new LODR options.	2025-07-09 16:23:46 +08:00
Fangjun Kuang	6122a678f5	Refactor exporting NeMo models (#2362 ) Refactors and extends model export support to include new NeMo Parakeet TDT int8 variants for English and Japanese, updating the Kotlin API, export scripts, test runners, and CI workflows. - Added support for two new int8 model types in OfflineRecognizer.kt. - Enhanced Python export scripts to perform dynamic quantization and metadata injection. - Updated shell scripts and GitHub workflows to package, test, and publish int8 model artifacts.	2025-07-09 16:02:12 +08:00
Fangjun Kuang	103e93d9f6	Add Java and Kotlin API for NeMo Canary models (#2359 ) Add support for the NeMo Canary model in both Java and Kotlin APIs, wiring it through JNI and updating examples and CI. - Introduce OfflineCanaryModelConfig in Kotlin and Java with builder patterns - Extend OfflineRecognizer to accept and apply the new canary config via setConfig - Update JNI binding (GetOfflineConfig) and getOfflineModelConfig mapping (type 32), plus examples and CI workflows	2025-07-08 13:45:26 +08:00
Fangjun Kuang	df4615ca1d	Add C/CXX/JavaScript API for NeMo Canary models (#2357 ) This PR introduces support for NeMo Canary models across C, C++, and JavaScript APIs by adding new Canary configuration structures, updating bindings, extending examples, and enhancing CI workflows. - Add OfflineCanaryModelConfig to all language bindings (C, C++, JS, ETS). - Implement SetConfig methods and NAPI wrappers for updating recognizer config at runtime. - Update examples and CI scripts to demonstrate and test NeMo Canary model usage.	2025-07-07 23:38:04 +08:00
Fangjun Kuang	0e738c356c	Add C++ runtime and Python API for NeMo Canary models (#2352 )	2025-07-07 17:03:49 +08:00
Fangjun Kuang	c1e9e5c87f	Fix TTS for Unreal Engine (#2349 ) Unreal Engine has its own memory management, so we cannot return a struct containing a std::vector object.	2025-07-06 19:20:26 +08:00
Fangjun Kuang	e6b388067d	Release v1.12.4 (#2343 )	2025-07-04 19:41:02 +08:00
Fangjun Kuang	3bf986d08d	Support non-streaming zipformer CTC ASR models (#2340 ) This PR adds support for non-streaming Zipformer CTC ASR models across multiple language bindings, WebAssembly, examples, and CI workflows. - Introduces a new OfflineZipformerCtcModelConfig in C/C++, Python, Swift, Java, Kotlin, Go, Dart, Pascal, and C# APIs - Updates initialization, freeing, and recognition logic to include Zipformer CTC in WASM and Node.js - Adds example scripts and CI steps for downloading, building, and running Zipformer CTC models Model doc is available at https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html	2025-07-04 15:57:07 +08:00
wenjie.Li	ef16455cb5	Add sherpa-onnx-streaming-zipformer-zh-int8-2025-06-30 to android ASR apk (#2336 )	2025-07-03 11:31:13 +08:00
Fangjun Kuang	9fe25cc06f	Fix VAD+ASR C++ example. (#2335 ) It could not handle short audio, e.g., 2.1 seconds.	2025-07-02 15:52:49 +08:00
Fangjun Kuang	e25634ac39	Release v1.12.3 (#2322 )	2025-06-27 10:55:46 +08:00
Fangjun Kuang	f835642b1c	Support Zipformer transducer ASR with whisper features. (#2321 ) Adds support for Zipformer transducer ASR models that use Whisper-style features by introducing a new feature flag, parsing metadata, and integrating per-chunk normalization. - Introduce UseWhisperFeature in the model interface and Zipformer implementation - Parse "feature" metadata to set the whisper flag and wire it into the recognizer - Update feature extraction logic to handle Whisper filterbanks with early returns	2025-06-27 10:40:41 +08:00
Fangjun Kuang	54bf3732d9	Support zipformer CTC ASR with whisper features. (#2319 )	2025-06-27 00:15:11 +08:00
Fangjun Kuang	056da0528d	Release v1.12.2 (#2314 )	2025-06-25 00:37:55 +08:00
Fangjun Kuang	bda427f4b2	Add API to get version information (#2309 )	2025-06-25 00:22:21 +08:00
Fangjun Kuang	6982b86c66	Support extra languages in multi-lang kokoro tts (#2303 )	2025-06-20 11:22:52 +08:00
Fangjun Kuang	a6095f5f64	Fix building for Pascal (#2305 )	2025-06-20 11:10:07 +08:00
Fangjun Kuang	59d118c256	Refactor kokoro export (#2302 ) - generate samples for https://k2-fsa.github.io/sherpa/onnx/tts/all/ - provide int8 model for kokoro v0.19 kokoro-int8-en-v0_19.tar.bz2	2025-06-18 20:30:10 +08:00
Fangjun Kuang	3878170991	Fixes #2172 (#2301 ) Handle the case when the input audio contains no speech.	2025-06-18 16:48:48 +08:00
Fangjun Kuang	2913cce77c	Add scripts for exporting Piper TTS models to sherpa-onnx (#2299 )	2025-06-17 14:23:39 +08:00
GlocKieHuan	a135324c8c	Fix isspace on windows in debug build (#2042 )	2025-06-09 10:27:16 +08:00
Fangjun Kuang	d57e4f84de	Add Python API for source separation (#2283 )	2025-06-05 20:44:26 +08:00
Fangjun Kuang	1fabc6c79a	Fix rknn for multi-threads (#2274 )	2025-06-03 20:28:57 +08:00
Fangjun Kuang	2b2788332e	Add C++ support for UVR models (#2269 )	2025-06-01 17:22:08 +08:00
mtdxc	e0ca224b76	Fix MFC build error (#2267 ) Co-authored-by: cqm <cqm@97kid.com>	2025-05-31 23:32:35 +08:00
mtdxc	613e8084c2	move portaudio common record code to microphone (#2264 ) Co-authored-by: cqm <cqm@97kid.com>	2025-05-31 21:48:41 +08:00
Fangjun Kuang	8e6826521e	Update kaldi-native-fbank. (#2259 ) It now supports an FFT of any even size, not only powers of 2.	2025-05-29 10:34:22 +08:00
Fangjun Kuang	16a3449945	Build APK with replace.fst (#2254 )	2025-05-28 12:19:29 +08:00
Skepller	640ceb5513	JAVA-API: Manual Library Loading Support for Restricted Environments (#2253 ) * feat: Added LibraryLoader that allows loading to be skipped * feat: Changed static call to new LibraryLoader * feat: Makefile adjustment	2025-05-28 06:13:39 +08:00
yegyu	2107afdbd4	Add include headers for __ANDROID_API__,__OHOS__ (#2251 )	2025-05-27 14:44:06 +08:00
Fangjun Kuang	716ba8317b	Add C++ runtime for Spleeter source separation (#2242 )	2025-05-23 22:30:57 +08:00
Fangjun Kuang	ff6f3b17ac	Use jlong explicitly in jni. (#2229 )	2025-05-20 15:29:47 +08:00
Fangjun Kuang	d8bb20710d	Add script to build APK for simulated-streaming-asr. (#2220 )	2025-05-15 15:40:22 +08:00
esavin	aeb311db50	Expose dither for JNI (#2215 )	2025-05-14 23:38:25 +08:00
Fangjun Kuang	2e9e0b4e9e	Add Android demo for real-time ASR with non-streaming ASR models. (#2214 )	2025-05-14 19:10:44 +08:00
Fangjun Kuang	0dfafed7d0	Support homophone replacer in Android asr demo. (#2210 )	2025-05-14 10:58:35 +08:00
Fangjun Kuang	9a0e16f092	Support sending is_eof for online websocket server. (#2204 ) is_final=true means an endpoint is detected. is_eof=true means all received samples have been processed by the server.	2025-05-13 14:49:22 +08:00
Fangjun Kuang	028b8f2718	Add C++ example for streaming ASR with SenseVoice. (#2199 )	2025-05-11 00:23:32 +08:00
Fangjun Kuang	53518efd2f	Add real-time speech recognition example for SenseVoice. (#2197 )	2025-05-10 00:50:40 +08:00
Fangjun Kuang	4a833a7547	Fix displaying streaming speech recognition results for Python. (#2196 )	2025-05-09 21:48:49 +08:00
Fangjun Kuang	a6834f6556	Show verbose logs in homophone replacer (#2194 )	2025-05-09 10:48:30 +08:00
Fangjun Kuang	562a5f7d9b	Fix building wheels for macOS (#2192 )	2025-05-08 19:15:33 +08:00
Fangjun Kuang	f9c99032c3	Avoid NaN in feature normalization. (#2186 )	2025-05-08 11:22:47 +08:00
Fangjun Kuang	f00066db88	Add C++ runtime for parakeet-tdt-0.6b-v2. (#2181 )	2025-05-06 16:59:01 +08:00
Fangjun Kuang	e537094b07	Add Kotlin and Java API for homophone replacer (#2166 ) * Add Kotlin API for homophone replacer * Add Java API for homophone replacer	2025-04-29 22:55:21 +08:00
Fangjun Kuang	4a7a974a04	More fixes for building without TTS (#2162 )	2025-04-29 16:31:31 +08:00
Fangjun Kuang	e51c37eb2f	Add C and CXX API for homophone replacer (#2156 )	2025-04-27 22:09:13 +08:00
Fangjun Kuang	f64c58342b	Support replacing homophonic phrases (#2153 )	2025-04-27 15:31:11 +08:00

641 Commits