EngineX-Iluvatar/enginex_bi_series-sherpa-onnx

Archived

Author	SHA1	Message	Date
Fangjun Kuang	e2b2d5ea57	Add CXX examples for NeMo TDT ASR. (#2363) New features: added example programs demonstrating streaming speech recognition from a microphone using Parakeet-TDT CTC and Zipformer Transducer models with voice activity detection; these examples support microphone input via PortAudio and display recognized text incrementally. Bug fixes: improved error handling and logic when opening microphone devices in several example programs for more reliable device initialization. Chores: updated the build configuration to include the new example executables when PortAudio support is enabled.	2025-07-09 18:30:42 +08:00
Fangjun Kuang	df4615ca1d	Add C/CXX/JavaScript API for NeMo Canary models (#2357) This PR introduces support for NeMo Canary models across the C, C++, and JavaScript APIs by adding new Canary configuration structures, updating bindings, extending examples, and enhancing CI workflows: add OfflineCanaryModelConfig to all language bindings (C, C++, JS, ETS); implement SetConfig methods and NAPI wrappers for updating the recognizer config at runtime; update examples and CI scripts to demonstrate and test NeMo Canary model usage.	2025-07-07 23:38:04 +08:00
Fangjun Kuang	3bf986d08d	Support non-streaming zipformer CTC ASR models (#2340) This PR adds support for non-streaming Zipformer CTC ASR models across multiple language bindings, WebAssembly, examples, and CI workflows: introduces a new OfflineZipformerCtcModelConfig in the C/C++, Python, Swift, Java, Kotlin, Go, Dart, Pascal, and C# APIs; updates initialization, freeing, and recognition logic to include Zipformer CTC in WASM and Node.js; adds example scripts and CI steps for downloading, building, and running Zipformer CTC models. Model doc is available at https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html	2025-07-04 15:57:07 +08:00
Fangjun Kuang	749dc9a239	Release v1.12.1 (#2277)	2025-06-03 21:55:49 +08:00
Fangjun Kuang	2b2788332e	Add C++ support for UVR models (#2269)	2025-06-01 17:22:08 +08:00
mtdxc	613e8084c2	move portaudio common record code to microphone (#2264) Co-authored-by: cqm <cqm@97kid.com>	2025-05-31 21:48:41 +08:00
Fangjun Kuang	fdda292d5a	Add alsa-based streaming ASR example for sense voice. (#2207)	2025-05-13 19:08:09 +08:00
Fangjun Kuang	b269e5cccc	Add C++ example for real-time ASR with nvidia/parakeet-tdt-0.6b-v2. (#2201)	2025-05-11 16:30:38 +08:00
Fangjun Kuang	028b8f2718	Add C++ example for streaming ASR with SenseVoice. (#2199)	2025-05-11 00:23:32 +08:00
Fangjun Kuang	e51c37eb2f	Add C and CXX API for homophone replacer (#2156)	2025-04-27 22:09:13 +08:00
Fangjun Kuang	31ced58f9a	Release v1.11.3 (#2097)	2025-04-03 16:19:01 +08:00
Fangjun Kuang	2dc0f91904	Add C# API for Dolphin CTC models (#2089)	2025-04-02 23:36:22 +08:00
Fangjun Kuang	da4aad1189	Add C and CXX API for Dolphin CTC models (#2088)	2025-04-02 21:54:20 +08:00
Fangjun Kuang	0703bc1b86	Add CXX API for VAD (#2077)	2025-04-01 14:51:43 +08:00
niansa/tuxifan	9d23606ee6	Allow building repository as CMake subdirectory (#2059) Use PROJECT_SOURCE_DIR rather than CMAKE_SOURCE_DIR to allow building as a subdirectory; also use PROJECT_SOURCE_DIR instead of CMAKE_SOURCE_DIR in the c/cxx api examples; only build examples by default when not building as a subdirectory; do not suggest building binaries either. Co-authored-by: user <user@mail.tld>	2025-03-29 06:27:59 +08:00
Fangjun Kuang	0aacf02dd8	Add C++ runtime for vocos (#2014)	2025-03-17 17:05:15 +08:00
Fangjun Kuang	802119db17	Add CXX API for speech enhancement GTCRN models (#1986)	2025-03-11 17:07:52 +08:00
Fangjun Kuang	1d49dd2fb0	Add CXX API for FireRedAsr (#1872)	2025-02-17 11:46:13 +08:00
Fangjun Kuang	e2e0f25100	Add Swift API for Kokoro TTS 1.0 (#1803)	2025-02-07 15:06:34 +08:00
Fangjun Kuang	d815204774	Add CXX API for Kokoro TTS 1.0 (#1802)	2025-02-07 14:51:49 +08:00
Fangjun Kuang	c84a833863	Add C++ and Python API for Kokoro 1.0 multilingual TTS model (#1795)	2025-02-06 22:57:13 +08:00
Fangjun Kuang	8b989a851c	Fix keyword spotting. (#1689) Reset the stream right after detecting a keyword.	2025-01-20 16:41:10 +08:00
Fangjun Kuang	af671e2b63	Add C API for Kokoro TTS models (#1717)	2025-01-16 15:07:26 +08:00
Fangjun Kuang	648903834b	Add CXX API for MatchaTTS models (#1676)	2025-01-03 14:16:36 +08:00
Fangjun Kuang	f0cced1f37	Publish pre-built wheels with CUDA support for Linux aarch64. (#1507)	2024-11-03 19:15:11 +08:00
Fangjun Kuang	c5205f08bf	Add an example for computing RTF about streaming ASR. (#1501)	2024-11-01 11:40:13 +08:00
Fangjun Kuang	2ca2985d04	Add C and C++ API for Moonshine models (#1476)	2024-10-26 23:24:46 +08:00
Fangjun Kuang	a5295aad10	Handle NaN embeddings in speaker diarization. (#1461) See also https://github.com/thewh1teagle/sherpa-rs/issues/33	2024-10-24 14:03:09 +08:00
Fangjun Kuang	ceb69ebd94	Add C++ API for non-streaming ASR (#1456)	2024-10-23 16:40:12 +08:00
Fangjun Kuang	effd5ef2be	Add C++ API for streaming ASR. (#1455) It is a wrapper around the C API.	2024-10-23 12:07:43 +08:00

30 Commits