43 Commits

Author SHA1 Message Date
Wei Kang
734bbd91dc Add Python API for keyword spotting (#576)
* Add alsa & microphone support for keyword spotting

* Add python wrapper
2024-03-01 09:31:11 +08:00
Karel Vesely
38c072dcb2 Track token scores (#571)
* add export of per-token scores (ys, lm, context)

- for best path of the modified-beam-search decoding of transducer

* refactoring JSON export of OnlineRecognitionResult, extending pybind11 API of OnlineRecognitionResult

* export per-token scores also for greedy-search (online-transducer)

- export un-scaled lm_probs (modified-beam search, online-transducer)
- polishing

* fill lm_probs/context_scores only if LM/ContextGraph is present (make Result smaller)
2024-02-29 06:28:45 +08:00
Askars
763a51486e Add missing start_time to python API (#591)
Co-authored-by: vsd-vector <askars.salimbajevs@tilde.lv>
2024-02-20 20:47:53 +08:00
Fangjun Kuang
44efff4e47 Fix CI tests for Python and JNI. (#554) 2024-01-27 13:01:54 +08:00
chiiyeh
e7b18a2139 add blank_penalty for online transducer (#548) 2024-01-26 12:12:13 +08:00
chiiyeh
466a6855c8 add hotwords docstring to offline_recognizer and online_recognizer (#546) 2024-01-25 16:54:20 +08:00
chiiyeh
3bb3849ec5 add blank_penalty for offline transducer (#542) 2024-01-25 15:00:09 +08:00
Wei Kang
b6c020901a decoder for open vocabulary keyword spotting (#505)
* various fixes to ContextGraph to support open vocabulary keywords decoder

* Add keyword spotter runtime

* Add binary

* First version works

* Minor fixes

* update text2token

* default values

* Add jni for kws

* add kws android project

* Minor fixes

* Remove unused interface

* Minor fixes

* Add workflow

* handle extra info in texts

* Minor fixes

* Add more comments

* Fix ci

* fix cpp style

* Add input box in android demo so that users can specify their keywords

* Fix cpp style

* Fix comments

* Minor fixes

* Minor fixes

* minor fixes

* Minor fixes

* Minor fixes

* Add CI

* Fix code style

* cpplint

* Fix comments

* Fix error
2024-01-20 22:52:41 +08:00
Fangjun Kuang
55266918c8 Add runtime support for wespeaker models (#516) 2024-01-09 22:06:08 +08:00
Fangjun Kuang
e475e750ac Support streaming zipformer CTC (#496)
* Support streaming zipformer CTC

* test online zipformer2 CTC

* Update doc of sherpa-onnx.cc

* Add Python APIs for streaming zipformer2 ctc

* Add Python API examples for streaming zipformer2 ctc

* Swift API for streaming zipformer2 CTC

* NodeJS API for streaming zipformer2 CTC

* Kotlin API for streaming zipformer2 CTC

* Golang API for streaming zipformer2 CTC

* C# API for streaming zipformer2 CTC

* Release v1.9.6
2023-12-22 13:46:33 +08:00
Fangjun Kuang
0e23f82691 Give an informative log for whisper on exceptions. (#473) 2023-12-08 14:33:59 +08:00
Fangjun Kuang
049fb9f451 Add Python APIs for WeNet CTC models (#428) 2023-11-16 14:20:41 +08:00
Fangjun Kuang
655e0fa836 add python API and examples for TTS (#364) 2023-10-14 14:21:53 +08:00
Peng He
4771c9275c Add lm decode for the Python API. (#353)
* Add lm decode for the Python API.

* fix style.

* Fix LogAdd

Shouldn't double lm_log_prob when merging paths with the same prefix

* sort the import alphabetically
2023-10-13 11:15:16 +08:00
Fangjun Kuang
407602445d Add CTC HLG decoding using OpenFst (#349) 2023-10-08 11:32:39 +08:00
Fangjun Kuang
33a5765169 Print a more user-friendly error message when using --hotwords-file. (#344) 2023-09-26 11:04:20 +08:00
Fangjun Kuang
c471423125 Add Silero VAD (#313) 2023-09-17 14:54:38 +08:00
Wei Kang
47184f9db7 Refactor hotwords, support loading hotwords from file (#296) 2023-09-14 19:33:17 +08:00
Fangjun Kuang
f709c95c5f Support multilingual whisper models (#274) 2023-08-16 00:28:52 +08:00
Fangjun Kuang
6038e2aa62 Support streaming paraformer (#263) 2023-08-14 10:32:14 +08:00
Fangjun Kuang
a4bff28e21 Support TDNN models from the yesno recipe from icefall (#262) 2023-08-12 19:50:22 +08:00
Fangjun Kuang
b094868fb8 Add non-streaming websocket server for python (#259) 2023-08-11 15:56:24 +08:00
Fangjun Kuang
79c2ce5dd4 Refactor online recognizer (#250)
* Refactor online recognizer.

Make it easier to support other streaming models.

Note that it is a breaking change for the Python API.
`sherpa_onnx.OnlineRecognizer()` used before should be
replaced by `sherpa_onnx.OnlineRecognizer.from_transducer()`.
2023-08-09 20:27:31 +08:00
Fangjun Kuang
45b9d4ab37 Support whisper models (#238) 2023-08-07 12:34:18 +08:00
Wilson Wongso
5a6b55c5a7 Reduce model initialization time for online speech recognition (#215)
* Reduce model initialization time for online speech recognition

* Fixed Styling

---------

Co-authored-by: w11wo <wilsowong961@gmail.com>
2023-07-14 21:20:10 +08:00
Fangjun Kuang
f3206c49dc Reduce model initialization time for offline speech recognition (#213) 2023-07-14 18:07:27 +08:00
Fangjun Kuang
5cd72ba3aa Fix setting context lists. (#207) 2023-07-12 09:18:56 +08:00
Wilson Wongso
b2364b0374 Implemented tokens and timestamps in Python API (#205) 2023-07-12 09:12:31 +08:00
Fangjun Kuang
33bf8dc1f4 Support specifying providers in Python API (#198) 2023-07-06 10:14:01 +08:00
Wei Kang
513dfaa552 Support contextual-biasing for streaming model (#184)
* Support contextual-biasing for streaming model

* The whole pipeline runs normally

* Fix comments
2023-06-30 16:46:24 +08:00
Wei Kang
8562711252 Implement context biasing with an Aho-Corasick automaton (#145)
* Implement context graph

* Modify the interface to support context biasing

* Support context biasing in modified beam search; add python wrapper

* Support context biasing in python api example

* Minor fixes

* Fix context graph

* Minor fixes

* Fix tests

* Fix style

* Fix style

* Fix comments

* Minor fixes

* Add missing header

* Replace std::shared_ptr with std::unique_ptr for efficiency

* Build graph in constructor

* Fix comments

* Minor fixes

* Fix docs
2023-06-16 14:26:36 +08:00
Fangjun Kuang
5e2dc5ceea add streaming-server with web client (#164)
* add streaming-server with web client

* small fixes
2023-05-30 22:46:52 +08:00
Fangjun Kuang
cea718e3d8 Support CoreML for macOS (#151) 2023-05-12 15:57:44 +08:00
Fangjun Kuang
80060c276d Begin to support CTC models (#119)
Please see https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/nemo/index.html for a list of pre-trained CTC models from NeMo.
2023-04-07 23:11:34 +08:00
KajiMaCN
7f7e3680c3 Modify the rule attribute data type of OnlineRecognizer (#113) 2023-04-04 15:42:56 +08:00
Fangjun Kuang
5d3c8edbc9 add python tests (#111) 2023-04-02 23:05:30 +08:00
manyeyes
3f7e0c23ac adding a python api for offline decode (#110) 2023-04-02 13:17:43 +08:00
Fangjun Kuang
5572246253 Add non-streaming ASR (#92) 2023-03-26 08:53:42 +08:00
Fangjun Kuang
7f72c13d9a Code refactoring (#74)
* Don't reset model state and feature extractor on endpointing

* support passing decoding_method from commandline

* Add modified_beam_search to Python API

* fix C API example

* Fix style issues
2023-03-03 12:10:59 +08:00
Fangjun Kuang
e4b79ad34b Add Python websocket client (#63) 2023-02-24 22:46:30 +08:00
Fangjun Kuang
9064b3f016 Support Android (#59) 2023-02-24 13:57:03 +08:00
Fangjun Kuang
124384369a Add endpointing (#54) 2023-02-22 15:35:55 +08:00
Fangjun Kuang
ea09d5fbc5 Add Python API (#31) 2023-02-19 19:36:03 +08:00