32 Commits

Author SHA1 Message Date
Fangjun Kuang
b3e05f6dc4 Fix style issues (#1458) 2024-10-24 11:15:08 +08:00
Fangjun Kuang
994c3e7c96 Add VAD + Non-streaming ASR example for JavaScript API. (#1170) 2024-07-26 12:42:08 +08:00
Fangjun Kuang
a11c859971 Support clang-tidy (#1034) 2024-06-19 20:51:57 +08:00
Fangjun Kuang
349d957da2 Add inverse text normalization for online ASR (#1020) 2024-06-17 18:39:23 +08:00
Fangjun Kuang
1a43d1e37f Support getting word IDs for CTC HLG decoding. (#978) 2024-06-06 14:22:39 +08:00
Wei Kang
b012b78ceb Encode hotwords in C++ side (#828)
* Encode hotwords in C++ side
2024-05-20 19:41:36 +08:00
Karel Vesely
2e45d327a5 Adding temperature scaling on Joiner logits: (#789)
* Adding temperature scaling on Joiner logits:

- T hard-coded to 2.0
- so far best result NCE 0.122 (still not so high)
    - the BPE scores were rescaled with 0.2 (but then also incorrect words
      get high confidence, visually reasonable histograms are for 0.5 scale)
    - BPE->WORD score merging done by min(.) function
      (tried also prob-product, and also arithmetic, geometric, harmonic mean)

- without temperature scaling (i.e. scale 1.0), the best NCE was 0.032 (here product merging was best)

Results seem consistent with: https://arxiv.org/abs/2110.15222

Everything was tuned on a very small set of 100 sentences with 813 words and 10.2% WER, using a Czech model.

I also experimented with blank posteriors mixed into the BPE confidences,
but no NCE improvement found, so not pushing that.

Temperature scaling was also added to the greedy search confidences.

* making `temperature_scale` configurable from outside
2024-04-26 09:44:26 +08:00
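The two ideas described in commit 2e45d327a5 (temperature scaling of the joiner logits and merging BPE token confidences into a word confidence with `min`) can be illustrated with a minimal sketch. It is not the code from that commit: the function names are hypothetical, and it assumes the common convention of dividing logits by the temperature before the softmax. The 0.2 rescaling of BPE scores mentioned in the message is not modeled.

```python
import math


def softmax_with_temperature(logits, temperature=2.0):
    """Softmax over logits divided by a temperature.

    Assumption: T > 1 flattens the distribution (standard temperature
    scaling); the commit's exact formula is not shown in the log.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def token_confidence(logits, token_id, temperature=2.0):
    """Confidence of one emitted token: its temperature-scaled posterior."""
    return softmax_with_temperature(logits, temperature)[token_id]


def word_confidence(token_confidences):
    """Merge the BPE token confidences of one word using min(.)."""
    return min(token_confidences)


# Hypothetical logits for a 3-token vocabulary at two decoding steps.
step_logits = [[4.0, 1.0, 0.5], [0.2, 3.0, 2.5]]
emitted = [0, 1]  # token ids picked at each step
confidences = [token_confidence(l, t) for l, t in zip(step_logits, emitted)]
print(word_confidence(confidences))
```

With a temperature of 1.0 the function reduces to the plain softmax, which corresponds to the "without temperature scaling" case in the message.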
Manix
fb4aee83ac Adding warm up for Zipformer2 (#766)
Signed-off-by: manickavela1998@gmail.com <manickavela1998@gmail.com>
2024-04-16 09:16:55 +08:00
Fangjun Kuang
db67e00c77 Add HLG decoding for streaming CTC models (#731) 2024-04-03 21:31:42 +08:00
Karel Vesely
38c072dcb2 Track token scores (#571)
* add export of per-token scores (ys, lm, context)

- for best path of the modified-beam-search decoding of transducer

* refactoring JSON export of OnlineRecognitionResult, extending pybind11 API of OnlineRecognitionResult

* export per-token scores also for greedy-search (online-transducer)

- export un-scaled lm_probs (modified-beam search, online-transducer)
- polishing

* fill lm_probs/context_scores only if LM/ContextGraph is present (make Result smaller)
2024-02-29 06:28:45 +08:00
chiiyeh
e7b18a2139 add blank_penalty for online transducer (#548) 2024-01-26 12:12:13 +08:00
Fangjun Kuang
33a5765169 Print a more user-friendly error message when using --hotwords-file. (#344) 2023-09-26 11:04:20 +08:00
Fangjun Kuang
552a267c23 Set is_final and start_time for online websocket server. (#342)
* Set is_final and start_time for online websocket server.

* Convert timestamps to a json array
2023-09-25 15:12:07 +08:00
Wei Kang
47184f9db7 Refactor hotwords, support loading hotwords from file (#296) 2023-09-14 19:33:17 +08:00
Fangjun Kuang
79c2ce5dd4 Refactor online recognizer (#250)
* Refactor online recognizer.

Make it easier to support other streaming models.

Note that it is a breaking change for the Python API.
`sherpa_onnx.OnlineRecognizer()` used before should be
replaced by `sherpa_onnx.OnlineRecognizer.from_transducer()`.
2023-08-09 20:27:31 +08:00
Wei Kang
513dfaa552 Support contextual-biasing for streaming model (#184)
* Support contextual-biasing for streaming model

* The whole pipeline runs normally

* Fix comments
2023-06-30 16:46:24 +08:00
keanu
1a1b9fd236 RNNLM model supports lm_num_thread and lm_provider settings (#173)
* rnnlm model inference supports num_threads setting

* decouple rnnlm num_thread and provider params from Transducer.

* fix Python csrc bug with the arguments in offline-lm-config.cc and online-lm-config.cc

* lm_num_threads and lm_provider set default values

---------

Co-authored-by: cuidongcai1035 <cuidongcai1035@wezhuiyi.com>
2023-06-12 15:51:27 +08:00
keanu
9c017c2ccb rnnlm model inference supports num_threads setting (#169)
Co-authored-by: cuidongcai1035 <cuidongcai1035@wezhuiyi.com>
2023-06-07 09:32:27 +08:00
Fangjun Kuang
cea718e3d8 Support CoreML for macOS (#151) 2023-05-12 15:57:44 +08:00
Jingzhao Ou
0992063de8 Stack and streaming conformer support (#141)
* added csrc/stack.cc

* stack: added checks

* added copyright info

* passed cpp style checks

* formatted code

* added some support for streaming conformer model support (not verified)

* code lint

* made more progress with streaming conformer support (not working yet)

* passed style check

* changes as suggested by @csukuangfj

* added some debug info

* fixed style check

* Use Cat to replace Stack

* remove debug statements

---------

Co-authored-by: Jingzhao Ou (jou2019) <jou2019@cisco.com>
Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
2023-05-11 14:30:39 +08:00
PF Luo
8c6a6768d5 Add lm rescore to online-modified-beam-search (#133) 2023-05-05 21:23:54 +08:00
Fangjun Kuang
ad05f52666 Add timestamps for streaming ASR. (#123) 2023-04-19 16:02:37 +08:00
Fangjun Kuang
7f72c13d9a Code refactoring (#74)
* Don't reset model state and feature extractor on endpointing

* support passing decoding_method from commandline

* Add modified_beam_search to Python API

* fix C API example

* Fix style issues
2023-03-03 12:10:59 +08:00
PF Luo
5326d0f81f add modified beam search (#69) 2023-03-01 15:32:54 +08:00
Fangjun Kuang
475caf22f9 Add iOS support (#65) 2023-02-25 21:56:25 +08:00
Fangjun Kuang
40522f037b add streaming websocket server and client (#62) 2023-02-24 21:39:51 +08:00
Fangjun Kuang
9064b3f016 Support Android (#59) 2023-02-24 13:57:03 +08:00
Fangjun Kuang
5a5d029490 Add build script for Android armv8a (#58) 2023-02-22 22:36:05 +08:00
Fangjun Kuang
124384369a Add endpointing (#54) 2023-02-22 15:35:55 +08:00
Fangjun Kuang
ea09d5fbc5 Add Python API (#31) 2023-02-19 19:36:03 +08:00
Fangjun Kuang
8acc059b3f Support batch greedy search decoding (#30) 2023-02-19 15:04:24 +08:00
Fangjun Kuang
ebc3b47fb8 add online-recognizer (#29) 2023-02-19 12:45:38 +08:00