19 Commits

Author SHA1 Message Date
Fangjun Kuang
b76cd9033a Support decoding with byte-level BPE (bbpe) models. (#1633) 2024-12-20 19:21:32 +08:00
Fangjun Kuang
298b6b6fda Add non-streaming ASR support for HarmonyOS. (#1564) 2024-11-26 16:38:35 +08:00
ivan provalov
de04b3b9bf Allow modifying the model config at decode time for ASR (#1124) 2024-07-13 22:30:47 +08:00
Zhong-Yi Li
675fb1574f offline transducer: treat unk as blank (#1005)
Co-authored-by: chungyi.li <chungyi.li@ailabs.tw>
2024-06-19 20:52:42 +08:00
Fangjun Kuang
b0f7ed3ee3 Add inverse text normalization for non-streaming ASR (#1017) 2024-06-17 14:28:53 +08:00
Wei Kang
a38881817c Support customizing scores for hotwords (#926)
* Support customizing scores for hotwords

* Skip blank lines
2024-05-31 12:34:30 +08:00
Wei Kang
b012b78ceb Encode hotwords in C++ side (#828)
* Encode hotwords in C++ side
2024-05-20 19:41:36 +08:00
Fangjun Kuang
6bf2099781 Fix code style issues (#774) 2024-04-16 09:46:15 +08:00
Karel Vesely
3f2a17ef47 Fixes issue #535, fix hex 1-char tokens in ASR output. (#550)
- Avoid output like : `[' K', '<0x64>', '<0x79>', 'ť', ' a', '<0x75>',
  'to', 'bu', '<0x73>', '<0x75>', ... ]` with regular 500 BPE units.
- Don't rewrite 1-char tokens in range [ 0x20 (space) .. 0x7E (tilde) ]
2024-01-26 19:23:20 +08:00
chiiyeh
3bb3849ec5 add blank_penalty for offline transducer (#542) 2024-01-25 15:00:09 +08:00
Fangjun Kuang
e215d0c39a Fix Byte BPE string results for Python. (#512)
It ignores invalid UTF8 strings.
2024-01-03 16:03:24 +08:00
Wei Kang
47184f9db7 Refactor hotwords, support loading hotwords from file (#296) 2023-09-14 19:33:17 +08:00
Fangjun Kuang
debab7c091 Add two-pass speech recognition Android/iOS demo (#304) 2023-09-12 15:40:16 +08:00
Wei Kang
8562711252 Implement context biasing with an Aho-Corasick automaton (#145)
* Implement context graph

* Modify the interface to support context biasing

* Support context biasing in modified beam search; add python wrapper

* Support context biasing in python api example

* Minor fixes

* Fix context graph

* Minor fixes

* Fix tests

* Fix style

* Fix style

* Fix comments

* Minor fixes

* Add missing header

* Replace std::shared_ptr with std::unique_ptr for efficiency

* Build graph in constructor

* Fix comments

* Minor fixes

* Fix docs
2023-06-16 14:26:36 +08:00
keanu
1a1b9fd236 RNNLM model support lm_num_thread and lm_provider setting (#173)
* rnnlm model inference supports num_threads setting

* rnnlm params decouple num_thread and provider with Transducer.

* fix python csrc bug in the arguments of offline-lm-config.cc and online-lm-config.cc

* lm_num_threads and lm_provider set default values

---------

Co-authored-by: cuidongcai1035 <cuidongcai1035@wezhuiyi.com>
2023-06-12 15:51:27 +08:00
keanu
9c017c2ccb rnnlm model inference supports num_threads setting (#169)
Co-authored-by: cuidongcai1035 <cuidongcai1035@wezhuiyi.com>
2023-06-07 09:32:27 +08:00
Fangjun Kuang
86017f9833 Add RNN LM rescore for offline ASR with modified_beam_search (#125) 2023-04-23 17:15:18 +08:00
Fangjun Kuang
423d89e9a5 Support paraformer. (#95) 2023-03-28 17:59:54 +08:00
Fangjun Kuang
dffb0fd43c Refactor offline recognizer. (#94)
* Refactor offline recognizer.

The purpose is to make it easier to support different types of models.
2023-03-27 14:59:40 +08:00