21 Commits

Author SHA1 Message Date
SilverSulfide
888f74bf3c Re-implement LM rescore for online transducer (#1231)
Co-authored-by: Martins Kronis <martins.kuznecovs@tilde.lv>
2024-09-06 10:01:25 +08:00
Fangjun Kuang
f1cff83ef9 Add address sanitizer and undefined behavior sanitizer (#951) 2024-05-31 13:17:01 +08:00
Karel Vesely
2e45d327a5 Adding temperature scaling on Joiner logits: (#789)
* Adding temperature scaling on Joiner logits:

- T hard-coded to 2.0
- so far best result NCE 0.122 (still not so high)
    - the BPE scores were rescaled with 0.2 (but then also incorrect words
      get high confidence, visually reasonable histograms are for 0.5 scale)
    - BPE->WORD score merging done by min(.) function
      (tried also prob-product, and also arithmetic, geometric, harmonic mean)

- without temperature scaling (i.e. scale 1.0), the best NCE was 0.032 (here product merging was best)

Results seem consistent with: https://arxiv.org/abs/2110.15222

Everything was tuned on a very small set of 100 sentences with 813 words and 10.2% WER, using a Czech model.

I also experimented with blank posteriors mixed into the BPE confidences,
but no NCE improvement found, so not pushing that.

Temperature scaling also added to the Greedy search confidences.

* making `temperature_scale` configurable from outside
2024-04-26 09:44:26 +08:00
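
The two ideas in the commit message above (dividing joiner logits by a temperature before the softmax, and merging BPE token confidences into word confidences with min(.)) can be illustrated with a minimal sketch. This is not the code from the commit: the function names are hypothetical, the default temperature of 2.0 only mirrors the value quoted in the message, and the word-boundary convention (a leading "▁" on the first piece of each word, as in SentencePiece-style vocabularies) is an assumption.

```python
import math


def softmax_with_temperature(logits, temperature=2.0):
    """Softmax over logits divided by a temperature.

    A temperature above 1.0 flattens the distribution, which lowers
    over-confident peak probabilities. Hypothetical helper, not the
    project's API.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def merge_token_confidences(tokens, confidences):
    """Merge per-token confidences into per-word confidences with min(.).

    Assumes a token starting with "▁" opens a new word (an assumption
    about the vocabulary, not something stated in the commit).
    """
    words, scores = [], []
    for token, conf in zip(tokens, confidences):
        if token.startswith("▁") or not words:
            words.append(token.lstrip("▁"))
            scores.append(conf)
        else:
            words[-1] += token
            scores[-1] = min(scores[-1], conf)
    return list(zip(words, scores))
```

Taking the minimum means one uncertain token lowers the confidence of the whole word, which is the merging rule the message reports as working best with temperature scaling.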
Fangjun Kuang
6bf2099781 Fix code style issues (#774) 2024-04-16 09:46:15 +08:00
Wei Kang
e9e8d755d9 Fix detection at the tail when using hotwords in streaming model (#638) 2024-03-08 10:04:33 +08:00
Karel Vesely
38c072dcb2 Track token scores (#571)
* add export of per-token scores (ys, lm, context)

- for best path of the modified-beam-search decoding of transducer

* refactoring JSON export of OnlineRecognitionResult, extending pybind11 API of OnlineRecognitionResult

* export per-token scores also for greedy-search (online-transducer)

- export un-scaled lm_probs (modified-beam search, online-transducer)
- polishing

* fill lm_probs/context_scores only if LM/ContextGraph is present (make Result smaller)
2024-02-29 06:28:45 +08:00
chiiyeh
e7b18a2139 add blank_penalty for online transducer (#548) 2024-01-26 12:12:13 +08:00
Wei Kang
b6c020901a decoder for open vocabulary keyword spotting (#505)
* various fixes to ContextGraph to support open vocabulary keywords decoder

* Add keyword spotter runtime

* Add binary

* First version works

* Minor fixes

* update text2token

* default values

* Add jni for kws

* add kws android project

* Minor fixes

* Remove unused interface

* Minor fixes

* Add workflow

* handle extra info in texts

* Minor fixes

* Add more comments

* Fix ci

* fix cpp style

* Add input box in android demo so that users can specify their keywords

* Fix cpp style

* Fix comments

* Minor fixes

* Minor fixes

* minor fixes

* Minor fixes

* Minor fixes

* Add CI

* Fix code style

* cpplint

* Fix comments

* Fix error
2024-01-20 22:52:41 +08:00
HieDean
e6a2d0da3b Replace Clone() with View() (#432)
Co-authored-by: hiedean <hiedean@tju.edu.cn>
2023-11-20 09:20:50 +08:00
Fangjun Kuang
a12ebfab22 treat unk as blank (#299) 2023-09-07 15:12:29 +08:00
Fangjun Kuang
aa48b76d4b Fix initial tokens to decoding (#246) 2023-08-09 12:33:47 +08:00
Wei Kang
513dfaa552 Support contextual-biasing for streaming model (#184)
* Support contextual-biasing for streaming model

* The whole pipeline runs normally

* Fix comments
2023-06-30 16:46:24 +08:00
PF Luo
655c619bf3 Fix lm fusion (#157)
* share GetHypsRowSplits interface and fix getting Topk not taking logprob

* fix lm score of lm fusion and make padding len same with 'icefall/egs/librispeech/ASR/pruned_transducer_stateless7_streaming/decode.py'
2023-05-15 10:48:45 +08:00
PF Luo
824b0809a4 add shallow fusion (#147) 2023-05-10 22:30:57 +08:00
PF Luo
8c6a6768d5 Add lm rescore to online-modified-beam-search (#133) 2023-05-05 21:23:54 +08:00
PF Luo
aa7108729b share GetHypsRowSplits interface and fix getting Topk not taking logprob (#131) 2023-04-26 11:41:04 +08:00
Fangjun Kuang
86017f9833 Add RNN LM rescore for offline ASR with modified_beam_search (#125) 2023-04-23 17:15:18 +08:00
Fangjun Kuang
ad05f52666 Add timestamps for streaming ASR. (#123) 2023-04-19 16:02:37 +08:00
Fangjun Kuang
9d8fddef01 Support resampling (#77) 2023-03-03 16:42:33 +08:00
Fangjun Kuang
7f72c13d9a Code refactoring (#74)
* Don't reset model state and feature extractor on endpointing

* support passing decoding_method from commandline

* Add modified_beam_search to Python API

* fix C API example

* Fix style issues
2023-03-03 12:10:59 +08:00
PF Luo
5326d0f81f add modified beam search (#69) 2023-03-01 15:32:54 +08:00