Adding temperature scaling on Joiner logits: (#789)

* Adding temperature scaling on Joiner logits:

- T hard-coded to 2.0
- best result so far: NCE 0.122 (still not very high)
    - the BPE scores were rescaled by 0.2 (but then incorrect words also
      get high confidence; histograms look visually reasonable at scale 0.5)
    - BPE->WORD score merging is done with the min(.) function
      (also tried probability product and arithmetic, geometric, harmonic mean)

- without temperature scaling (i.e. scale 1.0), the best NCE was 0.032 (here product merging was best)
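
A minimal sketch of the two ideas in the list above, not the code from this commit. It assumes temperature scaling means dividing the logits by T before the softmax (the usual definition), and it shows the min and product merging variants. The function names `SoftmaxWithTemperature`, `MergeBpeScoresMin` and `MergeBpeScoresProduct` are hypothetical; the additional 0.2 rescaling of BPE scores is not covered here.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Softmax over logits divided by `temperature` (assumption: standard
// temperature scaling; T > 1 flattens the distribution, T == 1 is a no-op).
std::vector<float> SoftmaxWithTemperature(const std::vector<float> &logits,
                                          float temperature) {
  std::vector<float> probs(logits.size());
  if (logits.empty()) return probs;
  // Subtract the max logit for numerical stability.
  float max_logit = *std::max_element(logits.begin(), logits.end());
  float sum = 0.0f;
  for (size_t i = 0; i < logits.size(); ++i) {
    probs[i] = std::exp((logits[i] - max_logit) / temperature);
    sum += probs[i];
  }
  for (float &p : probs) p /= sum;
  return probs;
}

// Word confidence as the minimum of its BPE token confidences.
float MergeBpeScoresMin(const std::vector<float> &bpe_probs) {
  if (bpe_probs.empty()) return 0.0f;
  return *std::min_element(bpe_probs.begin(), bpe_probs.end());
}

// Word confidence as the product of its BPE token confidences.
float MergeBpeScoresProduct(const std::vector<float> &bpe_probs) {
  if (bpe_probs.empty()) return 0.0f;
  float prod = 1.0f;
  for (float p : bpe_probs) prod *= p;
  return prod;
}
```

With T = 2.0 the top-token probability drops relative to T = 1.0, which is the intended effect: over-confident posteriors are pulled toward the middle of the range before they are merged into word scores.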

Results seem consistent with: https://arxiv.org/abs/2110.15222

Everything was tuned on a very small set of 100 sentences (813 words, 10.2% WER) with a Czech model.
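
For reference, a hedged sketch of how an NCE figure like the ones quoted above can be computed, assuming the common normalized cross entropy definition for word confidences: NCE = (H_max - H_conf) / H_max, where H_max is the entropy of always predicting the average word accuracy. The `ScoredWord` struct and `NormalizedCrossEntropy` function are hypothetical names, not part of this commit.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// One recognized word: its confidence in [0, 1] and whether it was correct.
struct ScoredWord {
  float confidence;
  bool correct;
};

// Normalized cross entropy of word confidences (assumed standard definition).
// Returns 0.0 for degenerate input (no words, or all correct / all wrong),
// where H_max would be zero.
double NormalizedCrossEntropy(const std::vector<ScoredWord> &words) {
  const double eps = 1e-10;  // keeps log2() finite for confidences of 0 or 1
  const double n = static_cast<double>(words.size());
  double n_correct = 0.0;
  double h_conf = 0.0;
  for (const ScoredWord &w : words) {
    double c = std::min(std::max(static_cast<double>(w.confidence), eps),
                        1.0 - eps);
    if (w.correct) {
      n_correct += 1.0;
      h_conf -= std::log2(c);
    } else {
      h_conf -= std::log2(1.0 - c);
    }
  }
  if (words.empty() || n_correct == 0.0 || n_correct == n) return 0.0;
  const double p_c = n_correct / n;  // average word accuracy
  const double h_max =
      -n_correct * std::log2(p_c) - (n - n_correct) * std::log2(1.0 - p_c);
  return (h_max - h_conf) / h_max;
}
```

Under this definition, assigning every word the average accuracy gives NCE 0, perfect confidences give NCE close to 1, and confidences worse than the constant baseline give negative values.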

I also experimented with mixing blank posteriors into the BPE confidences,
but found no NCE improvement, so I am not pushing that.

Temperature scaling was also added to the greedy search confidences.

* making `temperature_scale` configurable from outside
Karel Vesely
2024-04-26 03:44:26 +02:00
committed by GitHub
parent 15772d2150
commit 2e45d327a5
9 changed files with 107 additions and 30 deletions

@@ -96,16 +96,23 @@ struct OnlineRecognizerConfig {
   float blank_penalty = 0.0;
+  float temperature_scale = 2.0;
   OnlineRecognizerConfig() = default;
   OnlineRecognizerConfig(
       const FeatureExtractorConfig &feat_config,
-      const OnlineModelConfig &model_config, const OnlineLMConfig &lm_config,
+      const OnlineModelConfig &model_config,
+      const OnlineLMConfig &lm_config,
       const EndpointConfig &endpoint_config,
       const OnlineCtcFstDecoderConfig &ctc_fst_decoder_config,
-      bool enable_endpoint, const std::string &decoding_method,
-      int32_t max_active_paths, const std::string &hotwords_file,
-      float hotwords_score, float blank_penalty)
+      bool enable_endpoint,
+      const std::string &decoding_method,
+      int32_t max_active_paths,
+      const std::string &hotwords_file,
+      float hotwords_score,
+      float blank_penalty,
+      float temperature_scale)
       : feat_config(feat_config),
         model_config(model_config),
         lm_config(lm_config),
@@ -114,9 +121,10 @@ struct OnlineRecognizerConfig {
         enable_endpoint(enable_endpoint),
         decoding_method(decoding_method),
         max_active_paths(max_active_paths),
-        hotwords_score(hotwords_score),
         hotwords_file(hotwords_file),
-        blank_penalty(blank_penalty) {}
+        hotwords_score(hotwords_score),
+        blank_penalty(blank_penalty),
+        temperature_scale(temperature_scale) {}
   void Register(ParseOptions *po);
   bool Validate() const;