decoder for open vocabulary keyword spotting (#505)
* various fixes to ContextGraph to support open vocabulary keywords decoder
* Add keyword spotter runtime
* Add binary
* First version works
* Minor fixes
* update text2token
* default values
* Add jni for kws
* add kws android project
* Minor fixes
* Remove unused interface
* Minor fixes
* Add workflow
* handle extra info in texts
* Minor fixes
* Add more comments
* Fix ci
* fix cpp style
* Add input box in android demo so that users can specify their keywords
* Fix cpp style
* Fix comments
* Minor fixes
* Minor fixes
* minor fixes
* Minor fixes
* Minor fixes
* Add CI
* Fix code style
* cpplint
* Fix comments
* Fix error
@@ -26,7 +26,32 @@ namespace sherpa_onnx {
  *         otherwise returns false.
  */
 bool EncodeHotwords(std::istream &is, const SymbolTable &symbol_table,
-                    std::vector<std::vector<int32_t>> *hotwords);
+                    std::vector<std::vector<int32_t>> *hotwords_id);
+
+/* Encode the keywords in an input stream into token ids.
+ *
+ * @param is The input stream. It contains several lines, one keyword per
+ *           line. For each keyword, the tokens (cjkchar or bpe) are separated
+ *           by spaces. A line might also contain a boosting score (starting
+ *           with :), a triggering threshold (starting with #) and a keyword
+ *           string (starting with @).
+ * @param symbol_table The tokens table mapping symbols to ids. All the symbols
+ *                     in the stream should be in the symbol_table; if not, this
+ *                     function returns false.
+ *
+ * @param keywords_id The encoded ids to be written to.
+ * @param keywords The original keyword string to be written to.
+ * @param boost_scores The boosting score for each keyword to be written to.
+ * @param threshold The triggering threshold for each keyword to be written to.
+ *
+ * @return If all the symbols from ``is`` are in the symbol_table, returns true
+ *         otherwise returns false.
+ */
+bool EncodeKeywords(std::istream &is, const SymbolTable &symbol_table,
+                    std::vector<std::vector<int32_t>> *keywords_id,
+                    std::vector<std::string> *keywords,
+                    std::vector<float> *boost_scores,
+                    std::vector<float> *threshold);

 }  // namespace sherpa_onnx
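The doc comment on `EncodeKeywords` describes a line format: space-separated tokens, plus optional fields tagged with `:` (boosting score), `#` (triggering threshold) and `@` (keyword string). The sketch below shows one way a single line of that shape could be split into its parts. It is not the implementation from this commit: `ToyTable`, `ParsedKeyword` and `ParseKeywordLine` are hypothetical names, a plain `std::unordered_map` stands in for `SymbolTable`, and defaulting absent scores to `0` is an assumption.

```cpp
#include <cstdint>
#include <exception>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for sherpa_onnx::SymbolTable: maps a token to its id.
using ToyTable = std::unordered_map<std::string, int32_t>;

// Hypothetical container for the fields found on one keyword line.
struct ParsedKeyword {
  std::vector<int32_t> ids;  // token ids, in order of appearance
  std::string keyword;       // text after '@', empty if absent
  float boost = 0.0f;        // value after ':', 0 if absent (assumed default)
  float threshold = 0.0f;    // value after '#', 0 if absent (assumed default)
};

// Parses one line such as "HE LLO :2.0 #0.25 @HELLO".
// Returns false if a token is missing from the table or a score is not a
// number; in that case *out is left untouched.
bool ParseKeywordLine(const std::string &line, const ToyTable &table,
                      ParsedKeyword *out) {
  std::istringstream iss(line);
  std::string word;
  ParsedKeyword result;

  while (iss >> word) {
    const char tag = word[0];
    if (word.size() > 1 && (tag == ':' || tag == '#')) {
      // Numeric field: boosting score or triggering threshold.
      float value = 0.0f;
      try {
        value = std::stof(word.substr(1));
      } catch (const std::exception &) {
        return false;
      }
      if (tag == ':') {
        result.boost = value;
      } else {
        result.threshold = value;
      }
    } else if (word.size() > 1 && tag == '@') {
      // Original keyword string.
      result.keyword = word.substr(1);
    } else {
      // Ordinary token: it must exist in the table.
      auto it = table.find(word);
      if (it == table.end()) {
        return false;
      }
      result.ids.push_back(it->second);
    }
  }

  *out = std::move(result);
  return true;
}
```

With a table of `{"HE": 1, "LLO": 2}`, the line `HE LLO :2.0 #0.25 @HELLO` would yield ids `[1, 2]`, a boost of `2.0`, a threshold of `0.25` and the keyword `HELLO`, while a line containing an unknown token makes the function return `false`, mirroring the return contract described in the comment.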