Add non-streaming speech recognition examples for MFC (#212)

Fangjun Kuang
2023-07-14 17:00:14 +08:00
committed by GitHub
parent bebc1f1398
commit 0abd7ce881
22 changed files with 1153 additions and 63 deletions

@@ -3,11 +3,19 @@
This directory contains examples showing how to use Next-gen Kaldi in MFC
for speech recognition.
Caution: You need to use Windows and install Visual Studio in order to run it.
Caution: You need to use Windows and install Visual Studio 2022 in order to
compile it.
Hint: If you don't want to install Visual Studio, see below for how to
download a pre-compiled `exe`.
We use bash scripts below to demonstrate the steps. Please adapt
the commands for Windows.
## Streaming speech recognition
## How to compile
First, we need to compile sherpa-onnx:
```bash
mkdir -p $HOME/open-source
@@ -19,7 +27,6 @@ mkdir build
cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF -DCMAKE_INSTALL_PREFIX=./install ..
cmake --build . --config Release --target install
cd ../mfc-examples
msbuild ./mfc-examples.sln /property:Configuration=Release /property:Platform=x64
@@ -27,26 +34,13 @@ msbuild ./mfc-examples.sln /property:Configuration=Release /property:Platform=x6
# now run the program
./x64/Release/StreamingSpeechRecognition.exe
./x64/Release/NonStreamingSpeechRecognition.exe
```
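After the build, it can help to confirm that both executables were actually produced before trying to run them. The following is a minimal sketch, not part of sherpa-onnx: `check_built` is a hypothetical helper, and it assumes the output directory is `./x64/Release` as in the commands above.

```bash
# Hypothetical helper (not part of sherpa-onnx): report whether both MFC
# example executables exist in the given build output directory.
# Returns 0 if both are present, 1 if at least one is missing.
check_built() {
  local dir="$1"
  local missing=0
  local exe
  for exe in StreamingSpeechRecognition.exe NonStreamingSpeechRecognition.exe; do
    if [ -f "$dir/$exe" ]; then
      echo "found: $exe"
    else
      echo "missing: $exe"
      missing=1
    fi
  done
  return "$missing"
}

# Example usage, run from the mfc-examples directory:
# check_built ./x64/Release
```

The function only checks that the files exist; it does not verify that they run or that any models are in place.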
Note that we also need to download pre-trained models. Please
refer to https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/index.html
for a list of streaming models.
If you don't want to compile the project yourself, you can download a
pre-compiled `exe` from https://github.com/k2-fsa/sherpa-onnx/releases
We use the following model for demonstration.
For instance, you can use the following links:
```bash
cd $HOME/open-source/sherpa-onnx/mfc-examples/x64/Release
wget https://huggingface.co/pkufool/icefall-asr-zipformer-streaming-wenetspeech-20230615/resolve/main/exp/encoder-epoch-12-avg-4-chunk-16-left-128.onnx
wget https://huggingface.co/pkufool/icefall-asr-zipformer-streaming-wenetspeech-20230615/resolve/main/exp/decoder-epoch-12-avg-4-chunk-16-left-128.onnx
wget https://huggingface.co/pkufool/icefall-asr-zipformer-streaming-wenetspeech-20230615/resolve/main/exp/joiner-epoch-12-avg-4-chunk-16-left-128.onnx
wget https://huggingface.co/pkufool/icefall-asr-zipformer-streaming-wenetspeech-20230615/resolve/main/data/lang_char/tokens.txt
# now rename
mv encoder-epoch-12-avg-4-chunk-16-left-128.onnx encoder.onnx
mv decoder-epoch-12-avg-4-chunk-16-left-128.onnx decoder.onnx
mv joiner-epoch-12-avg-4-chunk-16-left-128.onnx joiner.onnx
# Now run it!
./StreamingSpeechRecognition.exe
```
- https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.5.1/sherpa-onnx-streaming-v1.5.1.exe
- https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.5.1/sherpa-onnx-non-streaming-v1.5.1.exe
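The two links above share one naming pattern, so the download can be scripted. Here is a small sketch: `release_url` is a hypothetical helper, and the assumption that other releases use the same file names as v1.5.1 is not confirmed here, so check the releases page before relying on it.

```bash
# Hypothetical helper: build the download URL of a pre-compiled MFC example.
# Assumption: other releases follow the same naming pattern as v1.5.1.
release_url() {
  local version="$1"   # e.g. 1.5.1
  local kind="$2"      # "streaming" or "non-streaming"
  echo "https://github.com/k2-fsa/sherpa-onnx/releases/download/v${version}/sherpa-onnx-${kind}-v${version}.exe"
}

# Example usage (requires network access):
# wget "$(release_url 1.5.1 streaming)"
# wget "$(release_url 1.5.1 non-streaming)"
```

For version `1.5.1`, the function reproduces the two URLs listed above.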