# Introduction
Note: You need `Node >= 18`.
Note: On Apple Silicon Macs (M1 and later), do check the example `test-online-paraformer-microphone-mic.js`.
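To verify the version requirement quickly, you can parse the major version out of `node --version`. The following is a minimal sketch; it operates on a sample version string, so substitute `version=$(node --version)` in practice:

```bash
# A minimal sketch of checking the Node >= 18 requirement.
# In practice, set: version=$(node --version)
version="v18.19.0"        # sample value; replace with the real output
major=${version#v}        # strip the leading "v" -> "18.19.0"
major=${major%%.*}        # keep the major component -> "18"
if [ "$major" -ge 18 ]; then
  echo "Node.js $version satisfies the >= 18 requirement"
else
  echo "Node.js $version is too old; need >= 18" >&2
fi
```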
This directory contains Node.js examples for [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx).
It uses WebAssembly to wrap `sherpa-onnx` for Node.js and does not support multiple threads.

Note: [../nodejs-addon-examples](../nodejs-addon-examples) uses
[node-addon-api](https://github.com/nodejs/node-addon-api) to wrap
`sherpa-onnx` for Node.js and supports multiple threads.
Before you continue, please first run
```bash
cd ./nodejs-examples
npm i
```
In the following, we describe how to use [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx)
for text-to-speech and speech-to-text.
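Most sections below repeat the same three steps: download a model archive, extract it, and delete the archive. If you like, that pattern can be wrapped in a small helper function (a sketch; `fetch_model` is not part of this repository):

```bash
# Hypothetical helper (not part of this repo): download a model archive,
# extract it, and remove the archive afterwards.
fetch_model() {
  url="$1"
  archive="${url##*/}"    # e.g. "kokoro-en-v0_19.tar.bz2" from the full URL
  curl -SL -O "$url"
  tar xf "$archive"
  rm "$archive"
}

# Usage (with a URL taken from the sections below):
# fetch_model https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2
```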
# Speech enhancement
In the following, we demonstrate how to run speech enhancement.
```bash
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav
node ./test-offline-speech-enhancement-gtcrn.js
```
# Speaker diarization
In the following, we demonstrate how to run speaker diarization.
```bash
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2
tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2
rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav
node ./test-offline-speaker-diarization.js
```
# Text-to-speech
In the following, we demonstrate how to run text-to-speech.
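Each TTS example writes its synthesized audio to a `.wav` file (check the script you ran for the exact output name). Once generated, you can play it from the command line; the file name below is a placeholder:

```bash
# Play a generated wave file. "generated-audio.wav" is a placeholder;
# check the TTS script you ran for its actual output file name.
out=generated-audio.wav
if [ ! -f "$out" ]; then
  echo "$out not found; run a TTS example first"
elif command -v afplay >/dev/null 2>&1; then
  afplay "$out"           # macOS
elif command -v aplay >/dev/null 2>&1; then
  aplay "$out"            # Linux (ALSA)
else
  echo "No audio player found; open $out manually"
fi
```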
## ./test-offline-tts-kokoro-en.js
[./test-offline-tts-kokoro-en.js](./test-offline-tts-kokoro-en.js) shows how to use
[kokoro-en-v0_19](https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2)
for text-to-speech.
You can use the following command to run it:
```bash
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2
tar xf kokoro-en-v0_19.tar.bz2
rm kokoro-en-v0_19.tar.bz2
node ./test-offline-tts-kokoro-en.js
```
## ./test-offline-tts-matcha-zh.js
[./test-offline-tts-matcha-zh.js](./test-offline-tts-matcha-zh.js) shows how to use
[matcha-icefall-zh-baker](https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker)
for text-to-speech.
You can use the following command to run it:
```bash
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2
tar xvf matcha-icefall-zh-baker.tar.bz2
rm matcha-icefall-zh-baker.tar.bz2
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx
node ./test-offline-tts-matcha-zh.js
```
## ./test-offline-tts-matcha-en.js
[./test-offline-tts-matcha-en.js](./test-offline-tts-matcha-en.js) shows how to use
[matcha-icefall-en_US-ljspeech](https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker)
for text-to-speech.
You can use the following command to run it:
```bash
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2
tar xf matcha-icefall-en_US-ljspeech.tar.bz2
rm matcha-icefall-en_US-ljspeech.tar.bz2
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx
node ./test-offline-tts-matcha-en.js
```
## ./test-offline-tts-vits-en.js
[./test-offline-tts-vits-en.js](./test-offline-tts-vits-en.js) shows how to use
[vits-piper-en_US-amy-low.tar.bz2](https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2)
for text-to-speech.
You can use the following command to run it:
```bash
wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2
tar xvf vits-piper-en_US-amy-low.tar.bz2
node ./test-offline-tts-vits-en.js
```
## ./test-offline-tts-vits-zh.js
[./test-offline-tts-vits-zh.js](./test-offline-tts-vits-zh.js) shows how to use
a VITS pretrained model
[aishell3](https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/vits.html#vits-model-aishell3)
for text-to-speech.
You can use the following command to run it:
```bash
wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-icefall-zh-aishell3.tar.bz2
tar xvf vits-icefall-zh-aishell3.tar.bz2
node ./test-offline-tts-vits-zh.js
```
# Speech-to-text
In the following, we demonstrate how to decode files and how to perform
real-time speech recognition from a microphone with Node.js.
## ./test-offline-dolphin-ctc.js
[./test-offline-dolphin-ctc.js](./test-offline-dolphin-ctc.js) demonstrates
how to decode a file with a [Dolphin](https://github.com/DataoceanAI/Dolphin) CTC model.
You can use the following command to run it:
```bash
wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2
tar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2
rm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2
node ./test-offline-dolphin-ctc.js
```
## ./test-offline-nemo-ctc.js
[./test-offline-nemo-ctc.js](./test-offline-nemo-ctc.js) demonstrates
how to decode a file with a NeMo CTC model. In the code we use
[stt_en_conformer_ctc_small](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/nemo/english.html#stt-en-conformer-ctc-small).
You can use the following command to run it:
```bash
wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-ctc-en-conformer-small.tar.bz2
tar xvf sherpa-onnx-nemo-ctc-en-conformer-small.tar.bz2
node ./test-offline-nemo-ctc.js
```
## ./test-offline-paraformer.js
[./test-offline-paraformer.js](./test-offline-paraformer.js) demonstrates
how to decode a file with a non-streaming Paraformer model. In the code we use
[sherpa-onnx-paraformer-zh-2023-09-14](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-paraformer-zh-2023-09-14-chinese).
You can use the following command to run it:
```bash
wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2
tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2
node ./test-offline-paraformer.js
```
## ./test-offline-sense-voice-with-hr.js
[./test-offline-sense-voice-with-hr.js](./test-offline-sense-voice-with-hr.js) demonstrates
how to decode a file with a non-streaming SenseVoice model and a homophone replacer.
You can use the following command to run it:
```bash
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2
tar xf dict.tar.bz2
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt
node ./test-offline-sense-voice-with-hr.js
```
## ./test-offline-sense-voice.js
[./test-offline-sense-voice.js](./test-offline-sense-voice.js) demonstrates
how to decode a file with a non-streaming SenseVoice model.
You can use the following command to run it:
```bash
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
node ./test-offline-sense-voice.js
```
## ./test-offline-transducer.js
[./test-offline-transducer.js](./test-offline-transducer.js) demonstrates
how to decode a file with a non-streaming transducer model. In the code we use
[sherpa-onnx-zipformer-en-2023-06-26](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-zipformer-en-2023-06-26-english).
You can use the following command to run it:
```bash
wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-en-2023-06-26.tar.bz2
tar xvf sherpa-onnx-zipformer-en-2023-06-26.tar.bz2
node ./test-offline-transducer.js
```
## ./test-vad-with-non-streaming-asr-whisper.js
[./test-vad-with-non-streaming-asr-whisper.js](./test-vad-with-non-streaming-asr-whisper.js)
shows how to use VAD + Whisper to decode a very long file.
You can use the following command to run it:
```bash
wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2
tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx
node ./test-vad-with-non-streaming-asr-whisper.js
```
## ./test-offline-whisper.js
[./test-offline-whisper.js](./test-offline-whisper.js) demonstrates
how to decode a file with a Whisper model. In the code we use
[sherpa-onnx-whisper-tiny.en](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html).
You can use the following command to run it:
```bash
wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2
tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2
node ./test-offline-whisper.js
```
## ./test-offline-fire-red-asr.js
[./test-offline-fire-red-asr.js](./test-offline-fire-red-asr.js) demonstrates
how to decode a file with a FireRedASR AED model.
You can use the following command to run it:
```bash
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2
tar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2
rm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2
node ./test-offline-fire-red-asr.js
```
## ./test-offline-moonshine.js
[./test-offline-moonshine.js](./test-offline-moonshine.js) demonstrates
how to decode a file with a Moonshine model. In the code we use
[sherpa-onnx-moonshine-tiny-en-int8](https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2).
You can use the following command to run it:
```bash
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
node ./test-offline-moonshine.js
```
## ./test-vad-with-non-streaming-asr-moonshine.js
[./test-vad-with-non-streaming-asr-moonshine.js](./test-vad-with-non-streaming-asr-moonshine.js)
shows how to use VAD + Moonshine to decode a very long file.
You can use the following command to run it:
```bash
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx
node ./test-vad-with-non-streaming-asr-moonshine.js
```
## ./test-online-paraformer-microphone.js
[./test-online-paraformer-microphone.js](./test-online-paraformer-microphone.js)
demonstrates how to do real-time speech recognition from a microphone
with a streaming Paraformer model. In the code we use
[sherpa-onnx-streaming-paraformer-bilingual-zh-en](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-streaming-paraformer-bilingual-zh-en-chinese-english).
You can use the following command to run it:
```bash
wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2
tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2
rm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2
node ./test-online-paraformer-microphone.js
```
## ./test-online-paraformer-microphone-mic.js
[./test-online-paraformer-microphone-mic.js](./test-online-paraformer-microphone-mic.js)
demonstrates how to do real-time speech recognition from a microphone
with a streaming Paraformer model. In the code we use
[sherpa-onnx-streaming-paraformer-bilingual-zh-en](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-streaming-paraformer-bilingual-zh-en-chinese-english).
It uses the [mic](https://www.npmjs.com/package/mic) package for better compatibility; do check its npm page before running it.
You can use the following command to run it:
```bash
wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2
tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2
rm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2
node ./test-online-paraformer-microphone-mic.js
```
## ./test-online-paraformer.js
[./test-online-paraformer.js](./test-online-paraformer.js) demonstrates
how to decode a file using a streaming Paraformer model. In the code we use
[sherpa-onnx-streaming-paraformer-bilingual-zh-en](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-streaming-paraformer-bilingual-zh-en-chinese-english).
You can use the following command to run it:
```bash
wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2
tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2
rm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2
node ./test-online-paraformer.js
```
## ./test-online-transducer-microphone.js
[./test-online-transducer-microphone.js](./test-online-transducer-microphone.js)
demonstrates how to do real-time speech recognition from a microphone using a streaming transducer model. In the code
we use [sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english).
You can use the following command to run it:
```bash
wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2
tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2
node ./test-online-transducer-microphone.js
```
## ./test-online-transducer.js
[./test-online-transducer.js](./test-online-transducer.js) demonstrates
how to decode a file using a streaming transducer model. In the code
we use [sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english).
You can use the following command to run it:
```bash
wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2
tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2
node ./test-online-transducer.js
```
## ./test-online-zipformer2-ctc.js
[./test-online-zipformer2-ctc.js](./test-online-zipformer2-ctc.js) demonstrates
how to decode a file using a streaming zipformer2 CTC model. In the code
we use [sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-ctc/zipformer-ctc-models.html#sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13-chinese).
You can use the following command to run it:
```bash
wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2
tar xvf sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2
node ./test-online-zipformer2-ctc.js
```
## ./test-online-zipformer2-ctc-hlg.js
[./test-online-zipformer2-ctc-hlg.js](./test-online-zipformer2-ctc-hlg.js) demonstrates
how to decode a file using a streaming zipformer2 CTC model with HLG. In the code
we use [sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18](https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2).
You can use the following command to run it:
```bash
wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2
tar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2
node ./test-online-zipformer2-ctc-hlg.js
```