EngineX-Iluvatar/enginex_bi_series-sherpa-onnx

This repository has been archived on 2025-08-26. You can view files and clone it, but cannot push or open issues or pull requests.

Files

Latest commit: 620597f501 by Fangjun Kuang, 2024-10-17 11:58:14 +08:00: Support https://huggingface.co/Revai/reverb-diarization-v1 (#1437)

.gitignore: Export Pyannote speaker segmentation models to onnx (#1382), 2024-09-29 14:23:56 +08:00

export-onnx.py: Support https://huggingface.co/Revai/reverb-diarization-v1 (#1437), 2024-10-17 11:58:14 +08:00

notes.md: Export Pyannote speaker segmentation models to onnx (#1382), 2024-09-29 14:23:56 +08:00

preprocess.sh: Support https://huggingface.co/Revai/reverb-diarization-v1 (#1437), 2024-10-17 11:58:14 +08:00

README.md: C++ API for speaker diarization (#1396), 2024-10-09 12:01:20 +08:00

run-revai.sh: Support https://huggingface.co/Revai/reverb-diarization-v1 (#1437), 2024-10-17 11:58:14 +08:00

run.sh: Export Pyannote speaker segmentation models to onnx (#1382), 2024-09-29 14:23:56 +08:00

show-onnx.py: Export Pyannote speaker segmentation models to onnx (#1382), 2024-09-29 14:23:56 +08:00

speaker-diarization-onnx.py: Speaker diarization example with onnxruntime Python API (#1395), 2024-10-06 16:37:29 +08:00

speaker-diarization-torch.py: Support https://huggingface.co/Revai/reverb-diarization-v1 (#1437), 2024-10-17 11:58:14 +08:00

vad-onnx.py: Support https://huggingface.co/Revai/reverb-diarization-v1 (#1437), 2024-10-17 11:58:14 +08:00

vad-torch.py: Export Pyannote speaker segmentation models to onnx (#1382), 2024-09-29 14:23:56 +08:00

README.md

File description

Please download the test wave files from https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models

0-four-speakers-zh.wav

This file was recorded by @csukuangfj.

1-two-speakers-en.wav

This file is from https://github.com/pengzhendong/pyannote-onnx/blob/master/data/test_16k.wav and contains speech from two speakers.

Note that we have renamed it from test_16k.wav to 1-two-speakers-en.wav.

2-two-speakers-en.wav

This file is from https://huggingface.co/spaces/Xenova/whisper-speaker-diarization

Note that the original file is ./fcf059e3-689f-47ec-a000-bdace87f0113.mp4. We used the following command to convert it to 2-two-speakers-en.wav (mono, 16 kHz):

ffmpeg -i ./fcf059e3-689f-47ec-a000-bdace87f0113.mp4 -ac 1 -ar 16000 ./2-two-speakers-en.wav

3-two-speakers-en.wav

This file is from https://aws.amazon.com/blogs/machine-learning/deploy-a-hugging-face-pyannote-speaker-diarization-model-on-amazon-sagemaker-as-an-asynchronous-endpoint/

Note that the original file is ML16091-Audio.mp3. We used the following command to resample it to 16 kHz and save it as 3-two-speakers-en.wav:

sox ML16091-Audio.mp3 -r 16k 3-two-speakers-en.wav
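After downloading or converting the files, it can be useful to confirm that each one really is a 16 kHz WAV file, which is the rate both conversion commands above target. The following is a minimal sketch using only Python's standard `wave` module. The helper names `describe_wav` and `is_16k_mono` are hypothetical and not part of this repository. The mono check is an assumption based on the `-ac 1` flag in the ffmpeg command; the sox command only sets the sample rate, so the channel count of its output depends on the source file.

```python
import sys
import wave


def describe_wav(path):
    """Return (channels, sample_rate_hz, sample_width_bytes, duration_s) for a PCM WAV file."""
    with wave.open(path, "rb") as f:
        channels = f.getnchannels()
        sample_rate = f.getframerate()
        sample_width = f.getsampwidth()
        duration = f.getnframes() / sample_rate
    return channels, sample_rate, sample_width, duration


def is_16k_mono(path):
    """True if the file is single-channel and sampled at 16 kHz (assumed target format)."""
    channels, sample_rate, _, _ = describe_wav(path)
    return channels == 1 and sample_rate == 16000


if __name__ == "__main__":
    # Pass one or more WAV paths on the command line.
    for path in sys.argv[1:]:
        channels, sample_rate, sample_width, duration = describe_wav(path)
        status = "ok" if is_16k_mono(path) else "not 16 kHz mono"
        print(f"{path}: {channels} ch, {sample_rate} Hz, "
              f"{8 * sample_width}-bit, {duration:.2f} s ({status})")
```

If this is saved as, say, `check_wav.py` (a hypothetical file name), it could be run as `python3 check_wav.py 0-four-speakers-zh.wav 1-two-speakers-en.wav`. The `wave` module only reads uncompressed PCM WAV files, so it raises `wave.Error` on other encodings.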