C++ API for speaker diarization (#1396)

Fangjun Kuang
2024-10-09 12:01:20 +08:00
committed by GitHub
parent 70165cb42d
commit 59407edcad
39 changed files with 1652 additions and 108 deletions

@@ -3,12 +3,9 @@
Please download test wave files from
https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models
-## 0-two-speakers-zh.wav
+## 0-four-speakers-zh.wav
-This file is from
-https://www.modelscope.cn/models/iic/speech_campplus_speaker-diarization_common/file/view/master?fileName=examples%252F2speakers_example.wav&status=0
-Note that we have renamed it from `2speakers_example.wav` to `0-two-speakers-zh.wav`.
+It is recorded by @csukuangfj
## 1-two-speakers-en.wav
@@ -40,5 +37,5 @@ commands to convert it to `3-two-speakers-en.wav`
```bash
-sox ML16091-Audio.mp3 3-two-speakers-en.wav
+sox ML16091-Audio.mp3 -r 16k 3-two-speakers-en.wav
```
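
The `-r 16k` option in the updated command resamples the output to 16 kHz, which appears to match the `16000` factor used in the export script. As a minimal sketch (not part of this commit), the sample rate of a converted file could be checked with Python's standard `wave` module. The helper names `wav_sample_rate` and `is_16k` are hypothetical, and the 16 kHz expectation is an assumption drawn from the diff rather than a documented requirement.

```python
import wave


def wav_sample_rate(path: str) -> int:
    """Return the sample rate in Hz stored in a PCM WAV file header."""
    with wave.open(path, "rb") as f:
        return f.getframerate()


def is_16k(path: str) -> bool:
    """Return True if the file is sampled at 16 kHz.

    Assumption: 16 kHz is the rate the segmentation model expects.
    """
    return wav_sample_rate(path) == 16000
```

For example, `is_16k("3-two-speakers-en.wav")` would be expected to return `True` after running the updated `sox` command. Note that `wave` reads only uncompressed PCM WAV files.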

@@ -72,7 +72,7 @@ def main():
model.receptive_field.duration * 16000
)
-opset_version = 18
+opset_version = 13
filename = "model.onnx"
torch.onnx.export(