
### Supported functions

|Speech recognition| Speech synthesis | Speaker verification | Speaker identification |
|------------------|------------------|----------------------|------------------------|
|   ✔️              |         ✔️        |          ✔️           |                ✔️       |

| Spoken Language identification | Audio tagging | Voice activity detection | Keyword spotting |
|--------------------------------|---------------|--------------------------|------------------|
|                 ✔️              |          ✔️    |                ✔️         |         ✔️        |

### Supported platforms

|Architecture| Android          | iOS           | Windows    | macOS | Linux |
|------------|------------------|---------------|------------|-------|-------|
|   x64      |  ✔️               |               |   ✔️        | ✔️     |  ✔️    |
|   x86      |  ✔️               |               |   ✔️        |       |       |
|   arm64    |  ✔️               | ✔️             |   ✔️        | ✔️     |  ✔️    |
|   arm32    |  ✔️               |               |            |       |  ✔️    |
|   riscv64  |                  |               |            |       |  ✔️    |
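
As a rough illustration of how to read the table above, the following Python sketch encodes it as a lookup and checks whether an architecture/OS pair is listed. It is not part of this repository: the names `SUPPORTED`, `ARCH_ALIASES`, `OS_ALIASES`, and `is_supported` are hypothetical, and the alias mappings are assumptions about what `platform.machine()` and `platform.system()` commonly report.

```python
import platform

# Support matrix transcribed from the "Supported platforms" table above.
SUPPORTED = {
    "x64": {"Android", "Windows", "macOS", "Linux"},
    "x86": {"Android", "Windows"},
    "arm64": {"Android", "iOS", "Windows", "macOS", "Linux"},
    "arm32": {"Android", "Linux"},
    "riscv64": {"Linux"},
}

# Assumed aliases for values that platform.machine() commonly reports.
ARCH_ALIASES = {
    "x86_64": "x64",
    "amd64": "x64",
    "i386": "x86",
    "i686": "x86",
    "aarch64": "arm64",
    "armv7l": "arm32",
    "armv6l": "arm32",
}

# Assumed aliases for OS names; platform.system() reports "Darwin" on macOS.
OS_ALIASES = {
    "darwin": "macOS",
    "linux": "Linux",
    "windows": "Windows",
    "android": "Android",
    "ios": "iOS",
}


def is_supported(machine, system):
    """Return True if the (architecture, OS) pair is listed in the table."""
    arch = ARCH_ALIASES.get(machine.lower(), machine.lower())
    os_name = OS_ALIASES.get(system.lower(), system)
    return os_name in SUPPORTED.get(arch, set())


if __name__ == "__main__":
    machine, system = platform.machine(), platform.system()
    print(f"{machine} / {system}: listed = {is_supported(machine, system)}")
```

The check only reflects the table; whether a given build works on a specific device still depends on the toolchain and the documentation linked below.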


### Supported programming languages

| C++ | C  | Python | C# | Java | JavaScript | Kotlin | Swift | Go | Dart |
|-----|----|--------|----|------|------------|--------|-------|----|------|
| ✔️   | ✔️  | ✔️      | ✔️  |  ✔️   | ✔️          | ✔️      |  ✔️    | ✔️  |  ✔️   |

It also supports WebAssembly.

## Introduction

This repository supports running the following functions **locally**:

  - Speech-to-text (i.e., ASR); both streaming and non-streaming are supported
  - Text-to-speech (i.e., TTS)
  - Speaker identification
  - Speaker verification
  - Spoken language identification
  - Audio tagging
  - VAD (e.g., [silero-vad](https://github.com/snakers4/silero-vad))
  - Keyword spotting

on the following platforms and operating systems:

  - x86, ``x86_64``, 32-bit ARM, 64-bit ARM (arm64, aarch64), RISC-V (riscv64)
  - Linux, macOS, Windows, openKylin
  - Android, WearOS
  - iOS
  - NodeJS
  - WebAssembly
  - [Raspberry Pi](https://www.raspberrypi.com/)
  - [RV1126](https://www.rock-chips.com/uploads/pdf/2022.8.26/191/RV1126%20Brief%20Datasheet.pdf)
  - [LicheePi4A](https://sipeed.com/licheepi4a)
  - [VisionFive 2](https://www.starfivetech.com/en/site/boards)
  - [Horizon Sunrise X3 Pi (旭日X3派)](https://developer.horizon.ai/api/v1/fileData/documents_pi/index.html)
  - etc.

with the following APIs:

  - C++, C, Python, Go, ``C#``
  - Java, Kotlin, JavaScript
  - Swift
  - Dart

### Links for pre-built Android APKs

| Description                            | URL                                                                                     | For users in China                                                                     |
|----------------------------------------|-----------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------|
| Streaming speech recognition           | [Address](https://k2-fsa.github.io/sherpa/onnx/android/apk.html)                        | [Here](https://k2-fsa.github.io/sherpa/onnx/android/apk-cn.html)                        |
| Text-to-speech                         | [Address](https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html)                     | [Here](https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine-cn.html)                     |
| Voice activity detection (VAD)         | [Address](https://k2-fsa.github.io/sherpa/onnx/vad/apk.html)                            | [Here](https://k2-fsa.github.io/sherpa/onnx/vad/apk-cn.html)                            |
| VAD + non-streaming speech recognition | [Address](https://k2-fsa.github.io/sherpa/onnx/vad/apk-asr.html)                        | [Here](https://k2-fsa.github.io/sherpa/onnx/vad/apk-asr-cn.html)                        |
| Two-pass speech recognition            | [Address](https://k2-fsa.github.io/sherpa/onnx/android/apk-2pass.html)                  | [Here](https://k2-fsa.github.io/sherpa/onnx/android/apk-2pass-cn.html)                  |
| Audio tagging                          | [Address](https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk.html)                  | [Here](https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk-cn.html)                  |
| Audio tagging (WearOS)                 | [Address](https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk-wearos.html)           | [Here](https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk-wearos-cn.html)           |
| Speaker identification                 | [Address](https://k2-fsa.github.io/sherpa/onnx/speaker-identification/apk.html)         | [Here](https://k2-fsa.github.io/sherpa/onnx/speaker-identification/apk-cn.html)         |
| Spoken language identification         | [Address](https://k2-fsa.github.io/sherpa/onnx/spoken-language-identification/apk.html) | [Here](https://k2-fsa.github.io/sherpa/onnx/spoken-language-identification/apk-cn.html) |
| Keyword spotting                       | [Address](https://k2-fsa.github.io/sherpa/onnx/kws/apk.html)                            | [Here](https://k2-fsa.github.io/sherpa/onnx/kws/apk-cn.html)                            |

### Links for pre-built Flutter apps

| Description                  | URL                                                                  | For users in China                                                   |
|------------------------------|----------------------------------------------------------------------|----------------------------------------------------------------------|
| Streaming speech recognition | [Address](https://k2-fsa.github.io/sherpa/onnx/flutter/asr/app.html) | [Here](https://k2-fsa.github.io/sherpa/onnx/flutter/asr/app-cn.html) |

### Links for pre-trained models

| Description                    | URL                                                                                                                            |
|--------------------------------|--------------------------------------------------------------------------------------------------------------------------------|
| Speech recognition (speech to text, ASR)             | [Address](https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models)              |
| Text-to-speech (TTS)                 | [Address](https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models)                             |
| VAD | [Address](https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx)|
| Keyword spotting |[Address](https://github.com/k2-fsa/sherpa-onnx/releases/tag/kws-models)|
| Audio tagging                  | [Address](https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models)|
| Speaker identification (Speaker ID)         | [Address](https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models)|
| Spoken language identification (Language ID) | See the multilingual Whisper ASR models under [Speech recognition](https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models) |
| Punctuation| [Address](https://github.com/k2-fsa/sherpa-onnx/releases/tag/punctuation-models)|

### Useful links

- Documentation: https://k2-fsa.github.io/sherpa/onnx/
- Bilibili demo videos: https://search.bilibili.com/all?keyword=%E6%96%B0%E4%B8%80%E4%BB%A3Kaldi

### How to reach us

Please see
https://k2-fsa.github.io/sherpa/social-groups.html
for the Next-gen Kaldi (新一代 Kaldi) **WeChat group** and **QQ group**.