273 Commits

Author SHA1 Message Date
Fangjun Kuang
2086f8c55b Add Go API for Kokoro TTS models (#1722) 2025-01-16 17:35:31 +08:00
Fangjun Kuang
cc812e6237 Add C# API for Kokoro TTS models (#1720) 2025-01-16 16:30:10 +08:00
Fangjun Kuang
9efe26a646 Export kokoro to sherpa-onnx (#1713) 2025-01-15 16:49:10 +08:00
Fangjun Kuang
0d20558b5e Fix passing strings from C# to C. (#1701)
See also
https://github.com/k2-fsa/sherpa-onnx/issues/1695#issuecomment-2585725190

We need to place a 0 at the end of the buffer.
2025-01-13 10:17:04 +08:00
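The fix above relies on C treating a string as a run of bytes ending in a 0 byte. A minimal Python sketch of that idea follows; it is only an illustration, not the C# change from the commit, and the helper name `to_c_buffer` is hypothetical.

```python
import ctypes


def to_c_buffer(text: str) -> bytes:
    """Encode text as UTF-8 and append the terminating 0 byte that C expects."""
    return text.encode("utf-8") + b"\x00"


buf = to_c_buffer("你好 sherpa")

# Copy the bytes into raw memory, roughly as a marshalling layer would.
raw = (ctypes.c_char * len(buf)).from_buffer_copy(buf)

# Without an explicit size, string_at reads up to the first 0 byte,
# which is how a C function receiving a char* finds the end of the string.
decoded = ctypes.string_at(ctypes.addressof(raw)).decode("utf-8")
```

Without the trailing 0, the reader on the C side has no way to know where the string ends and may run past the buffer.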
徐络溟
ecc653871d Fix: export-onnx.py(expected all tensors to be on the same device) (#1699)
SenseVoiceSmall.from_pretrained()
    calls funasr.auto.auto_model.AutoModel.build_model(), whose default device is cuda
    (in environments where CUDA is available):
    ```py
    device = kwargs.get("device", "cuda")
    if not torch.cuda.is_available() or kwargs.get("ngpu", 1) == 0:
        device = "cpu"
        kwargs["batch_size"] = 1
    kwargs["device"] = device
    ```
    The tensors in export-onnx.py are on the cpu by default, which leads to
    RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu
    So cpu is now specified directly when loading the model.
2025-01-10 19:26:36 +08:00
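The device-selection logic quoted in the commit message above can be exercised without a GPU by passing CUDA availability in as a parameter. The sketch below is an illustration only: `resolve_device` and its `cuda_available` argument are hypothetical stand-ins for the funasr code and for `torch.cuda.is_available()`.

```python
def resolve_device(cuda_available: bool, **kwargs) -> dict:
    """Mirror the default-device logic quoted in the commit message."""
    device = kwargs.get("device", "cuda")
    if not cuda_available or kwargs.get("ngpu", 1) == 0:
        device = "cpu"
        kwargs["batch_size"] = 1
    kwargs["device"] = device
    return kwargs


# On a machine with CUDA, the default silently resolves to "cuda" ...
default_cfg = resolve_device(cuda_available=True)

# ... so a script whose tensors live on the CPU has to ask for "cpu" explicitly.
cpu_cfg = resolve_device(cuda_available=True, device="cpu")
```

Here `default_cfg["device"]` is `"cuda"` while `cpu_cfg["device"]` is `"cpu"`, which is why requesting the CPU at load time keeps the model and the export script's tensors on the same device.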
Fangjun Kuang
46330b25cc Add Go API for MatchaTTS models (#1685) 2025-01-06 08:03:03 +08:00
Fangjun Kuang
1fe5fe495f Add Android demo for MatchaTTS models. (#1683) 2025-01-06 06:44:09 +08:00
Fangjun Kuang
3eced3e7ee Add C# and JavaScript (wasm) API for MatchaTTS models (#1682) 2025-01-05 15:08:19 +08:00
Fangjun Kuang
0e299f30f5 Add JavaScript API (node-addon-api) for MatchaTTS models. (#1677) 2025-01-03 15:14:28 +08:00
Fangjun Kuang
49154c957b Add Go API for Keyword spotting (#1662) 2024-12-31 11:25:32 +08:00
Fangjun Kuang
08d771337b Add a byte-level BPE Chinese+English non-streaming zipformer model (#1645) 2024-12-24 16:56:49 +08:00
Fangjun Kuang
fe3265aa25 Add new tts models for Latvian and Persian+English (#1644) 2024-12-24 15:16:02 +08:00
Fangjun Kuang
b76cd9033a Support decoding with byte-level BPE (bbpe) models. (#1633) 2024-12-20 19:21:32 +08:00
Fangjun Kuang
1bae4085ca Add speaker diarization API for HarmonyOS. (#1609) 2024-12-10 16:03:03 +08:00
Fangjun Kuang
84821b1f99 Fix building node-addon package (#1598) 2024-12-06 10:11:18 +08:00
Fangjun Kuang
dc3287f3a8 Add HarmonyOS support for text-to-speech. (#1584) 2024-12-01 21:43:34 +08:00
Fangjun Kuang
c9d3b6cd8c Add microphone demo about VAD+ASR for HarmonyOS (#1581) 2024-11-30 15:23:45 +08:00
Fangjun Kuang
299f2392e2 Add CI to build HAPs for HarmonyOS (#1578) 2024-11-29 21:13:01 +08:00
Fangjun Kuang
315d8e2a47 Publish sherpa_onnx.har for HarmonyOS (#1572) 2024-11-28 17:30:16 +08:00
Fangjun Kuang
c34ab35591 Add Android APK for streaming Paraformer ASR (#1538) 2024-11-14 20:57:35 +08:00
Fangjun Kuang
8436ba834c Add WebAssembly example for VAD + Moonshine models. (#1535) 2024-11-13 21:06:50 +08:00
Fangjun Kuang
a16c9aff8b Add Lazarus example for Moonshine models. (#1532) 2024-11-13 00:04:16 +08:00
Fangjun Kuang
4eeb336f59 Export the English TTS model from MeloTTS (#1509) 2024-11-04 07:54:19 +08:00
Fangjun Kuang
a3c89aa0d8 Add two-pass ASR Android APKs for Moonshine models. (#1499) 2024-10-31 17:54:16 +08:00
Fangjun Kuang
3622104133 Add C# API for Moonshine models. (#1483)
* Also, return timestamps for non-streaming ASR.
2024-10-27 13:14:25 +08:00
Fangjun Kuang
6f261d39f3 Add JavaScript API for Moonshine models (#1480) 2024-10-27 11:31:01 +08:00
Fangjun Kuang
3d3edabb5f Add Go API for Moonshine models (#1479) 2024-10-27 09:39:09 +08:00
Fangjun Kuang
052b8645ba Add Go API examples for adding punctuations to text. (#1478) 2024-10-27 09:04:05 +08:00
Fangjun Kuang
bd4b223920 Add Kotlin and Java API for Moonshine models (#1474) 2024-10-26 22:30:29 +08:00
Fangjun Kuang
b06b460851 Begin to support https://github.com/usefulsensors/moonshine (#1470) 2024-10-26 09:51:16 +08:00
Fangjun Kuang
3d6344ead3 Fix building node-addon for Windows x86. (#1469) 2024-10-25 18:49:33 +08:00
Fangjun Kuang
707cf792c5 Add GigaAM NeMo transducer model for Russian ASR (#1467) 2024-10-25 15:20:13 +08:00
Fangjun Kuang
b41f6d2c94 Support GigaAM CTC models for Russian ASR (#1464)
See also https://github.com/salute-developers/GigaAM
2024-10-25 10:55:16 +08:00
Fangjun Kuang
a5295aad10 Handle NaN embeddings in speaker diarization. (#1461)
See also https://github.com/thewh1teagle/sherpa-rs/issues/33
2024-10-24 14:03:09 +08:00
Fangjun Kuang
b3e05f6dc4 Fix style issues (#1458) 2024-10-24 11:15:08 +08:00
Fangjun Kuang
ceb69ebd94 Add C++ API for non-streaming ASR (#1456) 2024-10-23 16:40:12 +08:00
Fangjun Kuang
effd5ef2be Add C++ API for streaming ASR. (#1455)
It is a wrapper around the C API.
2024-10-23 12:07:43 +08:00
Fangjun Kuang
e0586f1876 add more models for speaker diarization (#1440) 2024-10-17 20:03:09 +08:00
Fangjun Kuang
620597f501 Support https://huggingface.co/Revai/reverb-diarization-v1 (#1437) 2024-10-17 11:58:14 +08:00
Fangjun Kuang
593b96758b Add Go API for offline punctuation models (#1434)
It is contributed by a community user from [our QQ group](https://k2-fsa.github.io/sherpa/social-groups.html#qq).
2024-10-16 17:16:47 +08:00
Fangjun Kuang
df4150dc5d Upload speaker embedding models to huggingface (#1428)
See also
https://huggingface.co/spaces/k2-fsa/speaker-diarization
2024-10-14 16:20:00 +08:00
Fangjun Kuang
5a22f74b2b Android demo for speaker diarization (#1423) 2024-10-13 14:02:57 +08:00
Fangjun Kuang
1ed803adc1 Dart API for speaker diarization (#1418) 2024-10-11 21:17:41 +08:00
Fangjun Kuang
eefc172095 JavaScript API with WebAssembly for speaker diarization (#1414)
#1408 uses [node-addon-api](https://github.com/nodejs/node-addon-api) to call C API from JavaScript, whereas this pull request uses WebAssembly to call C API from JavaScript.
2024-10-11 11:40:10 +08:00
Fangjun Kuang
1d061df355 WebAssembly example for speaker diarization (#1411) 2024-10-10 22:14:45 +08:00
Fangjun Kuang
67349b52f2 JavaScript API (node-addon) for speaker diarization (#1408) 2024-10-10 15:51:31 +08:00
Fangjun Kuang
a45e5dba99 C# API for speaker diarization (#1407) 2024-10-10 14:29:05 +08:00
Fangjun Kuang
df681e9807 Go API for speaker diarization (#1403) 2024-10-09 20:10:44 +08:00
Yongzeng Liu
97654122fa docs(nodejs-addon-examples): add guide for pnpm user (#1401) 2024-10-09 18:12:41 +08:00
Fangjun Kuang
59407edcad C++ API for speaker diarization (#1396) 2024-10-09 12:01:20 +08:00