Commit Graph

25 Commits

Author SHA1 Message Date
Fangjun Kuang
669f5ef441 Add C++ runtime and Python APIs for Moonshine models (#1473) 2024-10-26 14:34:07 +08:00
Fangjun Kuang
70568c2df7 Support Agglomerative clustering. (#1384)
We use the open-source implementation from
https://github.com/cdalitz/hclust-cpp
2024-09-29 23:44:29 +08:00
Fangjun Kuang
544857b097 Fix building (#1343) 2024-09-13 13:33:52 +08:00
Fangjun Kuang
6b6e7635ed Fix computing features for CED audio tagging models. (#1341)
See also
https://github.com/RicherMans/CED/blob/main/onnx_inference_with_kaldi.py
2024-09-12 19:38:18 +08:00
Robin Zhong
62c4d4ab62 Add emotion, event of SenseVoice. (#1257)
* Add emotion, event of SenseVoice.

* Fix tokens size check and update java api.

https://github.com/k2-fsa/sherpa-onnx/pull/1257
2024-08-14 15:50:13 +08:00
Fangjun Kuang
994c3e7c96 Add VAD + Non-streaming ASR example for JavaScript API. (#1170) 2024-07-26 12:42:08 +08:00
Fangjun Kuang
117cd7bb8c Support whisper large/large-v1/large-v2/large-v3 and distil-large-v2 (#1114) 2024-07-12 23:47:39 +08:00
Fangjun Kuang
a11c859971 Support clang-tidy (#1034) 2024-06-19 20:51:57 +08:00
Fangjun Kuang
1a43d1e37f Support getting word IDs for CTC HLG decoding. (#978) 2024-06-06 14:22:39 +08:00
Fangjun Kuang
fd5a0d1e00 Add C++ runtime for Tele-AI/TeleSpeech-ASR (#970) 2024-06-05 00:26:40 +08:00
Fangjun Kuang
17cd3a5f01 Add C++ runtime for non-streaming faster conformer transducer from NeMo. (#854) 2024-05-10 12:15:39 +08:00
Fangjun Kuang
c1608b3524 Support CED models (#792) 2024-04-19 15:20:37 +08:00
Fangjun Kuang
f20291cadc Support audio tagging using zipformer (#747) 2024-04-10 14:47:06 +08:00
Fangjun Kuang
0be71a31f5 Use high_freq -400 in computing fbank features. (#515)
Fixes #514
2024-01-04 12:39:06 +08:00
Fangjun Kuang
552a267c23 Set is_final and start_time for online websocket server. (#342)
* Set is_final and start_time for online websocket server.

* Convert timestamps to a json array
2023-09-25 15:12:07 +08:00
Fangjun Kuang
43b2b7760d Fix tokens processing for byte-level BPE (#333) 2023-09-22 13:28:19 +08:00
Fangjun Kuang
45b9d4ab37 Support whisper models (#238) 2023-08-07 12:34:18 +08:00
Wei Kang
8562711252 Implement context biasing with a Aho Corasick automata (#145)
* Implement context graph

* Modify the interface to support context biasing

* Support context biasing in modified beam search; add python wrapper

* Support context biasing in python api example

* Minor fixes

* Fix context graph

* Minor fixes

* Fix tests

* Fix style

* Fix style

* Fix comments

* Minor fixes

* Add missing header

* Replace std::shared_ptr with std::unique_ptr for effciency

* Build graph in constructor

* Fix comments

* Minor fixes

* Fix docs
2023-06-16 14:26:36 +08:00
Fangjun Kuang
d7114da441 Minor fixes (#161) 2023-05-23 15:57:33 +08:00
Fangjun Kuang
44821ae2fb Use fixed decimal point for offline timestamp (#158) 2023-05-22 16:52:38 +08:00
cooldoomsday
0bc571f6ee Return timestamp info and tokens in offline ASR
Co-authored-by: zhangbaofeng@npnets.com <41259@Zbf>
2023-05-06 10:20:46 +08:00
Fangjun Kuang
80060c276d Begin to support CTC models (#119)
Please see https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/nemo/index.html for a list of pre-trained CTC models from NeMo.
2023-04-07 23:11:34 +08:00
Fangjun Kuang
5d3c8edbc9 add python tests (#111) 2023-04-02 23:05:30 +08:00
Fangjun Kuang
423d89e9a5 Support paraformer. (#95) 2023-03-28 17:59:54 +08:00
Fangjun Kuang
5572246253 Add non-streaming ASR (#92) 2023-03-26 08:53:42 +08:00