434 Commits

Author SHA1 Message Date
Wei Kang
e9e8d755d9 Fix detection at the tail when using hotwords in streaming model (#638) 2024-03-08 10:04:33 +08:00
Fangjun Kuang
f70fdd156c Support using T-head-Semi/csi-nn2 for RISC-V (#637) 2024-03-06 18:21:50 +08:00
Fangjun Kuang
bdf9243940 Allow to not use pre-installed onnxruntime libs. (#636) 2024-03-06 14:40:23 +08:00
Fangjun Kuang
13260cdf49 Use self-compiled onnxruntime shared lib. (#635) 2024-03-06 11:03:24 +08:00
Fangjun Kuang
5dc2eaf2b4 Fix building wheels from source. (#632) 2024-03-04 16:39:51 +08:00
Fangjun Kuang
ed06ced16f Add WebAssembly for NodeJS. (#628) 2024-03-03 20:00:36 +08:00
Fangjun Kuang
ac6825ff11 Refactor WebAssembly for nodejs (#626) 2024-03-02 12:31:36 +08:00
Fangjun Kuang
a65643b594 support onnxruntime v1.17.1 (#624) 2024-03-02 11:44:59 +08:00
Fangjun Kuang
d56964371c Support VITS models from icefall. (#625) 2024-03-01 19:48:38 +08:00
dragon10
93836ff451 Fixed spelling of variable num_trailing_blanks (#623)
Signed-off-by: lonngxiang <lonngxiang@gmial.com>
Co-authored-by: lonngxiang <lonngxiang@gmial.com>
2024-03-01 17:02:10 +08:00
Fangjun Kuang
e2397cd1a4 Support Android NNAPI. (#622) 2024-03-01 16:39:48 +08:00
Fangjun Kuang
f9db33c926 Add WebAssembly demo for streaming trilingual Paraformer (Chinese+Cantonese+English) (#618) 2024-03-01 15:20:56 +08:00
Fangjun Kuang
c093880d7c Fix building wheels (#620) 2024-03-01 15:20:06 +08:00
Wei Kang
734bbd91dc Add Python API for keyword spotting (#576)
* Add alsa & microphone support for keyword spotting

* Add python wrapper
2024-03-01 09:31:11 +08:00
Fangjun Kuang
8b7928e7d6 Fix computing features for whisper. (#617) 2024-02-29 16:56:29 +08:00
Karel Vesely
38c072dcb2 Track token scores (#571)
* add export of per-token scores (ys, lm, context)

- for best path of the modified-beam-search decoding of transducer

* refactoring JSON export of OnlineRecognitionResult, extending pybind11 API of OnlineRecognitionResult

* export per-token scores also for greedy-search (online-transducer)

- export un-scaled lm_probs (modified-beam search, online-transducer)
- polishing

* fill lm_probs/context_scores only if LM/ContextGraph is present (make Result smaller)
2024-02-29 06:28:45 +08:00
Fangjun Kuang
85d59b5840 Use hub.nuaa.cf to replace huggingface URL to download dependencies. (#614) 2024-02-28 17:48:51 +08:00
Fangjun Kuang
0cb6d1b474 support using xnnpack as execution provider (#612) 2024-02-28 17:32:48 +08:00
Fangjun Kuang
87a7030c08 Support using alsa to access the microphone with non-streaming ASR models (#517) 2024-02-26 21:17:26 +08:00
Fangjun Kuang
fb04366179 Fix #608 (#610)
Fix java tests.
2024-02-26 13:49:37 +08:00
Fangjun Kuang
ee37d9bd92 Support RISC-V (#609) 2024-02-26 06:57:18 +08:00
Fangjun Kuang
67acd34dcd Use alsa to read microphone in speaker identification demo. (#605) 2024-02-23 19:27:51 +08:00
Fangjun Kuang
16ba7e274a Add WebAssembly for ASR (#604) 2024-02-23 17:39:11 +08:00
Fangjun Kuang
a2df3535b7 Install wasm tts in a separate directory (#600) 2024-02-22 11:30:08 +08:00
Fangjun Kuang
7c22398dd8 Publish wasm tts to model scope. (#599) 2024-02-22 09:57:05 +08:00
Fangjun Kuang
7c4b59932a Refactor WebAssembly build script. (#598)
Make it easier to build WebAssembly for ASR.
2024-02-21 16:51:15 +08:00
Fangjun Kuang
25079b5c05 Fix CI tests. (#596) 2024-02-21 15:37:27 +08:00
Fangjun Kuang
099a0ccae3 Link the math lib. (#592) 2024-02-21 15:36:54 +08:00
Fangjun Kuang
65eff9a6d1 Download ios-onnxruntime from github instead of huggingface. (#593) 2024-02-21 10:51:41 +08:00
Askars
763a51486e Add missing start_time to python API (#591)
Co-authored-by: vsd-vector <askars.salimbajevs@tilde.lv>
2024-02-20 20:47:53 +08:00
Fangjun Kuang
12e5225401 Fix CI warnings (#590) 2024-02-20 15:28:47 +08:00
Fangjun Kuang
d2cc48ded5 Add more Chinese TTS models (Mandarin and Cantonese) (#589) 2024-02-20 15:05:35 +08:00
Fangjun Kuang
5f075d0fce Support MinSizeRel and RelWithDebInfo build on Windows. (#586) 2024-02-20 10:22:02 +08:00
Fangjun Kuang
3d2c7fad74 Increase the right chunk size of streaming paraformer to 3 (#588) 2024-02-20 09:44:40 +08:00
Fangjun Kuang
c68f39bd3c Use onnxruntime static lib compiled with gcc8 on ubuntu 20.04 (#587) 2024-02-20 09:31:37 +08:00
Fangjun Kuang
2ab1fa022d Download android onnxruntime libs from github. (#584)
It does not need to use git lfs any longer.
2024-02-19 10:32:58 +08:00
Paolo
92a8fd64f0 updated the icon on TTS engine for android (#579) 2024-02-19 10:25:01 +08:00
Fangjun Kuang
64007a6193 Support building debug version on Windows (#583) 2024-02-18 10:39:55 +08:00
Fangjun Kuang
81da0fb7a6 Update onnxruntime from 1.16.3 to 1.17.0 (#581) 2024-02-17 12:43:42 +08:00
Fangjun Kuang
d771762868 Support WebAssembly for text-to-speech (#577) 2024-02-08 23:39:12 +08:00
Fangjun Kuang
324a265523 Update README (#572) 2024-02-03 09:20:08 +08:00
ductranminh
665b869f03 Add context biasing for mobile (#568) 2024-02-01 21:33:22 +08:00
Fangjun Kuang
558f5e3263 Use sequential layout for OfflineTtsConfig in C# (#567) 2024-02-01 16:06:32 +08:00
Fangjun Kuang
2e8b321210 Add fine-tuned whisper model on aishell (#565)
See also https://github.com/k2-fsa/icefall/pull/1466
2024-01-31 17:23:42 +08:00
Fangjun Kuang
0b18ccfbb2 C++ API demo for speaker identification with portaudio. (#561) 2024-01-30 11:21:43 +08:00
20246688
0aa47e5ccc Update test.py (#560) 2024-01-29 17:30:44 +08:00
Fangjun Kuang
be84932f86 Use curl to replace wget for Windows. (#558)
wget is not available on Windows in GitHub actions
2024-01-29 10:46:34 +08:00
Fangjun Kuang
fa2af5dc69 Add TTS demo for C# API (#557) 2024-01-28 23:29:39 +08:00
Fangjun Kuang
035a82df33 Add a new Persian tts model (#555) 2024-01-27 20:47:54 +08:00
Fangjun Kuang
44efff4e47 Fix CI tests for Python and JNI. (#554) 2024-01-27 13:01:54 +08:00