Fangjun Kuang
ee37d9bd92
Support RISC-V ( #609 )
2024-02-26 06:57:18 +08:00
Fangjun Kuang
67acd34dcd
Use alsa to read microphone in speaker identification demo. ( #605 )
2024-02-23 19:27:51 +08:00
Fangjun Kuang
16ba7e274a
Add WebAssembly for ASR ( #604 )
2024-02-23 17:39:11 +08:00
Fangjun Kuang
a2df3535b7
Install wasm tts in a separate directory ( #600 )
2024-02-22 11:30:08 +08:00
Fangjun Kuang
7c22398dd8
Publish wasm tts to model scope. ( #599 )
2024-02-22 09:57:05 +08:00
Fangjun Kuang
7c4b59932a
Refactor WebAssembly build script. ( #598 )
Make it easier to build WebAssembly for ASR.
2024-02-21 16:51:15 +08:00
Fangjun Kuang
25079b5c05
Fix CI tests. ( #596 )
2024-02-21 15:37:27 +08:00
Fangjun Kuang
099a0ccae3
Link the math lib. ( #592 )
2024-02-21 15:36:54 +08:00
Fangjun Kuang
65eff9a6d1
Download ios-onnxruntime from github instead of huggingface. ( #593 )
2024-02-21 10:51:41 +08:00
Askars
763a51486e
Add missing start_time to python API ( #591 )
Co-authored-by: vsd-vector <askars.salimbajevs@tilde.lv>
2024-02-20 20:47:53 +08:00
Fangjun Kuang
12e5225401
Fix CI warnings ( #590 )
2024-02-20 15:28:47 +08:00
Fangjun Kuang
d2cc48ded5
Add more Chinese TTS models (Mandarin and Cantonese) ( #589 )
2024-02-20 15:05:35 +08:00
Fangjun Kuang
5f075d0fce
Support MinSizeRel and RelWithDebInfo build on Windows. ( #586 )
2024-02-20 10:22:02 +08:00
Fangjun Kuang
3d2c7fad74
Increase the right chunk size of streaming paraformer to 3 ( #588 )
2024-02-20 09:44:40 +08:00
Fangjun Kuang
c68f39bd3c
Use onnxruntime static lib compiled with gcc8 on ubuntu 20.04 ( #587 )
2024-02-20 09:31:37 +08:00
Fangjun Kuang
2ab1fa022d
Download android onnxruntime libs from github. ( #584 )
It does not need to use git lfs any longer.
2024-02-19 10:32:58 +08:00
Paolo
92a8fd64f0
Update the icon of the TTS engine for Android ( #579 )
2024-02-19 10:25:01 +08:00
Fangjun Kuang
64007a6193
Support building debug version on Windows ( #583 )
2024-02-18 10:39:55 +08:00
Fangjun Kuang
81da0fb7a6
Update onnxruntime from 1.16.3 to 1.17.0 ( #581 )
2024-02-17 12:43:42 +08:00
Fangjun Kuang
d771762868
Support WebAssembly for text-to-speech ( #577 )
2024-02-08 23:39:12 +08:00
Fangjun Kuang
324a265523
Update README ( #572 )
2024-02-03 09:20:08 +08:00
ductranminh
665b869f03
Add context biasing for mobile ( #568 )
2024-02-01 21:33:22 +08:00
Fangjun Kuang
558f5e3263
Use sequential layout for OfflineTtsConfig in C# ( #567 )
2024-02-01 16:06:32 +08:00
Fangjun Kuang
2e8b321210
Add fine-tuned whisper model on aishell ( #565 )
See also https://github.com/k2-fsa/icefall/pull/1466
2024-01-31 17:23:42 +08:00
Fangjun Kuang
0b18ccfbb2
C++ API demo for speaker identification with portaudio. ( #561 )
2024-01-30 11:21:43 +08:00
20246688
0aa47e5ccc
Update test.py ( #560 )
2024-01-29 17:30:44 +08:00
Fangjun Kuang
be84932f86
Use curl to replace wget for Windows. ( #558 )
wget is not available on Windows in GitHub Actions.
2024-01-29 10:46:34 +08:00
Fangjun Kuang
fa2af5dc69
Add TTS demo for C# API ( #557 )
2024-01-28 23:29:39 +08:00
Fangjun Kuang
035a82df33
Add a new Persian tts model ( #555 )
2024-01-27 20:47:54 +08:00
Fangjun Kuang
44efff4e47
Fix CI tests for Python and JNI. ( #554 )
2024-01-27 13:01:54 +08:00
Fangjun Kuang
7ae73e75ba
Run TTS engine service without starting the app. ( #553 )
2024-01-26 22:28:21 +08:00
Fangjun Kuang
4fbad6a368
Ensure input for speaker ID is a valid number. ( #552 )
Fix #547
2024-01-26 20:42:10 +08:00
Karel Vesely
3f2a17ef47
Fixes issue #535, fix hex 1-char tokens in ASR output. ( #550 )
- Avoid output like: `[' K', '<0x64>', '<0x79>', 'ť', ' a', '<0x75>',
'to', 'bu', '<0x73>', '<0x75>', ... ]` with regular 500 BPE units.
- Don't rewrite 1-char tokens in range [ 0x20 (space) .. 0x7E (tilde) ]
2024-01-26 19:23:20 +08:00
chiiyeh
e7b18a2139
add blank_penalty for online transducer ( #548 )
2024-01-26 12:12:13 +08:00
chiiyeh
466a6855c8
add hotwords docstring to offline_recognizer and online_recognizer ( #546 )
2024-01-25 16:54:20 +08:00
chiiyeh
3bb3849ec5
add blank_penalty for offline transducer ( #542 )
2024-01-25 15:00:09 +08:00
Fangjun Kuang
a9e7747736
Fix cmake variables to point to the project root directory. ( #545 )
2024-01-24 19:21:23 +08:00
Wei Kang
2ff1049079
change modelscope link to github for build-kws-apki ( #540 )
2024-01-24 16:40:14 +08:00
Fangjun Kuang
bbd7c7fc18
Add Android demo for speaker recognition ( #536 )
See pre-built Android APKs at
https://k2-fsa.github.io/sherpa/onnx/speaker-identification/apk.html
2024-01-23 16:50:52 +08:00
Wei Kang
626775e5e2
Change model url from modelscope to github ( #538 )
2024-01-23 10:15:58 +08:00
Wei Kang
b6c020901a
decoder for open vocabulary keyword spotting ( #505 )
* various fixes to ContextGraph to support open vocabulary keywords decoder
* Add keyword spotter runtime
* Add binary
* First version works
* Minor fixes
* update text2token
* default values
* Add jni for kws
* add kws android project
* Minor fixes
* Remove unused interface
* Minor fixes
* Add workflow
* handle extra info in texts
* Minor fixes
* Add more comments
* Fix ci
* fix cpp style
* Add input box in android demo so that users can specify their keywords
* Fix cpp style
* Fix comments
* Minor fixes
* Minor fixes
* minor fixes
* Minor fixes
* Minor fixes
* Add CI
* Fix code style
* cpplint
* Fix comments
* Fix error
2024-01-20 22:52:41 +08:00
Fangjun Kuang
bf1dd3daf6
Refactor the UI of Android TTS engine ( #533 )
2024-01-17 12:12:50 +08:00
Fangjun Kuang
59e28518b4
Add Python API examples for speaker recognition with VAD and ASR. ( #532 )
2024-01-15 21:40:30 +08:00
Fangjun Kuang
7e0ae677c8
Add a Persian and a Slovenian model from Piper for Android TTS. ( #531 )
2024-01-15 15:00:15 +08:00
Fangjun Kuang
f4e3f45664
Fix setting speaker ID for Android TTS Engine. ( #530 )
2024-01-15 11:46:57 +08:00
Fangjun Kuang
229853b77e
Android TTS APKs for Persian ( #529 )
2024-01-14 21:44:46 +08:00
Fangjun Kuang
2024e96639
Add C++ runtime for speaker verification models from NeMo ( #527 )
2024-01-13 21:42:09 +08:00
Fangjun Kuang
68a525a024
Export speaker verification models from NeMo to ONNX ( #526 )
2024-01-13 19:49:45 +08:00
Fangjun Kuang
afc81ec122
Add C++ runtime for models from 3d-speaker ( #523 )
2024-01-11 19:10:30 +08:00
Fangjun Kuang
ec728ff7f6
Fix publishing nuget packages. ( #525 )
2024-01-11 18:54:23 +08:00