Commit Graph

  • 2ededa7e98 Fix building wasm in CI (#720) Fangjun Kuang 2024-03-31 20:50:56 +08:00
  • 43af1e6951 Release v1.9.15 (#719) Fangjun Kuang 2024-03-29 19:58:04 +08:00
  • 6da4a1c12f Add Go API for speaker identification (#718) Fangjun Kuang 2024-03-29 19:25:55 +08:00
  • 2e0bccad36 Add C API for speaker embedding extractor. (#711) Fangjun Kuang 2024-03-28 18:05:40 +08:00
  • 638f48f47a Added progress for callback of tts generator (#712) Leo Huang 2024-03-28 17:12:20 +08:00
  • de655e838e delete incorrect logs (#714) longshiming 2024-03-28 10:49:45 +08:00
  • 559744ac60 Fix ios-swift to remove invalid references (#713) Fangjun Kuang 2024-03-28 09:39:43 +08:00
  • a042f44076 Add Golang API for spoken language identification. (#709) Fangjun Kuang 2024-03-27 19:40:25 +08:00
  • 12efbf7397 Sign released TTS APKs (#710) Fangjun Kuang 2024-03-27 19:34:37 +08:00
  • 69c7880c4d Add Golang API for VAD (#708) Fangjun Kuang 2024-03-27 12:09:39 +08:00
  • ccb2d435ec add openfst.cmake file (#707) hantengc 2024-03-27 11:31:26 +08:00
  • 4e040c596e Support including TTS conditionally. (#699) Fangjun Kuang 2024-03-26 17:21:35 +08:00
  • bd66f7a7d0 Build Android TTS APKs for coqui-ai/TTS models (#704) Fangjun Kuang 2024-03-26 14:05:26 +08:00
  • d364610605 Use a single thread when loading models (#703) Fangjun Kuang 2024-03-26 13:35:33 +08:00
  • 305c373107 Add C# API for spoken language identification (#697) Fangjun Kuang 2024-03-25 18:45:09 +08:00
  • 83a10a55a5 Add Swift API for spoken language identification. (#696) Fangjun Kuang 2024-03-25 16:22:25 +08:00
  • ab7cff2513 Add C API for spoken language identification. (#695) Fangjun Kuang 2024-03-25 15:16:47 +08:00
  • 0d258dd150 Support spoken language identification with whisper (#694) Fangjun Kuang 2024-03-24 22:57:00 +08:00
  • 3cdad9b5d1 Use manylinux in CI test (#692) Fangjun Kuang 2024-03-24 07:54:32 +08:00
  • e60c897ce7 Update MainActivity.kt (#693) Masoud 2024-03-24 02:59:14 +03:30
  • 1952772654 Add timestamps and tokens for .Net's online models. (#690) Fangjun Kuang 2024-03-23 18:51:56 +08:00
  • e6da2c5556 Fix build c api examples with alsa (#691) Fangjun Kuang 2024-03-23 16:16:24 +08:00
  • eaec4c83c2 Configurable low_freq high_freq, dithering (#664) Karel Vesely 2024-03-22 14:41:44 +01:00
  • 2fc1201924 Add hotwords support to .Net (#689) Fangjun Kuang 2024-03-22 21:40:42 +08:00
  • 24f437a6f1 Refactor github actions tests (#688) Fangjun Kuang 2024-03-22 21:22:42 +08:00
  • 1c77457d61 Update MainActivity.kt (#687) Masoud 2024-03-22 14:34:14 +03:30
  • c8770aec20 Add nuget package for Windows x86 (#683) Fangjun Kuang 2024-03-21 14:57:01 +08:00
  • acf0975153 Support whisper language/task in various language bindings. (#679) Fangjun Kuang 2024-03-20 16:43:35 +08:00
  • 842d04d7ae support whisper language (#678) Viggo 2024-03-20 10:16:22 +08:00
  • 6571fc9552 Add tts play example for .Net. (#676) Fangjun Kuang 2024-03-19 17:33:15 +08:00
  • ce60100f68 Add HotwordsFile and HotwordsScore fields to OnlineRecognizerConfig in C# API (#675) foreversimon 2024-03-19 15:04:08 +08:00
  • fda614d0d1 beam search value as parameter in offline_recognizer.py (#673) Bhaswati Saha 2024-03-18 16:13:05 +05:30
  • 9d6eb3e834 small fixes to wasm kws. (#672) Fangjun Kuang 2024-03-18 15:28:10 +08:00
  • 009ed2cd30 add WebAssembly for Kws (#648) Lovemefan 2024-03-11 21:02:31 +08:00
  • a628002d8f Release v1.9.12 (#661) Fangjun Kuang 2024-03-11 18:52:34 +08:00
  • 44d0ef9ae3 Print the time about the first message in tts. (#655) Fangjun Kuang 2024-03-11 11:05:42 +08:00
  • f43139e803 c++ api for keyword spotter (#642) xinhecuican 2024-03-11 10:23:46 +08:00
  • 1777a5dd88 Use onnxruntime 1.17.1 for iOS. (#654) Fangjun Kuang 2024-03-10 14:26:36 +08:00
  • 3232dff2cf Support user provided data in tts callback. (#653) Fangjun Kuang 2024-03-09 18:15:03 +08:00
  • ac43c2d7b6 Expose 'language' 'task' 'tailPaddings' in OfflineWhisperModelConfig (#643) GaryLaurenceauAva 2024-03-08 12:52:30 +01:00
  • 4b708e055c Add microphone streaming ASR example for C API (#650) Fangjun Kuang 2024-03-08 19:31:46 +08:00
  • d3287f9494 Add Python ASR examples with alsa (#646) Fangjun Kuang 2024-03-08 11:34:48 +08:00
  • e9e8d755d9 Fix detection at the tail when using hotwords in streaming model (#638) Wei Kang 2024-03-08 10:04:33 +08:00
  • f70fdd156c Support using T-head-Semi/csi-nn2 for RISC-V (#637) Fangjun Kuang 2024-03-06 18:21:50 +08:00
  • bdf9243940 Allow to not use pre-installed onnxruntime libs. (#636) Fangjun Kuang 2024-03-06 14:40:23 +08:00
  • 13260cdf49 Use self-compiled onnxruntime shared lib. (#635) Fangjun Kuang 2024-03-06 11:03:24 +08:00
  • 5dc2eaf2b4 Fix building wheels from source. (#632) Fangjun Kuang 2024-03-04 16:39:51 +08:00
  • ed06ced16f Add WebAssembly for NodeJS. (#628) Fangjun Kuang 2024-03-03 20:00:36 +08:00
  • ac6825ff11 Refactor WebAssembly for nodejs (#626) Fangjun Kuang 2024-03-02 12:31:36 +08:00
  • a65643b594 support onnxruntime v1.17.1 (#624) Fangjun Kuang 2024-03-02 11:44:59 +08:00
  • d56964371c Support VITS models from icefall. (#625) Fangjun Kuang 2024-03-01 19:48:38 +08:00
  • 93836ff451 fixed variable's spell num_trailing_blanks (#623) dragon10 2024-03-01 17:02:10 +08:00
  • e2397cd1a4 Support Android NNAPI. (#622) Fangjun Kuang 2024-03-01 16:39:48 +08:00
  • f9db33c926 Add WebAssembly demo for streaming trilingual Paraformer (Chinese+Cantonese+English) (#618) Fangjun Kuang 2024-03-01 15:20:56 +08:00
  • c093880d7c Fix building wheels (#620) Fangjun Kuang 2024-03-01 15:20:06 +08:00
  • 734bbd91dc Add Python API for keyword spotting (#576) Wei Kang 2024-03-01 09:31:11 +08:00
  • 8b7928e7d6 Fix computing features for whisper. (#617) Fangjun Kuang 2024-02-29 16:56:29 +08:00
  • 38c072dcb2 Track token scores (#571) Karel Vesely 2024-02-28 23:28:45 +01:00
  • 85d59b5840 Use hub.nuaa.cf to replace huggingface URL to download dependencies. (#614) Fangjun Kuang 2024-02-28 17:48:51 +08:00
  • 0cb6d1b474 support using xnnpack as execution provider (#612) Fangjun Kuang 2024-02-28 17:32:48 +08:00
  • 87a7030c08 Support using alsa to access the microphone with non-streaming ASR models (#517) Fangjun Kuang 2024-02-26 21:17:26 +08:00
  • fb04366179 Fix #608 (#610) Fangjun Kuang 2024-02-26 13:49:37 +08:00
  • ee37d9bd92 Support RISC-V (#609) Fangjun Kuang 2024-02-26 06:57:18 +08:00
  • 67acd34dcd Use alsa to read microphone in speaker identification demo. (#605) Fangjun Kuang 2024-02-23 19:27:51 +08:00
  • 16ba7e274a Add WebAssembly for ASR (#604) Fangjun Kuang 2024-02-23 17:39:11 +08:00
  • a2df3535b7 Install wasm tts in a separate directory (#600) Fangjun Kuang 2024-02-22 11:30:08 +08:00
  • 7c22398dd8 Publish wasm tts to model scope. (#599) Fangjun Kuang 2024-02-22 09:57:05 +08:00
  • 7c4b59932a Refactor WebAssembly build script. (#598) Fangjun Kuang 2024-02-21 16:51:15 +08:00
  • 25079b5c05 Fix CI tests. (#596) Fangjun Kuang 2024-02-21 15:37:27 +08:00
  • 099a0ccae3 Link the math lib. (#592) Fangjun Kuang 2024-02-21 15:36:54 +08:00
  • 65eff9a6d1 Download ios-onnxruntime from github instead of huggingface. (#593) Fangjun Kuang 2024-02-21 10:51:41 +08:00
  • 763a51486e Add missing start_time to python API (#591) Askars 2024-02-20 14:47:53 +02:00
  • 12e5225401 Fix CI warnings (#590) Fangjun Kuang 2024-02-20 15:28:47 +08:00
  • d2cc48ded5 Add more Chinese TTS models (Mandarin and Cantonese) (#589) Fangjun Kuang 2024-02-20 15:05:35 +08:00
  • 5f075d0fce Support MinSizeRel and RelWithDebInfo build on Windows. (#586) Fangjun Kuang 2024-02-20 10:22:02 +08:00
  • 3d2c7fad74 Increase the right chunk size of streaming paraformer to 3 (#588) Fangjun Kuang 2024-02-20 09:44:40 +08:00
  • c68f39bd3c Use onnxruntime static lib compiled with gcc8 on ubuntu 20.04 (#587) Fangjun Kuang 2024-02-20 09:31:37 +08:00
  • 2ab1fa022d Download android onnxruntime libs from github. (#584) Fangjun Kuang 2024-02-19 10:32:58 +08:00
  • 92a8fd64f0 updated the icon on TTS engine for android (#579) Paolo 2024-02-19 03:25:01 +01:00
  • 64007a6193 Support building debug version on Windows (#583) Fangjun Kuang 2024-02-18 10:39:55 +08:00
  • 81da0fb7a6 Update onnxruntime from 1.16.3 to 1.17.0 (#581) Fangjun Kuang 2024-02-17 12:43:42 +08:00
  • d771762868 Support WebAssembly for text-to-speech (#577) Fangjun Kuang 2024-02-08 23:39:12 +08:00
  • 324a265523 Update README (#572) Fangjun Kuang 2024-02-03 09:20:08 +08:00
  • 665b869f03 Add context biasing for mobile (#568) ductranminh 2024-02-01 20:33:22 +07:00
  • 558f5e3263 Use sequential layout for OfflineTtsConfig in C# (#567) Fangjun Kuang 2024-02-01 16:06:32 +08:00
  • 2e8b321210 Add fine-tuned whisper model on aishell (#565) Fangjun Kuang 2024-01-31 17:23:42 +08:00
  • 0b18ccfbb2 C++ API demo for speaker identification with portaudio. (#561) Fangjun Kuang 2024-01-30 11:21:43 +08:00
  • 0aa47e5ccc Update test.py (#560) 20246688 2024-01-29 17:30:44 +08:00
  • be84932f86 Use curl to replace wget for Windows. (#558) Fangjun Kuang 2024-01-29 10:46:34 +08:00
  • fa2af5dc69 Add TTS demo for C# API (#557) Fangjun Kuang 2024-01-28 23:29:39 +08:00
  • 035a82df33 Add a new Persian tts model (#555) Fangjun Kuang 2024-01-27 20:47:54 +08:00
  • 44efff4e47 Fix CI tests for Python and JNI. (#554) Fangjun Kuang 2024-01-27 13:01:54 +08:00
  • 7ae73e75ba Run TTS engine service without starting the app. (#553) Fangjun Kuang 2024-01-26 22:28:21 +08:00
  • 4fbad6a368 Ensure input for speaker ID is a valid number. (#552) Fangjun Kuang 2024-01-26 20:42:10 +08:00
  • 3f2a17ef47 Fixes issue #535 , fix hexa 1-char tokens in ASR output. (#550) Karel Vesely 2024-01-26 12:23:20 +01:00
  • e7b18a2139 add blank_penalty for online transducer (#548) chiiyeh 2024-01-26 12:12:13 +08:00
  • 466a6855c8 add hotwords docstring to offline_recognizer and online_recognizer (#546) chiiyeh 2024-01-25 16:54:20 +08:00
  • 3bb3849ec5 add blank_penalty for offline transducer (#542) chiiyeh 2024-01-25 15:00:09 +08:00
  • a9e7747736 Fix cmake variables to point to the project root directory. (#545) Fangjun Kuang 2024-01-24 19:21:23 +08:00
  • 2ff1049079 change modelscope link to github for build-kws-apki (#540) Wei Kang 2024-01-24 16:40:14 +08:00