
## 1.10.36
* Update AAR version in Android Java demo (#1618)
* Support linking onnxruntime statically for Android (#1619)
* Update readme to include Open-LLM-VTuber (#1622)
* Rename maxNumStences to maxNumSentences (#1625)
* Support using onnxruntime 1.16.0 with CUDA 11.4 on Jetson Orin NX (Linux arm64 GPU). (#1630)
* Update readme to include jetson orin nx and nano b01 (#1631)
* feat: add checksum action (#1632)
* Support decoding with byte-level BPE (bbpe) models. (#1633)
* feat: enable c api for android ci (#1635)
* Update README.md (#1640)
* SherpaOnnxVadAsr: Offload runSecondPass to background thread for improved real-time audio processing (#1638)
* Fix GitHub actions. (#1642)
## 1.10.35
* Add missing changes about speaker identification demo for HarmonyOS (#1612)
* Provide sherpa-onnx.aar for Android (#1615)
* Use aar in Android Java demo. (#1616)
## 1.10.34
* Fix building node-addon package (#1598)
* Update doc links for HarmonyOS (#1601)
* Add on-device real-time ASR demo for HarmonyOS (#1606)
* Add speaker identification APIs for HarmonyOS (#1607)
* Add speaker identification demo for HarmonyOS (#1608)
* Add speaker diarization API for HarmonyOS. (#1609)
* Add speaker diarization demo for HarmonyOS (#1610)
## 1.10.33
* Add non-streaming ASR support for HarmonyOS. (#1564)
* Add streaming ASR support for HarmonyOS. (#1565)
* Fix building for Android (#1568)
* Publish `sherpa_onnx.har` for HarmonyOS (#1572)
* Add VAD+ASR demo for HarmonyOS (#1573)
* Fix publishing har packages for HarmonyOS (#1576)
* Add CI to build HAPs for HarmonyOS (#1578)
* Add microphone demo about VAD+ASR for HarmonyOS (#1581)
* Fix getting microphone permission for HarmonyOS VAD+ASR example (#1582)
* Add HarmonyOS support for text-to-speech. (#1584)
* Fix: support both old and new websockets request headers format (#1588)
* Add on-device text-to-speech (TTS) demo for HarmonyOS (#1590)
## 1.10.32
* Support cross-compiling for HarmonyOS (#1553)
* HarmonyOS support for VAD. (#1561)
* Fix publishing flutter iOS app to appstore (#1563).
## 1.10.31
* Publish pre-built wheels for Python 3.13 (#1485)
* Publish pre-built macos xcframework (#1490)
* Fix reading tokens.txt on Windows. (#1497)
* Add two-pass ASR Android APKs for Moonshine models. (#1499)
* Support building GPU-capable sherpa-onnx on Linux aarch64. (#1500)
* Publish pre-built wheels with CUDA support for Linux aarch64. (#1507)
* Export the English TTS model from MeloTTS (#1509)
* Add Lazarus example for Moonshine models. (#1532)
* Add isolate_tts demo (#1529)
* Add WebAssembly example for VAD + Moonshine models. (#1535)
* Add Android APK for streaming Paraformer ASR (#1538)
* Support static build for windows arm64. (#1539)
* Use xcframework for Flutter iOS plugin to support iOS simulators.
## 1.10.30
* Fix building node-addon for Windows x86. (#1469)
* Begin to support https://github.com/usefulsensors/moonshine (#1470)
* Publish pre-built JNI libs for Linux aarch64 (#1472)
* Add C++ runtime and Python APIs for Moonshine models (#1473)
* Add Kotlin and Java API for Moonshine models (#1474)
* Add C and C++ API for Moonshine models (#1476)
* Add Swift API for Moonshine models. (#1477)
* Add Go API examples for adding punctuations to text. (#1478)
* Add Go API for Moonshine models (#1479)
* Add JavaScript API for Moonshine models (#1480)
* Add Dart API for Moonshine models. (#1481)
* Add Pascal API for Moonshine models (#1482)
* Add C# API for Moonshine models. (#1483)
## 1.10.29
* Add Go API for offline punctuation models (#1434)
* Support https://huggingface.co/Revai/reverb-diarization-v1 (#1437)
* Add more models for speaker diarization (#1440)
* Add Java API example for hotwords. (#1442)
* Add java android demo (#1454)
* Add C++ API for streaming ASR. (#1455)
* Add C++ API for non-streaming ASR (#1456)
* Handle NaN embeddings in speaker diarization. (#1461)
* Add speaker identification with VAD and non-streaming ASR using ALSA (#1463)
* Support GigaAM CTC models for Russian ASR (#1464)
* Add GigaAM NeMo transducer model for Russian ASR (#1467)
## 1.10.28
* Fix swift example for generating subtitles. (#1362)
* Allow more online models to load tokens file from the memory (#1352)
* Fix CI errors introduced by supporting loading keywords from buffers (#1366)
* Fix running MeloTTS models on GPU. (#1379)
* Support Parakeet models from NeMo (#1381)
* Export Pyannote speaker segmentation models to onnx (#1382)
* Support Agglomerative clustering. (#1384)
* Add Python API for clustering (#1385)
* Support whisper turbo (#1390)
* Fix context_state not being set correctly when previous context is passed after reset (#1393)
* Speaker diarization example with onnxruntime Python API (#1395)
* C++ API for speaker diarization (#1396)
* Python API for speaker diarization. (#1400)
* C API for speaker diarization (#1402)
* docs(nodejs-addon-examples): add guide for pnpm user (#1401)
* Go API for speaker diarization (#1403)
* Swift API for speaker diarization (#1404)
* Update readme to include more external projects using sherpa-onnx (#1405)
* C# API for speaker diarization (#1407)
* JavaScript API (node-addon) for speaker diarization (#1408)
* WebAssembly example for speaker diarization (#1411)
* Handle audio files less than 10s long for speaker diarization. (#1412)
* JavaScript API with WebAssembly for speaker diarization (#1414)
* Kotlin API for speaker diarization (#1415)
* Java API for speaker diarization (#1416)
* Dart API for speaker diarization (#1418)
* Pascal API for speaker diarization (#1420)
* Android JNI support for speaker diarization (#1421)
* Android demo for speaker diarization (#1423)
## 1.10.27
* Add non-streaming ONNX models for Russian ASR (#1358)
* Fix building Flutter TTS examples for Linux (#1356)
* Support passing utf-8 strings from JavaScript to C++. (#1355)
* Fix sherpa_onnx.go to support returning empty recognition results (#1353)
## 1.10.26
* Add links to projects using sherpa-onnx. (#1345)
* Support lang/emotion/event results from SenseVoice in Swift API. (#1346)
* Support specifying max speech duration for VAD. (#1348)
* Add APIs about max speech duration in VAD for various programming languages (#1349)
## 1.10.25
* Allow tokens and hotwords to be loaded directly from a buffered string (#1339)
* Fix computing features for CED audio tagging models. (#1341)
* Preserve previous result as context for next segment (#1335)
* Add Python binding for online punctuation models (#1312)
* Fix vad.Flush(). (#1329)
* Fix wasm app for streaming paraformer (#1328)
* Build websocket related binaries for embedded systems. (#1327)
* Fixed the C api calls and created the TTS project file (#1324)
* Re-implement LM rescore for online transducer (#1231)
## 1.10.24
* Add VAD and keyword spotting for the Node package with WebAssembly (#1286)
* Fix releasing npm package and fix building Android VAD+ASR example (#1288)
* Add Tokens []string, Timestamps []float32, Lang string, Emotion string, Event string (#1277)
* Add VAD+SenseVoice example for C API (#1291)
* Add VAD+ASR example for Dart with CircularBuffer. (#1293)
* Fix VAD+ASR example for Dart API. (#1294)
* Avoid SherpaOnnxSpeakerEmbeddingManagerFreeBestMatches freeing null. (#1296)
* Fix releasing wasm app for vad+asr (#1300)
* Remove extra files from linux/macos/windows jni libs (#1301)
* Two-pass Android APK for SenseVoice (#1302)
* Downgrade flutter sdk versions. (#1305)
* Reduce onnxruntime log output. (#1306)
* Provide prebuilt .jar files for different java versions. (#1307)
## 1.10.23
* flutter: add lang, emotion, event to OfflineRecognizerResult (#1268)
* Use a separate thread to initialize models for lazarus examples. (#1270)
* Object pascal examples for recording and playing audio with portaudio. (#1271)
* Text to speech API for Object Pascal. (#1273)
* Update Kotlin API to better release native objects and add user-friendly APIs. (#1275)
* Update wave-reader.cc to support 8/16/32-bit waves (#1278)
* Add WebAssembly for VAD (#1281)
* WebAssembly example for VAD + Non-streaming ASR (#1284)
## 1.10.22
* Add Pascal API for reading wave files (#1243)
* Pascal API for streaming ASR (#1246)
* Pascal API for non-streaming ASR (#1247)
* Pascal API for VAD (#1249)
* Add more C API examples (#1255)
* Add emotion, event of SenseVoice. (#1257)
* Support reading multi-channel wave files with 8/16/32-bit encoded samples (#1258)
* Enable IPO only for Release build. (#1261)
* Add Lazarus example for generating subtitles using Silero VAD with non-streaming ASR (#1251)
* Fix looking up OOVs in lexicon.txt for MeloTTS models. (#1266)
## 1.10.21
* Fix ffmpeg c api example (#1185)
* Fix splitting sentences for MeloTTS (#1186)
* Non-streaming WebSocket client for Java. (#1190)
* Fix copying asset files for flutter examples. (#1191)
* Add Chinese+English tts example for flutter (#1192)
* Add speaker identification and verification example for Dart API (#1194)
* Fix reading non-standard wav files. (#1199)
* Add ReazonSpeech Japanese pre-trained model (#1203)
* Describe how to add new words for MeloTTS models (#1209)
* Remove libonnxruntime_providers_cuda.so as a dependency. (#1210)
* Fix setting SenseVoice language. (#1214)
* Support passing TTS callback in Swift API (#1218)
* Add MeloTTS example for ios (#1223)
* Add online punctuation and casing prediction model for English language (#1224)
* Fix python two pass ASR examples (#1230)
* Add blank penalty for various language bindings
## 1.10.20
* Add Dart API for audio tagging
* Add Dart API for adding punctuations to text
## 1.10.19
* Prefix all C API functions with SherpaOnnx
## 1.10.18
* Fix the case when recognition results contain the symbol `"`. It caused
issues when converting results to a json string.
## 1.10.17
* Support SenseVoice CTC models.
* Add Dart API for keyword spotter.
## 1.10.16
* Support zh-en TTS model from MeloTTS.
## 1.10.15
* Downgrade onnxruntime from v1.18.1 to v1.17.1
## 1.10.14
* Support whisper large v3
* Update onnxruntime from v1.18.0 to v1.18.1
* Fix invalid utf8 sequence from Whisper for Dart API.
## 1.10.13
* Update onnxruntime from 1.17.1 to 1.18.0
* Add C# API for Keyword spotting
## 1.10.12
* Add Flush to VAD so that the last speech segment can be detected. See also
https://github.com/k2-fsa/sherpa-onnx/discussions/1077#discussioncomment-9979740
## 1.10.11
* Support the iOS platform for Flutter.
## 1.10.10
* Build sherpa-onnx into a single shared library.
## 1.10.9
* Fix released packages. piper-phonemize was not included in v1.10.8.
## 1.10.8
* Fix released packages. There should be a lib directory.
## 1.10.7
* Support Android for Flutter.
## 1.10.2
* Fix passing C# string to C++
## 1.10.1
* Support stopping TTS generation
## 1.10.0
* Add inverse text normalization
## 1.9.30
* Add TTS
## 1.9.29
* Publish with CI
## 0.0.3
* Fix path separator on Windows.
## 0.0.2
* Support specifying lib path.
## 0.0.1
* Initial release.