init ascend tts

2025-09-05 10:49:17 +08:00
parent d53ac91bb6
commit c5a6692774
602 changed files with 590901 additions and 1 deletions

@@ -0,0 +1,626 @@
# Changelog
## 202401
- 2024.01.21 [PR#108](https://github.com/RVC-Boss/GPT-SoVITS/pull/108)
- Description: Added English translation support to the WebUI for English-language systems.
- Type: Documentation
- Contributor: D3lik
- 2024.01.21 [Commit#7b89c9ed](https://github.com/RVC-Boss/GPT-SoVITS/commit/7b89c9ed5669f63c4ed6ae791408969640bdcf3e)
- Description: Attempted to fix the ZeroDivisionError raised during SoVITS training.
- Type: Fix
- Contributor: RVC-Boss, Tybost
- Related: [Issue#79](https://github.com/RVC-Boss/GPT-SoVITS/issues/79)
- 2024.01.21 [Commit#ea62d6e0](https://github.com/RVC-Boss/GPT-SoVITS/commit/ea62d6e0cf1efd75287766ea2b55d1c3b69b4fd3)
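- Description: Greatly reduced the problem of synthesized audio containing the end of the reference audio.
- Type: Optimization
- Contributor: RVC-Boss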
- 2024.01.21 [Commit#a87ad522](https://github.com/RVC-Boss/GPT-SoVITS/commit/a87ad5228ed2d729da42019ae1b93171f6a745ef)
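- Description: `cmd-asr.py` now checks whether the model exists in the default directory and, if not, downloads it automatically from ModelScope.
- Type: Feature
- Contributor: RVC-Boss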
- 2024.01.21 [Commit#f6147116](https://github.com/RVC-Boss/GPT-SoVITS/commit/f61471166c107ba56ccb7a5137fa9d7c09b2830d)
- Description: Added the `is_share` parameter to `Config.py`. In scenarios such as Colab, set it to `True` to expose the WebUI to the public network.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.21 [Commit#102d5081](https://github.com/RVC-Boss/GPT-SoVITS/commit/102d50819e5d24580d6e96085b636b25533ecc7f)
- Description: Cleaned up cache, audio, and other files in the TEMP folder.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.22 [Commit#872134c8](https://github.com/RVC-Boss/GPT-SoVITS/commit/872134c846bcb8f1909a3f5aff68a6aa67643f68)
- Description: Fixed very short output files returning a repeat of the reference audio.
- Type: Fix
- Contributor: RVC-Boss
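- 2024.01.22 Testing confirmed native support for English and Japanese training (Japanese training requires a root directory path free of non-English and other special characters).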
- 2024.01.22 [PR#124](https://github.com/RVC-Boss/GPT-SoVITS/pull/124)
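- Description: Added an audio path check. Reading a mistyped input path now reports that the path does not exist instead of raising an FFmpeg error.
- Type: Optimization
- Contributor: xmimu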
- 2024.01.23 [Commit#93c47cd9](https://github.com/RVC-Boss/GPT-SoVITS/commit/93c47cd9f0c53439536eada18879b4ec5a812ae1)
- Description: Fixed the ZeroDivisionError in SoVITS/GPT training caused by NaN values from HuBERT feature extraction.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.23 [Commit#80fffb0a](https://github.com/RVC-Boss/GPT-SoVITS/commit/80fffb0ad46e4e7f27948d5a57c88cf342088d50)
- Description: Chinese word segmentation now uses `jieba_fast` instead of `jieba`.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.23 [Commit#63625758](https://github.com/RVC-Boss/GPT-SoVITS/commit/63625758a99e645f3218dd167924e01a0e3cf0dc)
- Description: Optimized the model file sorting logic.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.23 [Commit#0c691191](https://github.com/RVC-Boss/GPT-SoVITS/commit/0c691191e894c15686e88279745712b3c6dc232f)
- Description: Added quick model switching in the inference WebUI.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.25 [Commit#249561e5](https://github.com/RVC-Boss/GPT-SoVITS/commit/249561e5a18576010df6587c274d38cbd9e18b4b)
- Description: Removed a large amount of redundant logging from the inference WebUI.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.25 [PR#183](https://github.com/RVC-Boss/GPT-SoVITS/pull/183), [PR#200](https://github.com/RVC-Boss/GPT-SoVITS/pull/200)
- Description: Added training and inference support on macOS with MPS.
- Type: Feature
- Contributor: Lion-Wu
- 2024.01.26 [Commit#813cf96e](https://github.com/RVC-Boss/GPT-SoVITS/commit/813cf96e508ba1bb2c658f38c7cc77b797fb4082), [Commit#2d1ddeca](https://github.com/RVC-Boss/GPT-SoVITS/commit/2d1ddeca42db90c3fe2d0cd79480fd544d87f02b)
- Description: Fixed UVR5 exiting automatically when it reads a directory.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.26 [PR#204](https://github.com/RVC-Boss/GPT-SoVITS/pull/204)
- Description: Added support for mixed Chinese-English and Japanese-English output text.
- Type: Feature
- Contributor: Kakaru Hayate
- 2024.01.26 [Commit#f4148cf7](https://github.com/RVC-Boss/GPT-SoVITS/commit/f4148cf77fb899c22bcdd4e773d2f24ab34a73e7)
- Description: Added selectable text-splitting modes for the output.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.26 [Commit#9fe955c1](https://github.com/RVC-Boss/GPT-SoVITS/commit/9fe955c1bf5f94546c9f699141281f2661c8a180)
- Description: Fixed inference errors caused by multiple consecutive line breaks.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.26 [Commit#84ee4719](https://github.com/RVC-Boss/GPT-SoVITS/commit/84ee471936b332bc2ccee024d6dfdedab4f0dc7b)
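- Description: GPUs that do not support half precision are now detected automatically and forced to single precision. CPU inference is also forced to single precision.
- Type: Optimization
- Contributor: RVC-Boss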
- 2024.01.28 [PR#238](https://github.com/RVC-Boss/GPT-SoVITS/pull/238)
- Description: Improved the model download process in the Dockerfile.
- Type: Fix
- Contributor: breakstring
- 2024.01.28 [PR#257](https://github.com/RVC-Boss/GPT-SoVITS/pull/257)
- Description: Fixed the pronunciation of numbers converted to Chinese characters.
- Type: Fix
- Contributor: duliangang
- 2024.01.28 [Commit#f0cfe397](https://github.com/RVC-Boss/GPT-SoVITS/commit/f0cfe397089a6fd507d678c71adeaab5e7ed0683)
- Description: Fixed GPT training not saving weight files.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.28 [Commit#b8ae5a27](https://github.com/RVC-Boss/GPT-SoVITS/commit/b8ae5a2761e2654fc0c905498009d3de9de745a8)
- Description: Excluded reference audio of unreasonable length.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.28 [Commit#698e9655](https://github.com/RVC-Boss/GPT-SoVITS/commit/698e9655132d194b25b86fbbc99d53c8d2cea2a3)
- Description: Fixed the first few characters of a sentence tending to be dropped.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.29 [Commit#ff977a5f](https://github.com/RVC-Boss/GPT-SoVITS/commit/ff977a5f5dc547e0ad82b9e0f1cd95fbc830b2b0)
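- Description: Switched the training configuration to single precision for GPUs, such as the 16 series, that have problems with half-precision training.
- Type: Fix
- Contributor: RVC-Boss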
- 2024.01.29 [Commit#172e139f](https://github.com/RVC-Boss/GPT-SoVITS/commit/172e139f45ac26723bc2cf7fac0112f69d6b46ec)
- Description: Tested and updated a working Colab version.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.29 [PR#135](https://github.com/RVC-Boss/GPT-SoVITS/pull/135)
- Description: Updated FunASR to version 1.0 and fixed errors caused by the mismatched interface.
- Type: Fix
- Contributor: LauraGPT
- 2024.01.30 [Commit#1c2fa98c](https://github.com/RVC-Boss/GPT-SoVITS/commit/1c2fa98ca8c325dcfb32797d22ff1c2a726d1cb4)
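- Description: Fixed issues with splitting on Chinese punctuation and with padding punctuation at the start and end of sentences.
- Type: Fix
- Contributor: RVC-Boss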
- 2024.01.30 [Commit#74409f35](https://github.com/RVC-Boss/GPT-SoVITS/commit/74409f3570fa1c0ff28d4c65c288a6ce58ca00d2)
- Description: Added splitting by punctuation marks.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.30 [Commit#c42eeccf](https://github.com/RVC-Boss/GPT-SoVITS/commit/c42eeccfdd2d0a0d714ecc8bfc22a12373aca6b7)
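- Description: Double quotes are now stripped automatically everywhere a path is handled, fixing errors when a copied path includes double quotes.
- Type: Fix
- Contributor: RVC-Boss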
## 202402
- 2024.02.01 [Commit#45f73519](https://github.com/RVC-Boss/GPT-SoVITS/commit/45f73519cc41cd17cf816d8b997a9dcb0bee04b6)
- Description: Fixed the file name error on save when the ASR path ends with `/`.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.03 [Commit#dba1a74c](https://github.com/RVC-Boss/GPT-SoVITS/commit/dba1a74ccb0cf19a1b4eb93faf11d4ec2b1fc5d7)
- Description: Fixed UVR5 separation failures caused by a format reading error.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.03 [Commit#3ebff70b](https://github.com/RVC-Boss/GPT-SoVITS/commit/3ebff70b71580ee1f97b3238c9442cbc5aef47c7)
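- Description: Added automatic segmentation and language identification for mixed Chinese, Japanese, and English text.
- Type: Optimization
- Contributor: RVC-Boss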
- 2024.02.03 [PR#377](https://github.com/RVC-Boss/GPT-SoVITS/pull/377)
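- Description: Introduced PaddleSpeech text normalization and fixed several issues, for example: handling of xx.xx% (values with a percent sign), 元/吨 being read as 元吨 instead of 元每吨, and underscores no longer causing errors.
- Type: Optimization
- Contributor: KamioRinn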
- 2024.02.05 [PR#395](https://github.com/RVC-Boss/GPT-SoVITS/pull/395)
- Description: Optimized the English text frontend.
- Type: Optimization
- Contributor: KamioRinn
- 2024.02.06 [Commit#](https://github.com/RVC-Boss/GPT-SoVITS/commit/65b463a787f31637b4768cc9a47cab59541d3927)
- Description: Fixed degraded Chinese inference quality caused by mixed-up language parameter passing.
- Type: Fix
- Contributor: RVC-Boss
- Related: [Issue#391](https://github.com/RVC-Boss/GPT-SoVITS/issues/391)
- 2024.02.06 [PR#403](https://github.com/RVC-Boss/GPT-SoVITS/pull/403)
- Description: Adapted UVR5 to newer versions of Librosa.
- Type: Fix
- Contributor: StaryLan
- 2024.02.07 [Commit#14a28510](https://github.com/RVC-Boss/GPT-SoVITS/commit/14a285109a521679f8846589c22da8f656a46ad8)
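- Description: Fixed the UVR5 `inf everywhere` error (the `is_half` parameter was not converted to a boolean, so inference always ran in half precision, which produces `inf` on 16-series GPUs).
- Type: Fix
- Contributor: RVC-Boss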
- 2024.02.07 [Commit#d74f888e](https://github.com/RVC-Boss/GPT-SoVITS/commit/d74f888e7ac86063bfeacef95d0e6ddafe42b3b2)
- Description: Fixed the Gradio dependency.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.07 [PR#400](https://github.com/RVC-Boss/GPT-SoVITS/pull/400)
- Description: Integrated Faster Whisper for Japanese and English speech recognition.
- Type: Feature
- Contributor: Shadow
- 2024.02.07 [Commit#6469048d](https://github.com/RVC-Boss/GPT-SoVITS/commit/6469048de12a8d6f0bd05d07f031309e61575a38)~[Commit#94ee71d9](https://github.com/RVC-Boss/GPT-SoVITS/commit/94ee71d9d562d10c9a1b96e745c6a6575aa66a10)
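- Description: In the three-step ("三连") dataset formatting, leaving the root directory empty now reads the full paths from the `.list` file automatically.
- Type: Optimization
- Contributor: RVC-Boss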
- 2024.02.08 [Commit#59f35ada](https://github.com/RVC-Boss/GPT-SoVITS/commit/59f35adad85815df27e9c6b33d420f5ebfd8376b)
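- Description: Fixed GPT training hanging (Windows 10 1909) and GPT training errors when the system language is Traditional Chinese.
- Type: Fix
- Contributor: RVC-Boss
- Related: [Issue#232](https://github.com/RVC-Boss/GPT-SoVITS/issues/232)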
- 2024.02.12 [PR#457](https://github.com/RVC-Boss/GPT-SoVITS/pull/457)
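- Description: Added an experimental DPO loss training option, which mitigates repeated and dropped characters in GPT output by training on constructed negative samples. Exposed several inference parameters in the inference WebUI.
- Type: Feature
- Contributor: liufenghua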
- 2024.02.12 [Commit#](https://github.com/RVC-Boss/GPT-SoVITS/commit/2fa74ecb941db27d9015583a9be6962898d66730), [Commit#](https://github.com/RVC-Boss/GPT-SoVITS/commit/d82f6bbb98ba725e6725dcee99b80ce71fb0bf28)
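- Description: Optimized parts of the speech recognition logic. Faster Whisper now downloads from a mirror site to avoid HuggingFace connection failures.
- Type: Optimization
- Contributor: RVC-Boss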
- 2024.02.15 [Commit#](https://github.com/RVC-Boss/GPT-SoVITS/commit/dd2c4d6d7121bf82d29d0f0e4d788f3b231997c8)
- Description: Training now supports Chinese experiment names.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.15 [Commit#ccb9b08b](https://github.com/RVC-Boss/GPT-SoVITS/commit/ccb9b08be3c58e102defcc94ff4fd609da9e27ee)~[Commit#895fde46](https://github.com/RVC-Boss/GPT-SoVITS/commit/895fde46e420040ed26aaf0c5b7e99359d9b199b)
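- Description: Made DPO training optional rather than mandatory. When it is enabled, the batch size is halved automatically. Fixed new inference WebUI parameters not being passed.
- Type: Optimization
- Contributor: RVC-Boss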
- 2024.02.15 [Commit#7b0c3c67](https://github.com/RVC-Boss/GPT-SoVITS/commit/7b0c3c676495c64b2064aa472bff14b5c06206a5)
- Description: Fixed errors in the Chinese text frontend.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.16 [PR#499](https://github.com/RVC-Boss/GPT-SoVITS/pull/499)
- Description: Added support for inference without reference text.
- Type: Feature
- Contributor: Watchtower-Liu
- Related: [Issue#475](https://github.com/RVC-Boss/GPT-SoVITS/issues/475)
- 2024.02.17 [PR#509](https://github.com/RVC-Boss/GPT-SoVITS/pull/509), [PR#507](https://github.com/RVC-Boss/GPT-SoVITS/pull/507), [PR#532](https://github.com/RVC-Boss/GPT-SoVITS/pull/532), [PR#556](https://github.com/RVC-Boss/GPT-SoVITS/pull/556), [PR#559](https://github.com/RVC-Boss/GPT-SoVITS/pull/559)
- Description: Optimized Chinese and Japanese frontend processing.
- Type: Optimization
- Contributor: KamioRinn, v3cun
- 2024.02.17 [PR#510](https://github.com/RVC-Boss/GPT-SoVITS/pull/510), [PR#511](https://github.com/RVC-Boss/GPT-SoVITS/pull/511)
- Description: Fixed Colab not enabling the public URL.
- Type: Fix
- Contributor: ChanningWang2018, RVC-Boss
- 2024.02.21 [PR#557](https://github.com/RVC-Boss/GPT-SoVITS/pull/557)
- Description: Switched the macOS inference device from MPS to CPU (CPU inference is faster).
- Type: Optimization
- Contributor: XXXXRT666
- 2024.02.21 [Commit#6da486c1](https://github.com/RVC-Boss/GPT-SoVITS/commit/6da486c15d09e3d99fa42c5e560aaac56b6b4ce1), [Commit#](https://github.com/RVC-Boss/GPT-SoVITS/commit/5a17177342d2df1e11369f2f4f58d34a3feb1a35)
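- Description: Added a speech denoising option to data preprocessing (denoised audio keeps only a 16 kHz sample rate, so avoid using it unless the background noise is heavy).
- Type: Feature
- Contributor: RVC-Boss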
- 2024.02.28 [PR#573](https://github.com/RVC-Boss/GPT-SoVITS/pull/573)
- Description: Changed the `is_half` check so that CPU inference works properly on macOS.
- Type: Fix
- Contributor: XXXXRT666
- 2024.02.28 [PR#610](https://github.com/RVC-Boss/GPT-SoVITS/pull/610)
- Description: Fixed the wrong UVR5 MDXNet parameter order that swapped the output folders.
- Type: Fix
- Contributor: Yuze Wang
## 202403
- 2024.03.06 [PR#675](https://github.com/RVC-Boss/GPT-SoVITS/pull/675)
- Description: Faster Whisper now falls back to CPU inference automatically when CUDA is unavailable.
- Type: Optimization
- Contributor: ShiroDoMain
- 2024.03.06 [Commit#616be20d](https://github.com/RVC-Boss/GPT-SoVITS/commit/616be20db3cf94f1cd663782fea61b2370704193)
- Description: Using Faster Whisper for non-Chinese speech recognition no longer requires downloading the FunASR model first.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.03.09 [PR#672](https://github.com/RVC-Boss/GPT-SoVITS/pull/672)
- Description: Sped up inference by 50% (tested on RTX 3090 + PyTorch 2.2.1 + CUDA 11.8 + Windows 10 + Python 3.9).
- Type: Optimization
- Contributor: GoHomeToMacDonal
- 2024.03.10 [PR#721](https://github.com/RVC-Boss/GPT-SoVITS/pull/721)
- Description: Added the `fast_inference_` fast inference branch.
- Type: Feature
- Contributor: ChasonJiang
- 2024.03.13 [PR#761](https://github.com/RVC-Boss/GPT-SoVITS/pull/761)
- Description: Added CPU training support. Training on macOS uses the CPU.
- Type: Feature
- Contributor: Lion-Wu
- 2024.03.19 [PR#804](https://github.com/RVC-Boss/GPT-SoVITS/pull/804), [PR#812](https://github.com/RVC-Boss/GPT-SoVITS/pull/812), [PR#821](https://github.com/RVC-Boss/GPT-SoVITS/pull/821)
- Description: Optimized the English G2P text frontend.
- Type: Optimization
- Contributor: KamioRinn
- 2024.03.30 [PR#894](https://github.com/RVC-Boss/GPT-SoVITS/pull/894)
- Description: Optimized the API format.
- Type: Optimization
- Contributor: KamioRinn
## 202404
- 2024.04.03 [PR#917](https://github.com/RVC-Boss/GPT-SoVITS/pull/917)
- Description: Fixed the string formatting used when the UVR5 WebUI invokes FFmpeg.
- Type: Fix
- Contributor: StaryLan
## 202405
- 2024.05.02 [PR#953](https://github.com/RVC-Boss/GPT-SoVITS/pull/953)
- Description: Fixed VQ not being frozen during SoVITS training (which could degrade quality).
- Type: Fix
- Contributor: hcwu1993
- Related: [Issue#747](https://github.com/RVC-Boss/GPT-SoVITS/issues/747)
- 2024.05.19 [PR#1102](https://github.com/RVC-Boss/GPT-SoVITS/pull/1102)
- Description: Added a warning for unsupported languages during training data preprocessing.
- Type: Optimization
- Contributor: StaryLan
- 2024.05.27 [PR#1132](https://github.com/RVC-Boss/GPT-SoVITS/pull/1132)
- Description: Fixed an error in the automatic FP32 fallback that runs when HuBERT feature extraction fails with NaN.
- Type: Fix
- Contributor: XXXXRT666
## 202406
- 2024.06.06 [Commit#](https://github.com/RVC-Boss/GPT-SoVITS/commit/99f09c8bdc155c1f4272b511940717705509582a)
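- Description: Fixed the WebUI not reading BERT features during Chinese GPT fine-tuning, which made training inconsistent with inference and could degrade quality after extensive training. If you have already fine-tuned on a large amount of data, re-fine-tuning the model is recommended to improve quality.
- Type: Fix
- Contributor: RVC-Boss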
- 2024.06.07 [PR#1159](https://github.com/RVC-Boss/GPT-SoVITS/pull/1159)
- Description: Fixed the S2 training progress bar logic.
- Type: Fix
- Contributor: pengzhendong
- 2024.06.10 [Commit#501a74ae](https://github.com/RVC-Boss/GPT-SoVITS/commit/501a74ae96789a26b48932babed5eb4e9483a232)
- Description: Fixed the string formatting used when UVR5 MDXNet invokes FFmpeg, so paths containing spaces now work.
- Type: Fix
- Contributor: RVC-Boss
- 2024.06.10 [PR#1168](https://github.com/RVC-Boss/GPT-SoVITS/pull/1168), [PR#1169](https://github.com/RVC-Boss/GPT-SoVITS/pull/1169)
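- Description: Improved the detection logic for text input that consists only of punctuation or contains multiple punctuation marks.
- Type: Fix
- Contributor: XXXXRT666
- Related: [Issue#1165](https://github.com/RVC-Boss/GPT-SoVITS/issues/1165)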
- 2024.06.13 [Commit#db506705](https://github.com/RVC-Boss/GPT-SoVITS/commit/db50670598f0236613eefa6f2d5a23a271d82041)
- Description: Fixed the default batch size being a non-integer value during CPU inference.
- Type: Fix
- Contributor: RVC-Boss
- 2024.06.28 [PR#1258](https://github.com/RVC-Boss/GPT-SoVITS/pull/1258), [PR#1265](https://github.com/RVC-Boss/GPT-SoVITS/pull/1265), [PR#1267](https://github.com/RVC-Boss/GPT-SoVITS/pull/1267)
- Description: Fixed denoising and speech recognition aborting all remaining audio files when an exception occurs.
- Type: Fix
- Contributor: XXXXRT666
- 2024.06.29 [Commit#a208698e](https://github.com/RVC-Boss/GPT-SoVITS/commit/a208698e775155efc95b187b746d153d0f2847ca)
- Description: Fixed the multi-process saving logic for multi-GPU training.
- Type: Fix
- Contributor: RVC-Boss
- 2024.06.29 [PR#1251](https://github.com/RVC-Boss/GPT-SoVITS/pull/1251)
- Description: Removed the redundant `my_utils.py`.
- Type: Optimization
- Contributor: aoguai
- Related: [Issue#1189](https://github.com/RVC-Boss/GPT-SoVITS/issues/1189)
## 202407
- 2024.07.06 [PR#1253](https://github.com/RVC-Boss/GPT-SoVITS/pull/1253)
- Description: Fixed decimal numbers being split when splitting by punctuation.
- Type: Fix
- Contributor: aoguai
- 2024.07.06 [Commit#](https://github.com/RVC-Boss/GPT-SoVITS/commit/b0786f2998f1b2fce6678434524b4e0e8cc716f5)
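- Description: Verified that the accelerated inference code gives the same results as the original, merged it into the `main` branch, and added support for the no-reference-text mode.
- Type: Optimization
- Contributor: RVC-Boss, GoHomeToMacDonal
- Related: [PR#672](https://github.com/RVC-Boss/GPT-SoVITS/pull/672)
- The consistency of the inference changes in the fast inference branch will continue to be verified.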
- 2024.07.13 [PR#1294](https://github.com/RVC-Boss/GPT-SoVITS/pull/1294), [PR#1298](https://github.com/RVC-Boss/GPT-SoVITS/pull/1298)
- Description: Refactored the i18n scan and updated the multilingual configuration files.
- Type: Documentation
- Contributor: StaryLan
- 2024.07.13 [PR#1299](https://github.com/RVC-Boss/GPT-SoVITS/pull/1299)
- Description: Fixed the command-line error raised when a user-entered file or directory path ends with `/`.
- Type: Fix
- Contributor: XXXXRT666
- 2024.07.19 [PR#756](https://github.com/RVC-Boss/GPT-SoVITS/pull/756)
- Description: Fixed the step-count mismatch caused by the custom bucket_sampler in GPT training.
- Type: Fix
- Contributor: huangxu1991
- 2024.07.23 [Commit#9588a3c5](https://github.com/RVC-Boss/GPT-SoVITS/commit/9588a3c52d9ebdb20b3c5d74f647d12e7c1171c2), [PR#1340](https://github.com/RVC-Boss/GPT-SoVITS/pull/1340)
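- Description: Added speech rate adjustment for synthesis, with the option to freeze randomness and adjust only the rate. Also applied the change to `api.py`.
- Type: Feature
- Contributor: RVC-Boss, 红血球AE3803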
- 2024.07.27 [PR#1306](https://github.com/RVC-Boss/GPT-SoVITS/pull/1306), [PR#1356](https://github.com/RVC-Boss/GPT-SoVITS/pull/1356)
- Description: Added support for the BS-Roformer vocal/accompaniment separation model.
- Type: Feature
- Contributor: KamioRinn
- 2024.07.27 [PR#1351](https://github.com/RVC-Boss/GPT-SoVITS/pull/1351)
- Description: Improved the Chinese text frontend.
- Type: Feature
- Contributor: KamioRinn
## 202408 (V2 Version)
- 2024.08.01 [PR#1355](https://github.com/RVC-Boss/GPT-SoVITS/pull/1355)
- Description: Added automatic filling of the file path for the next step.
- Type: Chore
- Contributor: XXXXRT666
- 2024.08.01 [Commit#e62e9653](https://github.com/RVC-Boss/GPT-SoVITS/commit/e62e965323a60a76a025bcaa45268c1ddcbcf05c)
- Description: Added FP16 inference support for BS-Roformer.
- Type: Performance
- Contributor: RVC-Boss
- 2024.08.01 [Commit#bce451a2](https://github.com/RVC-Boss/GPT-SoVITS/commit/bce451a2d1641e581e200297d01f219aeaaf7299), [Commit#4c8b7612](https://github.com/RVC-Boss/GPT-SoVITS/commit/4c8b7612206536b8b4435997acb69b25d93acb78)
- Description: Added more forgiving input handling, so arbitrary user-entered GPU indices still run correctly.
- Type: Chore
- Contributor: RVC-Boss
- 2024.08.02 [Commit#ff6c193f](https://github.com/RVC-Boss/GPT-SoVITS/commit/ff6c193f6fb99d44eea3648d82ebcee895860a22)~[Commit#de7ee7c7](https://github.com/RVC-Boss/GPT-SoVITS/commit/de7ee7c7c15a2ec137feb0693b4ff3db61fad758)
- Description: **Added the GPT-SoVITS V2 model.**
- Type: Feature
- Contributor: RVC-Boss
- 2024.08.03 [Commit#8a101474](https://github.com/RVC-Boss/GPT-SoVITS/commit/8a101474b5a4f913b4c94fca2e3ca87d0771bae3)
- Description: Added Cantonese FunASR support.
- Type: Feature
- Contributor: RVC-Boss
- 2024.08.03 [PR#1387](https://github.com/RVC-Boss/GPT-SoVITS/pull/1387), [PR#1388](https://github.com/RVC-Boss/GPT-SoVITS/pull/1388)
- Description: Improved the UI and the timing logic.
- Type: Chore
- Contributor: XXXXRT666
- 2024.08.06 [PR#1404](https://github.com/RVC-Boss/GPT-SoVITS/pull/1404), [PR#987](https://github.com/RVC-Boss/GPT-SoVITS/pull/987), [PR#488](https://github.com/RVC-Boss/GPT-SoVITS/pull/488)
- Description: Optimized polyphonic character handling (V2 only).
- Type: Fix, Feature
- Contributor: KamioRinn, RVC-Boss
- 2024.08.13 [PR#1422](https://github.com/RVC-Boss/GPT-SoVITS/pull/1422)
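- Description: Fixed the bug that allowed only one file to be uploaded for reference audio mixing. Added dataset checks that show a warning window when files are missing.
- Type: Fix, Chore
- Contributor: XXXXRT666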
- 2024.08.20 [Issue#1508](https://github.com/RVC-Boss/GPT-SoVITS/issues/1508)
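- Description: The upstream LangSegment library now supports SSML tags for improving the handling of numbers, phone numbers, dates, and times.
- Type: Feature
- Contributor: juntaosun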
- 2024.08.20 [PR#1503](https://github.com/RVC-Boss/GPT-SoVITS/pull/1503)
- Description: Fixed and optimized the API.
- Type: Fix
- Contributor: KamioRinn
- 2024.08.20 [PR#1490](https://github.com/RVC-Boss/GPT-SoVITS/pull/1490)
- Description: Merged the fast_inference branch.
- Type: Refactor
- Contributor: ChasonJiang
- 2024.08.21 **Official release of GPT-SoVITS V2.**
## 202502 (V3 Version)
- 2025.02.11 [Commit#ed207c4b](https://github.com/RVC-Boss/GPT-SoVITS/commit/ed207c4b879d5296e9be3ae5f7b876729a2c43b8)~[Commit#6e2b4918](https://github.com/RVC-Boss/GPT-SoVITS/commit/6e2b49186c5b961f0de41ea485d398dffa9787b4)
- Description: **Added the GPT-SoVITS V3 model; fine-tuning requires 14 GB of VRAM.**
- Type: Feature (see the [Wiki](https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v3%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)) for details)
- Contributor: RVC-Boss
- 2025.02.12 [PR#2032](https://github.com/RVC-Boss/GPT-SoVITS/pull/2032)
- Description: Updated the project's multilingual documentation.
- Type: Documentation
- Contributor: StaryLan
- 2025.02.12 [PR#2033](https://github.com/RVC-Boss/GPT-SoVITS/pull/2033)
- Description: Updated the Japanese documentation.
- Type: Documentation
- Contributor: Fyphen
- 2025.02.12 [PR#2010](https://github.com/RVC-Boss/GPT-SoVITS/pull/2010)
- Description: Optimized the attention computation logic.
- Type: Performance
- Contributor: wzy3650
- 2025.02.12 [PR#2040](https://github.com/RVC-Boss/GPT-SoVITS/pull/2040)
- Description: Added gradient checkpointing support for fine-tuning; fine-tuning requires 12 GB of VRAM.
- Type: Feature
- Contributor: Kakaru Hayate
- 2025.02.14 [PR#2047](https://github.com/RVC-Boss/GPT-SoVITS/pull/2047), [PR#2062](https://github.com/RVC-Boss/GPT-SoVITS/pull/2062), [PR#2073](https://github.com/RVC-Boss/GPT-SoVITS/pull/2073)
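- Description: Switched to a new language segmentation tool, improved the splitting strategy for mixed-language text, and improved the handling of numbers and English in text.
- Type: Feature
- Contributor: KamioRinn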
- 2025.02.23 [Commit#56509a17](https://github.com/RVC-Boss/GPT-SoVITS/commit/56509a17c918c8d149c48413a672b8ddf437495b)~[Commit#514fb692](https://github.com/RVC-Boss/GPT-SoVITS/commit/514fb692db056a06ed012bc3a5bca2a5b455703e)
- Description: **The GPT-SoVITS V3 model now supports LoRA training; fine-tuning requires 8 GB of VRAM.**
- Type: Feature
- Contributor: RVC-Boss
- 2025.02.23 [PR#2078](https://github.com/RVC-Boss/GPT-SoVITS/pull/2078)
- Description: Added Mel Band Roformer model support for vocal/background separation.
- Type: Feature
- Contributor: Sucial
- 2025.02.26 [PR#2112](https://github.com/RVC-Boss/GPT-SoVITS/pull/2112), [PR#2114](https://github.com/RVC-Boss/GPT-SoVITS/pull/2114)
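- Description: Fixed the Mecab error under paths containing Chinese characters (seen specifically when segmenting Japanese, Korean, or mixed-language text).
- Type: Fix
- Contributor: KamioRinn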
- 2025.02.27 [Commit#92961c3f](https://github.com/RVC-Boss/GPT-SoVITS/commit/92961c3f68b96009ff2cd00ce614a11b6c4d026f)~[Commit#](https://github.com/RVC-Boss/GPT-SoVITS/commit/250b1c73cba60db18148b21ec5fbce01fd9d19bc)
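- Description: **Added support for a 24 kHz to 48 kHz audio super-resolution model**, which mitigates the muffled sound of audio generated by the V3 model.
- Type: Feature
- Contributor: RVC-Boss
- Related: [Issue#2085](https://github.com/RVC-Boss/GPT-SoVITS/issues/2085), [Issue#2117](https://github.com/RVC-Boss/GPT-SoVITS/issues/2117)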
- 2025.02.28 [PR#2123](https://github.com/RVC-Boss/GPT-SoVITS/pull/2123)
- Description: Updated the project's multilingual documentation.
- Type: Documentation
- Contributor: StaryLan
- 2025.02.28 [PR#2122](https://github.com/RVC-Boss/GPT-SoVITS/pull/2122)
- Description: Short CJK strings whose language the model cannot determine are now classified by rules.
- Type: Fix
- Contributor: KamioRinn
- Related: [Issue#2116](https://github.com/RVC-Boss/GPT-SoVITS/issues/2116)
- 2025.02.28 [Commit#c38b1690](https://github.com/RVC-Boss/GPT-SoVITS/commit/c38b16901978c1db79491e16905ea3a37a7cf686), [Commit#a32a2b89](https://github.com/RVC-Boss/GPT-SoVITS/commit/a32a2b893436fad56cc82409121c7fa36a1815d5)
- Description: Added a speech rate parameter to support adjusting the synthesis speed.
- Type: Fix
- Contributor: RVC-Boss
- 2025.02.28 **Official release of GPT-SoVITS V3**.
## 202503
- 2025.03.31 [PR#2236](https://github.com/RVC-Boss/GPT-SoVITS/pull/2236)
- Description: Fixed a batch of issues caused by incorrect dependency versions.
- Type: Fix
- Contributor: XXXXRT666
- Related
- PyOpenJTalk: [Issue#1131](https://github.com/RVC-Boss/GPT-SoVITS/issues/1131), [Issue#2231](https://github.com/RVC-Boss/GPT-SoVITS/issues/2231), [Issue#2233](https://github.com/RVC-Boss/GPT-SoVITS/issues/2233).
- ONNX: [Issue#492](https://github.com/RVC-Boss/GPT-SoVITS/issues/492), [Issue#671](https://github.com/RVC-Boss/GPT-SoVITS/issues/671), [Issue#1192](https://github.com/RVC-Boss/GPT-SoVITS/issues/1192), [Issue#1819](https://github.com/RVC-Boss/GPT-SoVITS/issues/1819), [Issue#1841](https://github.com/RVC-Boss/GPT-SoVITS/issues/1841).
- Pydantic: [Issue#2230](https://github.com/RVC-Boss/GPT-SoVITS/issues/2230), [Issue#2239](https://github.com/RVC-Boss/GPT-SoVITS/issues/2239).
- PyTorch-Lightning: [Issue#2174](https://github.com/RVC-Boss/GPT-SoVITS/issues/2174).
- 2025.03.31 [PR#2241](https://github.com/RVC-Boss/GPT-SoVITS/pull/2241)
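- Description: **Adapted parallel inference for SoVITS v3**.
- Type: Feature
- Contributor: ChasonJiang
- Fixed several other bugs.
- Fixed onnxruntime GPU inference support in the integrated package.
- Type: Fix
- Description
- The ONNX models in G2PW now run on the GPU instead of the CPU, significantly reducing the CPU bottleneck during inference;
- The foxjoy dereverberation model can now run inference on the GPU.
## 202504 (V4 Version)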
- 2025.04.01 [Commit#6a60e5ed](https://github.com/RVC-Boss/GPT-SoVITS/commit/6a60e5edb1817af4a61c7a5b196c0d0f1407668f)
- Description: Unlocked SoVITS v3 parallel inference and fixed the asynchronous model loading logic.
- Type: Fix
- Contributor: RVC-Boss
- 2025.04.07 [PR#2255](https://github.com/RVC-Boss/GPT-SoVITS/pull/2255)
- Description: Formatted the code with Ruff and updated the G2PW link.
- Type: Style
- Contributor: XXXXRT666
- 2025.04.15 [PR#2290](https://github.com/RVC-Boss/GPT-SoVITS/pull/2290)
- Description: Cleaned up the documentation, added Python 3.11 support, and updated the installation files.
- Type: Chore
- Contributor: XXXXRT666
- 2025.04.20 [PR#2300](https://github.com/RVC-Boss/GPT-SoVITS/pull/2300)
- Description: Updated Colab, the installation files, and model downloads.
- Type: Chore
- Contributor: XXXXRT666
- 2025.04.20 [Commit#e0c452f0](https://github.com/RVC-Boss/GPT-SoVITS/commit/e0c452f0078e8f7eb560b79a54d75573fefa8355)~[Commit#9d481da6](https://github.com/RVC-Boss/GPT-SoVITS/commit/9d481da610aa4b0ef8abf5651fd62800d2b4e8bf)
- Description: **Added the GPT-SoVITS V4 model**.
- Type: Feature
- Contributor: RVC-Boss
- 2025.04.21 [Commit#8b394a15](https://github.com/RVC-Boss/GPT-SoVITS/commit/8b394a15bce8e1d85c0b11172442dbe7a6017ca2)~[Commit#bc2fe5ec](https://github.com/RVC-Boss/GPT-SoVITS/commit/bc2fe5ec86536c77bb3794b4be263ac87e4fdae6), [PR#2307](https://github.com/RVC-Boss/GPT-SoVITS/pull/2307)
- Description: Adapted parallel inference for V4.
- Type: Feature
- Contributor: RVC-Boss, ChasonJiang
- 2025.04.22 [Commit#7405427a](https://github.com/RVC-Boss/GPT-SoVITS/commit/7405427a0ab2a43af63205df401fd6607a408d87)~[Commit#590c83d7](https://github.com/RVC-Boss/GPT-SoVITS/commit/590c83d7667c8d4908f5bdaf2f4c1ba8959d29ff), [PR#2309](https://github.com/RVC-Boss/GPT-SoVITS/pull/2309)
- Description: Fixed model version parameter passing.
- Type: Fix
- Contributor: RVC-Boss, ChasonJiang
- 2025.04.22 [Commit#fbdab94e](https://github.com/RVC-Boss/GPT-SoVITS/commit/fbdab94e17d605d85841af6f94f40a45976dd1d9), [PR#2310](https://github.com/RVC-Boss/GPT-SoVITS/pull/2310)
- Description: Fixed the NumPy and Numba version mismatch and updated the librosa version.
- Type: Fix
- Contributor: RVC-Boss, XXXXRT666
- Related: [Issue#2308](https://github.com/RVC-Boss/GPT-SoVITS/issues/2308)
- **2025.04.22 Official release of GPT-SoVITS V4**.
- 2025.04.22 [PR#2311](https://github.com/RVC-Boss/GPT-SoVITS/pull/2311)
- Description: Updated the Gradio parameters.
- Type: Chore
- Contributor: XXXXRT666
- 2025.04.25 [PR#2322](https://github.com/RVC-Boss/GPT-SoVITS/pull/2322)
- Description: Improved the Colab/Kaggle Notebook scripts.
- Type: Chore
- Contributor: XXXXRT666
## 202505
- 2025.05.26 [PR#2351](https://github.com/RVC-Boss/GPT-SoVITS/pull/2351)
- Description: Improved Docker, the Windows automated build script, and Pre-Commit formatting.
- Type: Chore
- Contributor: XXXXRT666
- 2025.05.26 [PR#2408](https://github.com/RVC-Boss/GPT-SoVITS/pull/2408)
- Description: Optimized the segmentation and language identification logic for mixed-language text.
- Type: Fix
- Contributor: KamioRinn
- Related: [Issue#2404](https://github.com/RVC-Boss/GPT-SoVITS/issues/2404)
- 2025.05.26 [PR#2377](https://github.com/RVC-Boss/GPT-SoVITS/pull/2377)
- Description: Sped up SoVITS V3/V4 inference by 10% through a caching strategy.
- Type: Performance
- Contributor: Kakaru Hayate
- 2025.05.26 [Commit#4d9d56b1](https://github.com/RVC-Boss/GPT-SoVITS/commit/4d9d56b19638dc434d6eefd9545e4d8639a3e072), [Commit#8c705784](https://github.com/RVC-Boss/GPT-SoVITS/commit/8c705784c50bf438c7b6d0be33a9e5e3cb90e6b2), [Commit#fafe4e7f](https://github.com/RVC-Boss/GPT-SoVITS/commit/fafe4e7f120fba56c5f053c6db30aa675d5951ba)
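- Description: Updated the annotation UI with a reminder that `Submit Text` must be clicked after finishing each page, otherwise the edits are not saved.
- Type: Fix
- Contributor: RVC-Boss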
- 2025.05.29 [Commit#1934fc1e](https://github.com/RVC-Boss/GPT-SoVITS/commit/1934fc1e1b22c4c162bba1bbe7d7ebb132944cdc)
- Description: Fixed errors when the UVR5 and ONNX dereverberation models encode MP3 and M4A with FFmpeg and the source path contains spaces.
- Type: Fix
- Contributor: RVC-Boss
## 202506 (V2Pro Series)
- 2025.06.03 [PR#2420](https://github.com/RVC-Boss/GPT-SoVITS/pull/2420)
- Description: Updated the project's multilingual documentation.
- Type: Documentation
- Contributor: StaryLan
- 2025.06.04 [PR#2417](https://github.com/RVC-Boss/GPT-SoVITS/pull/2417)
- Description: Added support for exporting V4 models with TorchScript.
- Type: Feature
- Contributor: L-jasmine
- 2025.06.04 [Commit#b7c0c5ca](https://github.com/RVC-Boss/GPT-SoVITS/commit/b7c0c5ca878bcdd419fd86bf80dba431a6653356)~[Commit#298ebb03](https://github.com/RVC-Boss/GPT-SoVITS/commit/298ebb03c5a719388527ae6a586c7ea960344e70)
- Description: **Added the GPT-SoVITS V2Pro series models**.
- Type: Feature
- Contributor: RVC-Boss
- 2025.06.05 [PR#2426](https://github.com/RVC-Boss/GPT-SoVITS/pull/2426)
- Description: Fixed the `config/inference_webui` initialization error.
- Type: Fix
- Contributor: StaryLan
- 2025.06.05 [PR#2427](https://github.com/RVC-Boss/GPT-SoVITS/pull/2427), [Commit#7d70852a](https://github.com/RVC-Boss/GPT-SoVITS/commit/7d70852a3f67c3b52e3a62857f8663d529efc8cd), [PR#2434](https://github.com/RVC-Boss/GPT-SoVITS/pull/2434)
- Description: Optimized the automatic precision detection logic and added collapsible sections to the WebUI frontend.
- Type: Feature
- Contributor: XXXXRT666, RVC-Boss
- 2025.06.06 [PR#2427](https://github.com/RVC-Boss/GPT-SoVITS/pull/2427)
- Description: Fixed polyphone detection for "X一X" patterns.
- Type: Fix
- Contributor: wzy3650
- 2025.06.05 [PR#2439](https://github.com/RVC-Boss/GPT-SoVITS/pull/2439)
- Description: Fixed configuration handling and SoVITS model loading.
- Type: Fix
- Contributor: wzy3650
- 2025.06.09 [Commit#8056efe4](https://github.com/RVC-Boss/GPT-SoVITS/commit/8056efe4ab7bbc3610c72ae356a6f37518441f7d)
- Description: Fixed `ge.sum` values possibly exploding and causing silent inference output.
- Type: Fix
- Contributor: RVC-Boss
- 2025.06.10 [Commit#2c0436b9](https://github.com/RVC-Boss/GPT-SoVITS/commit/2c0436b9ce397424ae03476c836fb64c6e5ebcc6)
- Description: Fixed incorrect paths on Windows when the experiment name ends with a space.
- Type: Fix
- Contributor: RVC-Boss
- 2025.06.10 [Commit#746cb536](https://github.com/RVC-Boss/GPT-SoVITS/commit/746cb536c68b1fe6ce3ca7e882235375b8a8dd89)
- Description: Optimized language segmentation.
- Type: Optimization
- Contributor: KamioRinn
- 2025.06.11 [Commit#dd2b9253](https://github.com/RVC-Boss/GPT-SoVITS/commit/dd2b9253aabb09db32db7a3344570ed9df043351)
- Description: Fixed a bug in parallel inference support for V2Pro.
- Type: Fix
- Contributor: YYuX-1145
- 2025.06.11 [Commit#ed89a023](https://github.com/RVC-Boss/GPT-SoVITS/commit/ed89a023378dabba9d4b6580235bb9742245816d)
- Description: Fixed numerical overflow when extracting `ge` with V2Pro.
- Type: Fix
- Contributor: RVC-Boss
- 2025.06.11 [Commit#37f5abfc](https://github.com/RVC-Boss/GPT-SoVITS/commit/37f5abfcb4a6553652235909db2e124b6f8ff3a5), [Commit#6fdc67ca](https://github.com/RVC-Boss/GPT-SoVITS/commit/6fdc67ca83418306f11e90b9139278313ac5c3e9)
- Description: Optimized the `install.sh` logic.
- Type: Optimization
- Contributor: XXXXRT666
- 2025.06.27 [Commit#90ebefa7](https://github.com/RVC-Boss/GPT-SoVITS/commit/90ebefa78fd544da36eebe0b2003620879c921b0)
- Description: Optimized the GPU/CPU detection in the onnxruntime loading logic.
- Type: Optimization
- Contributor: KamioRinn
- 2025.06.27 [Commit#6df61f58](https://github.com/RVC-Boss/GPT-SoVITS/commit/6df61f58e4d18d4c2ad9d1eddd6a1bd690034c23)
- Description: Optimized language segmentation and formatting.
- Type: Optimization
- Contributor: KamioRinn
- 2025.07.10 [Commit#426e1a2bb](https://github.com/RVC-Boss/GPT-SoVITS/commit/426e1a2bb43614af2479b877c37acfb0591e952f)
- Description: Raised the inference process priority to fix possibly limited GPU utilization on Windows 11.
- Type: Fix
- Contributor: XianYue0125

@@ -0,0 +1,466 @@
<div align="center">
<h1>GPT-SoVITS-WebUI</h1>
A powerful few-shot voice conversion and text-to-speech WebUI.<br><br>
[![madewithlove](https://img.shields.io/badge/made_with-%E2%9D%A4-red?style=for-the-badge&labelColor=orange)](https://github.com/RVC-Boss/GPT-SoVITS)
<a href="https://trendshift.io/repositories/7033" target="_blank"><img src="https://trendshift.io/api/badge/repositories/7033" alt="RVC-Boss%2FGPT-SoVITS | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
[![Python](https://img.shields.io/badge/python-3.10--3.12-blue?style=for-the-badge&logo=python)](https://www.python.org)
[![GitHub release](https://img.shields.io/github/v/release/RVC-Boss/gpt-sovits?style=for-the-badge&logo=github)](https://github.com/RVC-Boss/gpt-sovits/releases)
[![Train In Colab](https://img.shields.io/badge/Colab-Training-F9AB00?style=for-the-badge&logo=googlecolab)](https://colab.research.google.com/github/RVC-Boss/GPT-SoVITS/blob/main/Colab-WebUI.ipynb)
[![Huggingface](https://img.shields.io/badge/免费在线体验-free_online_demo-yellow.svg?style=for-the-badge&logo=huggingface)](https://lj1995-gpt-sovits-proplus.hf.space/)
[![Image Size](https://img.shields.io/docker/image-size/xxxxrt666/gpt-sovits/latest?style=for-the-badge&logo=docker)](https://hub.docker.com/r/xxxxrt666/gpt-sovits)
[![简体中文](https://img.shields.io/badge/简体中文-阅读文档-blue?style=for-the-badge&logo=googledocs&logoColor=white)](https://www.yuque.com/baicaigongchang1145haoyuangong/ib3g1e)
[![English](https://img.shields.io/badge/English-Read%20Docs-blue?style=for-the-badge&logo=googledocs&logoColor=white)](https://rentry.co/GPT-SoVITS-guide#/)
[![Change Log](https://img.shields.io/badge/Change%20Log-View%20Updates-blue?style=for-the-badge&logo=googledocs&logoColor=white)](https://github.com/RVC-Boss/GPT-SoVITS/blob/main/docs/en/Changelog_EN.md)
[![License](https://img.shields.io/badge/LICENSE-MIT-green.svg?style=for-the-badge&logo=opensourceinitiative)](https://github.com/RVC-Boss/GPT-SoVITS/blob/main/LICENSE)
[**English**](../../README.md) | **中文简体** | [**日本語**](../ja/README.md) | [**한국어**](../ko/README.md) | [**Türkçe**](../tr/README.md)
</div>
---
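## Features
1. **Zero-shot text-to-speech (TTS):** Input a 5-second vocal sample and get text-to-speech conversion right away.
2. **Few-shot TTS:** Fine-tune the model with just 1 minute of training data to improve voice similarity and realism.
3. **Cross-lingual support:** Run inference in a language different from the training dataset. English, Japanese, Korean, Cantonese, and Chinese are currently supported.
4. **WebUI tools:** Integrated tools include vocal/accompaniment separation, automatic training set segmentation, Chinese automatic speech recognition (ASR), and text labeling, which help beginners create training datasets and GPT/SoVITS models.
**Check out our introduction [demo video](https://www.bilibili.com/video/BV12g4y1m7Uw)**
Few-shot fine-tuning demo on unseen speakers: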
<https://github.com/RVC-Boss/GPT-SoVITS/assets/129054828/05bee1fa-bdd8-4d85-9350-80c060ab47fb>
**User guide: [简体中文](https://www.yuque.com/baicaigongchang1145haoyuangong/ib3g1e) | [English](https://rentry.co/GPT-SoVITS-guide#/)**
## Installation
Users in China can [click here](https://www.codewithgpu.com/i/RVC-Boss/GPT-SoVITS/GPT-SoVITS-Official) to try the project with the AutoDL cloud image.
### Tested Environments
| Python Version | PyTorch Version | Device |
| -------------- | ---------------- | ------------- |
| Python 3.10 | PyTorch 2.5.1 | CUDA 12.4 |
| Python 3.11 | PyTorch 2.5.1 | CUDA 12.4 |
| Python 3.11 | PyTorch 2.7.0 | CUDA 12.8 |
| Python 3.9 | PyTorch 2.8.0dev | CUDA 12.8 |
| Python 3.9 | PyTorch 2.5.1 | Apple silicon |
| Python 3.11 | PyTorch 2.7.0 | Apple silicon |
| Python 3.9 | PyTorch 2.2.2 | CPU |
### Windows
If you are a Windows user (tested on Windows 10 and later), you can download the [integrated package](https://huggingface.co/lj1995/GPT-SoVITS-windows-package/resolve/main/GPT-SoVITS-v3lora-20250228.7z?download=true), extract it, and double-click go-webui.bat to start GPT-SoVITS-WebUI.
**Users in China can [download the integrated package here](https://www.yuque.com/baicaigongchang1145haoyuangong/ib3g1e/dkxgpiy9zb96hob4#KTvnO).**
```pwsh
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
pwsh -F install.ps1 --Device <CU126|CU128|CPU> --Source <HF|HF-Mirror|ModelScope> [--DownloadUVR5]
```
### Linux
```bash
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
bash install.sh --device <CU126|CU128|ROCM|CPU> --source <HF|HF-Mirror|ModelScope> [--download-uvr5]
```
### macOS
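**Note: Models trained with GPUs on Macs are significantly lower in quality than those trained on other devices, so for now we train on the CPU instead.**
Run the following commands to install the project: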
```bash
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
bash install.sh --device <MPS|CPU> --source <HF|HF-Mirror|ModelScope> [--download-uvr5]
```
### Manual Installation
#### Install Dependencies
```bash
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
pip install -r extra-req.txt --no-deps
pip install -r requirements.txt
```
#### Install FFmpeg
##### Conda Users
```bash
conda activate GPTSoVits
conda install ffmpeg
```
##### Ubuntu/Debian Users
```bash
sudo apt install ffmpeg
sudo apt install libsox-dev
```
##### Windows Users
Download [ffmpeg.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffmpeg.exe) and [ffprobe.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffprobe.exe) and place them in the GPT-SoVITS root directory.
Install the [Visual Studio 2017](https://aka.ms/vs/17/release/vc_redist.x86.exe) environment (the link downloads the Microsoft Visual C++ Redistributable installer).
##### macOS Users
```bash
brew install ffmpeg
```
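### Running GPT-SoVITS with Docker
#### Docker Image Selection
Because the codebase is updated frequently while Docker images are released less often, please note:
- Check [Docker Hub](https://hub.docker.com/r/xxxxrt666/gpt-sovits) for the latest available image tags.
- Choose the image tag that matches your runtime environment.
- `Lite` Docker images **do not include** the ASR models or the UVR5 models. You can download the UVR5 models yourself; the ASR models are downloaded automatically by the program when they are needed.
- Docker Compose automatically pulls the image that matches your architecture (amd64 or arm64).
- Docker Compose mounts **all files** in the current directory, so switch to the project root directory and **pull the latest code** before using the Docker image.
- Optional: to get the latest changes, you can build the image locally with the provided Dockerfile.
#### Environment Variables
- `is_half`: Controls whether half precision (fp16) is enabled. If your GPU supports it, setting this to `true` reduces GPU memory usage.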
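Environment variables reach a program as strings, so a value such as `false` needs explicit conversion before it can switch half precision off. The following is a minimal sketch of that conversion, not the project's actual code: the helper names `parse_bool` and `read_is_half`, the accepted spellings, and the fallback default are all assumptions made for illustration.

```python
import os

# Spellings accepted by this sketch (an assumption, not the project's list).
_TRUE_VALUES = {"1", "true", "yes", "on"}
_FALSE_VALUES = {"0", "false", "no", "off"}


def parse_bool(value: str) -> bool:
    """Convert a textual flag such as "true" or "False" into a real boolean."""
    normalized = value.strip().lower()
    if normalized in _TRUE_VALUES:
        return True
    if normalized in _FALSE_VALUES:
        return False
    raise ValueError(f"cannot interpret {value!r} as a boolean")


def read_is_half(environ=os.environ, default: bool = False) -> bool:
    """Read the `is_half` variable; fall back to `default` when it is unset."""
    raw = environ.get("is_half")
    if raw is None:
        return default  # hypothetical default chosen for this sketch
    return parse_bool(raw)
```

In Python, `bool("false")` evaluates to `True` because any non-empty string is truthy, which is why the sketch compares against explicit spellings instead of calling `bool` on the raw value.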
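#### Shared Memory Configuration
On Windows (Docker Desktop), the default shared memory size is small and may cause runtime errors. Increase `shm_size` in the Docker Compose file (for example, to `16g`) according to your available system memory.
#### Choosing a Service
The `docker-compose.yaml` file defines two main types of services:
- `GPT-SoVITS-CU126` and `GPT-SoVITS-CU128`: full version with all features.
- `GPT-SoVITS-CU126-Lite` and `GPT-SoVITS-CU128-Lite`: lightweight version with fewer dependencies and slightly reduced functionality.
To run a specific service with Docker Compose, use: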
```bash
docker compose run --service-ports <GPT-SoVITS-CU126-Lite|GPT-SoVITS-CU128-Lite|GPT-SoVITS-CU126|GPT-SoVITS-CU128>
```
#### Building the Docker Image Locally
If you want to build the image yourself, use the following command:
```bash
bash docker_build.sh --cuda <12.6|12.8> [--lite]
```
#### Accessing the Running Container (Bash Shell)
Once the container is running in the background, you can enter it with:
```bash
docker exec -it <GPT-SoVITS-CU126-Lite|GPT-SoVITS-CU128-Lite|GPT-SoVITS-CU126|GPT-SoVITS-CU128> bash
```
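## Pretrained Models
**If `install.sh` ran successfully, you may skip No. 1, 2, and 3.**
**Users in China can [download all these models here](https://www.yuque.com/baicaigongchang1145haoyuangong/ib3g1e/dkxgpiy9zb96hob4#nVNhX).**
1. Download the pretrained models from [GPT-SoVITS Models](https://huggingface.co/lj1995/GPT-SoVITS) and place them in the `GPT_SoVITS/pretrained_models` directory.
2. Download the model from [G2PWModel.zip(HF)](https://huggingface.co/XXXXRT/GPT-SoVITS-Pretrained/resolve/main/G2PWModel.zip)| [G2PWModel.zip(ModelScope)](https://www.modelscope.cn/models/XXXXRT/GPT-SoVITS-Pretrained/resolve/master/G2PWModel.zip), unzip it and rename the folder to `G2PWModel`, then place it in the `GPT_SoVITS/text` directory. (Chinese TTS only)
3. For UVR5 (vocal/accompaniment separation and reverberation removal, an additional feature), download the models from [UVR5 Weights](https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main/uvr5_weights) and place them in the `tools/uvr5/uvr5_weights` directory.
- If you use `bs_roformer` or `mel_band_roformer` models with UVR5, you can manually download the model and its configuration file and put them in `tools/uvr5/uvr5_weights`. **Rename the model and configuration files so that, apart from the suffix, they share the same corresponding name.** In addition, the model and configuration file names **must contain "roformer"** to be recognized as roformer-class models.
- It is recommended to **specify the model type directly** in the model and configuration file names, such as `mel_band_roformer` or `bs_roformer`. If the type is not specified, features from the configuration file are compared to determine which type of model it is. For example, the model `bs_roformer_ep_368_sdr_12.9628.ckpt` and its configuration file `bs_roformer_ep_368_sdr_12.9628.yaml` are a pair, and `kim_mel_band_roformer.ckpt` and `kim_mel_band_roformer.yaml` are also a pair.
4. For Chinese ASR (an additional feature), download the models from [Damo ASR Model](https://modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/files), [Damo VAD Model](https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/files), and [Damo Punc Model](https://modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/files) and place them in the `tools/asr/models` directory.
5. For English or Japanese ASR (an additional feature), download the model from [Faster Whisper Large V3](https://huggingface.co/Systran/faster-whisper-large-v3) and place it in the `tools/asr/models` directory. [Other models](https://huggingface.co/Systran) may give similar results while using less disk space.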
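The naming rules for roformer models in step 3 can be checked before starting UVR5. The sketch below is one possible way to do that and is not part of the project: the function name `find_roformer_pairs` is hypothetical, and it assumes `.ckpt` model files, `.yaml` configuration files, and a case-insensitive match on "roformer". When the type is not spelled out in the name it reports `"unknown"` rather than inspecting the configuration file.

```python
from pathlib import Path


def find_roformer_pairs(weights_dir: str) -> dict:
    """Return {shared file stem: model type} for each complete model/config pair."""
    root = Path(weights_dir)
    pairs = {}
    for model_file in sorted(root.glob("*.ckpt")):
        stem = model_file.stem  # file name without the final ".ckpt"
        name = stem.lower()
        if "roformer" not in name:
            continue  # the name must contain "roformer" to be recognized
        if not (root / f"{stem}.yaml").is_file():
            continue  # the configuration file must share the stem exactly
        if "mel_band_roformer" in name:
            pairs[stem] = "mel_band_roformer"
        elif "bs_roformer" in name:
            pairs[stem] = "bs_roformer"
        else:
            pairs[stem] = "unknown"  # type not stated in the file name
    return pairs
```

Calling `find_roformer_pairs("tools/uvr5/uvr5_weights")` lists the pairs that satisfy the rules; a model missing from the result either lacks "roformer" in its name or has no configuration file with the same stem.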
## Dataset Format
The text-to-speech (TTS) annotation `.list` file format:
```
vocal_path|speaker_name|language|text
```
Language dictionary:
- 'zh': Chinese
- 'ja': Japanese
- 'en': English
- 'ko': Korean
- 'yue': Cantonese
Example:
```
D:\GPT-SoVITS\xxx/xxx.wav|xxx|zh|我爱玩原神.
```
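A line in this format can be parsed with a few lines of Python. The sketch below is illustrative only and is not taken from the project: `ListEntry` and `parse_list_line` are hypothetical names, and lowercasing the language code before checking it against the language dictionary is an assumption.

```python
from dataclasses import dataclass

# Language codes from the language dictionary above.
SUPPORTED_LANGUAGES = {"zh", "ja", "en", "ko", "yue"}


@dataclass
class ListEntry:
    vocal_path: str
    speaker_name: str
    language: str
    text: str


def parse_list_line(line: str) -> ListEntry:
    """Parse one `vocal_path|speaker_name|language|text` annotation line."""
    # Split on the first three separators only, so a "|" inside the text survives.
    parts = line.rstrip("\r\n").split("|", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got {len(parts)}: {line!r}")
    vocal_path, speaker_name, language, text = parts
    language = language.lower()  # assumption: codes are compared case-insensitively
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    return ListEntry(vocal_path, speaker_name, language, text)
```

Limiting the split to three separators keeps the transcript intact even if it contains a `|` character, since only the first three fields have a fixed meaning.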
## Fine-Tuning and Inference
### Open the WebUI
#### Integrated Package Users
Double-click `go-webui.bat` or use `go-webui.ps1`.
To use V1, double-click `go-webui-v1.bat` or use `go-webui-v1.ps1`.
#### Others
```bash
python webui.py <language(optional)>
```
To use V1:
```bash
python webui.py v1 <language(optional)>
```
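Or switch versions dynamically inside the WebUI.
### Fine-Tuning
#### Automatic path filling is now supported
1. Fill in the training audio path
2. Slice the audio
3. Denoise (optional)
4. Run ASR
5. Proofread the annotations
6. Go to the next tab and click to start training
### Open the Inference WebUI
#### Integrated Package Users
Double-click `go-webui.bat` or use `go-webui.ps1`, then open the inference WebUI under `1-GPT-SoVITS-TTS/1C-推理` (1C-Inference).
#### Others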
```bash
python GPT_SoVITS/inference_webui.py <language(optional)>
```
Or
```bash
python webui.py
```
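Then open the inference WebUI under `1-GPT-SoVITS-TTS/1C-推理` (1C-Inference).
## V2 Release Notes
New features:
1. Support for Korean and Cantonese
2. An improved text frontend
3. Pretrained base model extended from 2k hours to 5k hours
4. Better synthesis quality for low-quality reference audio (especially audio from the internet that is badly missing high frequencies and sounds muffled)
See the [wiki](<https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v2%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)>) for details.
Migrating from the v1 environment to v2:
1. Run `pip install -r requirements.txt` to update the environment.
2. Clone the latest code from GitHub.
3. Download the pretrained model files from [huggingface](https://huggingface.co/lj1995/GPT-SoVITS/tree/main/gsv-v2final-pretrained) and place them in `GPT_SoVITS/pretrained_models/gsv-v2final-pretrained`.
For Chinese, additionally download [G2PWModel.zip(HF)](https://huggingface.co/XXXXRT/GPT-SoVITS-Pretrained/resolve/main/G2PWModel.zip)| [G2PWModel.zip(ModelScope)](https://www.modelscope.cn/models/XXXXRT/GPT-SoVITS-Pretrained/resolve/master/G2PWModel.zip) (download the G2PW model, unzip it, rename the folder to `G2PWModel`, and place it in the `GPT_SoVITS/text` directory).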
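## V3 Release Notes
New model features:
1. Higher timbre similarity, so less training data is needed to approximate the target speaker (the gain in timbre similarity is even larger when the base model is used directly without training).
2. More stable GPT synthesis with fewer repeated or dropped characters, and emotionally rich speech is easier to produce.
See the [wiki](<https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v2%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)>) for details.
Migrating from the v2 environment to v3:
1. Run `pip install -r requirements.txt` to update the environment.
2. Clone the latest code from GitHub.
3. Download the new v3 pretrained models (`s1v3.ckpt`, `s2Gv3.pth`, and the `models--nvidia--bigvgan_v2_24khz_100band_256x` folder) from [huggingface](https://huggingface.co/lj1995/GPT-SoVITS/tree/main) and place them in the `GPT_SoVITS/pretrained_models` directory.
To use the audio super-resolution feature to reduce the muffled sound of the 24k audio generated by the v3 model, additional model parameters need to be downloaded; see [how to download](../../tools/AP_BWE_main/24kto48k/readme.txt).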
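## V4 Release Notes
New features:
1. **V4 fixes the metallic artifacts in V3 caused by non-integer-multiple upsampling, and natively outputs 48kHz audio to avoid muffled sound (whereas V3 natively outputs only 24kHz audio).** The author considers V4 a direct replacement for V3, though further testing is still needed.
[More details](<https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v3v4%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)>)
Migrating from the V1/V2/V3 environment to V4:
1. Run `pip install -r requirements.txt` to update some dependencies.
2. Clone the latest code from GitHub.
3. Download the V4 pretrained models (`gsv-v4-pretrained/s2v4.ckpt` and `gsv-v4-pretrained/vocoder.pth`) from [huggingface](https://huggingface.co/lj1995/GPT-SoVITS/tree/main) and place them in the `GPT_SoVITS/pretrained_models` directory.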
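## V2Pro Release Notes
New features:
1. **Slightly higher VRAM usage than V2 with performance surpassing V4, achieving higher audio quality while retaining V2's hardware cost and inference speed advantages.**
[More details](<https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90features-(%E5%90%84%E7%89%88%E6%9C%AC%E7%89%B9%E6%80%A7)>)
2. V1/V2 and the V2Pro series share the same characteristics, while V3/V4 have similar features to each other. For training sets with low average audio quality, V1/V2/V2Pro can deliver decent results, but V3/V4 cannot. In addition, voices synthesized by V3/V4 lean more toward the reference audio than toward the overall style of the training set.
Migrating from the V1/V2/V3/V4 environment to V2Pro:
1. Run `pip install -r requirements.txt` to update some dependencies.
2. Clone the latest code from GitHub.
3. Download the V2Pro pretrained models (`v2Pro/s2Dv2Pro.pth`, `v2Pro/s2Gv2Pro.pth`, `v2Pro/s2Dv2ProPlus.pth`, `v2Pro/s2Gv2ProPlus.pth`, and `sv/pretrained_eres2netv2w24s4ep4.ckpt`) from [huggingface](https://huggingface.co/lj1995/GPT-SoVITS/tree/main) and place them in the `GPT_SoVITS/pretrained_models` directory.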
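## Todo List
- [x] **High Priority:**
- [x] Localization in Japanese and English.
- [x] User guide.
- [x] Japanese and English dataset fine-tune training.
- [ ] **Features:**
- [x] Zero-shot voice conversion (5 seconds) / few-shot voice conversion (1 minute).
- [x] TTS speaking speed control.
- [ ] ~~Enhanced TTS emotion control.~~
- [ ] Experiment with changing SoVITS token inputs to a probability distribution over the vocabulary.
- [x] Improve the English and Japanese text frontends.
- [ ] Develop tiny and larger-sized TTS models.
- [x] Colab scripts.
- [x] Expand the training dataset (from 2k hours to 10k hours).
- [x] Better SoVITS base model (enhanced audio quality).
- [ ] Model mixing.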
## (Additional) Running from the Command Line
Use the command line to open the WebUI for UVR5:
```bash
python tools/uvr5/webui.py "<infer_device>" <is_half> <webui_port_uvr5>
```
<!-- If you cannot open a browser, use the format below for UVR processing; this is how to process audio with mdxnet
````
python mdxnet.py --model --input_root --output_vocal --output_ins --agg_level --format --device --is_half_precision
```` -->
This is how to slice the dataset audio from the command line:
```bash
python audio_slicer.py \
--input_path "<path_to_original_audio_file_or_directory>" \
--output_root "<directory_where_subdivided_audio_clips_will_be_saved>" \
--threshold <volume_threshold> \
--min_length <minimum_duration_of_each_subclip> \
--min_interval <shortest_time_gap_between_adjacent_subclips> \
--hop_size <step_size_for_computing_volume_curve>
```
This is how to run ASR on the dataset from the command line (Chinese only):
```bash
python tools/asr/funasr_asr.py -i <input> -o <output>
```
ASR processing through Faster_Whisper (ASR labeling for languages other than Chinese)
(There is no progress bar, and GPU performance may cause time delays.)
```bash
python ./tools/asr/fasterwhisper_asr.py -i <input> -o <output> -l <language> -p <precision>
```
A custom list save path is enabled.
## Credits
Special thanks to the following projects and contributors:
### Theoretical Research
- [ar-vits](https://github.com/innnky/ar-vits)
- [SoundStorm](https://github.com/yangdongchao/SoundStorm/tree/master/soundstorm/s1/AR)
- [vits](https://github.com/jaywalnut310/vits)
- [TransferTTS](https://github.com/hcy71o/TransferTTS/blob/master/models.py#L556)
- [contentvec](https://github.com/auspicious3000/contentvec/)
- [hifi-gan](https://github.com/jik876/hifi-gan)
- [fish-speech](https://github.com/fishaudio/fish-speech/blob/main/tools/llama/generate.py#L41)
- [f5-TTS](https://github.com/SWivid/F5-TTS/blob/main/src/f5_tts/model/backbones/dit.py)
- [shortcut flow matching](https://github.com/kvfrans/shortcut-models/blob/main/targets_shortcut.py)
### Pretrained Models
- [Chinese Speech Pretrain](https://github.com/TencentGameMate/chinese_speech_pretrain)
- [Chinese-Roberta-WWM-Ext-Large](https://huggingface.co/hfl/chinese-roberta-wwm-ext-large)
- [BigVGAN](https://github.com/NVIDIA/BigVGAN)
- [eresnetv2](https://modelscope.cn/models/iic/speech_eres2netv2w24s4ep4_sv_zh-cn_16k-common)
### Text Frontend for Inference
- [paddlespeech zh_normalization](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/paddlespeech/t2s/frontend/zh_normalization)
- [split-lang](https://github.com/DoodleBears/split-lang)
- [g2pW](https://github.com/GitYCC/g2pW)
- [pypinyin-g2pW](https://github.com/mozillazg/pypinyin-g2pW)
- [paddlespeech g2pw](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/paddlespeech/t2s/frontend/g2pw)
### WebUI Tools
- [ultimatevocalremovergui](https://github.com/Anjok07/ultimatevocalremovergui)
- [audio-slicer](https://github.com/openvpi/audio-slicer)
- [SubFix](https://github.com/cronrpc/SubFix)
- [FFmpeg](https://github.com/FFmpeg/FFmpeg)
- [gradio](https://github.com/gradio-app/gradio)
- [faster-whisper](https://github.com/SYSTRAN/faster-whisper)
- [FunASR](https://github.com/alibaba-damo-academy/FunASR)
- [AP-BWE](https://github.com/yxlu-0102/AP-BWE)
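Thanks to @Naozumi520 for providing the Cantonese training set and for guidance on Cantonese-related knowledge.
## Thanks to All Contributors for Their Efforts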
<a href="https://github.com/RVC-Boss/GPT-SoVITS/graphs/contributors" target="_blank">
<img src="https://contrib.rocks/image?repo=RVC-Boss/GPT-SoVITS" />
</a>


@@ -0,0 +1,580 @@
# Changelog
## 202401
- 2024.01.21 [PR#108](https://github.com/RVC-Boss/GPT-SoVITS/pull/108)
- Content: Added English system translation support to WebUI.
- Type: Documentation
- Contributor: D3lik
- 2024.01.21 [Commit#7b89c9ed](https://github.com/RVC-Boss/GPT-SoVITS/commit/7b89c9ed5669f63c4ed6ae791408969640bdcf3e)
- Content: Attempted to fix SoVITS training ZeroDivisionError.
- Type: Fix
- Contributor: RVC-Boss, Tybost
- Related: [Issue#79](https://github.com/RVC-Boss/GPT-SoVITS/issues/79)
- 2024.01.21 [Commit#ea62d6e0](https://github.com/RVC-Boss/GPT-SoVITS/commit/ea62d6e0cf1efd75287766ea2b55d1c3b69b4fd3)
- Content: Significantly reduced the issue of synthesized audio containing the end of the reference audio.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.21 [Commit#a87ad522](https://github.com/RVC-Boss/GPT-SoVITS/commit/a87ad5228ed2d729da42019ae1b93171f6a745ef)
- Content: `cmd-asr.py` now checks if the FunASR model is included in the default directory, and if not, it will download it from ModelScope.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.21 [Commit#f6147116](https://github.com/RVC-Boss/GPT-SoVITS/commit/f61471166c107ba56ccb7a5137fa9d7c09b2830d)
- Content: `Config.py` now has an `is_share` parameter, which can be set to `True` to map the WebUI to the public network.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.21 [Commit#102d5081](https://github.com/RVC-Boss/GPT-SoVITS/commit/102d50819e5d24580d6e96085b636b25533ecc7f)
- Content: Cleaned up cached audio files and other files in the `TEMP` folder.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.22 [Commit#872134c8](https://github.com/RVC-Boss/GPT-SoVITS/commit/872134c846bcb8f1909a3f5aff68a6aa67643f68)
- Content: Fixed the issue where excessively short output files resulted in repeating the reference audio.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.22 Tested native support for English and Japanese training (Japanese training requires the root directory to be free of non-English special characters).
- 2024.01.22 [PR#124](https://github.com/RVC-Boss/GPT-SoVITS/pull/124)
- Content: Improved audio path checking. If an attempt is made to read from an incorrect input path, it will report that the path does not exist instead of an ffmpeg error.
- Type: Optimization
- Contributor: xmimu
- 2024.01.23 [Commit#93c47cd9](https://github.com/RVC-Boss/GPT-SoVITS/commit/93c47cd9f0c53439536eada18879b4ec5a812ae1)
- Content: Resolved the issue where NaN values from HuBERT extraction caused ZeroDivisionError in SoVITS/GPT training.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.23 [Commit#80fffb0a](https://github.com/RVC-Boss/GPT-SoVITS/commit/80fffb0ad46e4e7f27948d5a57c88cf342088d50)
- Content: Replaced `jieba` with `jieba_fast` for Chinese word segmentation.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.23 [Commit#63625758](https://github.com/RVC-Boss/GPT-SoVITS/commit/63625758a99e645f3218dd167924e01a0e3cf0dc)
- Content: Optimized model file sorting logic.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.23 [Commit#0c691191](https://github.com/RVC-Boss/GPT-SoVITS/commit/0c691191e894c15686e88279745712b3c6dc232f)
- Content: Added support for quick model switching in the inference WebUI.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.25 [Commit#249561e5](https://github.com/RVC-Boss/GPT-SoVITS/commit/249561e5a18576010df6587c274d38cbd9e18b4b)
- Content: Removed redundant logs in the inference WebUI.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.25 [PR#183](https://github.com/RVC-Boss/GPT-SoVITS/pull/183), [PR#200](https://github.com/RVC-Boss/GPT-SoVITS/pull/200)
- Content: Supported training and inference on Mac.
- Type: Feature
- Contributor: Lion-Wu
- 2024.01.26 [Commit#813cf96e](https://github.com/RVC-Boss/GPT-SoVITS/commit/813cf96e508ba1bb2c658f38c7cc77b797fb4082), [Commit#2d1ddeca](https://github.com/RVC-Boss/GPT-SoVITS/commit/2d1ddeca42db90c3fe2d0cd79480fd544d87f02b)
- Content: Fixed the issue of UVR5 reading and automatically jumping out of directories.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.26 [PR#204](https://github.com/RVC-Boss/GPT-SoVITS/pull/204)
- Content: Added support for Chinese-English mixed and Japanese-English mixed output texts.
- Type: Feature
- Contributor: Kakaru Hayate
- 2024.01.26 [Commit#f4148cf7](https://github.com/RVC-Boss/GPT-SoVITS/commit/f4148cf77fb899c22bcdd4e773d2f24ab34a73e7)
- Content: Added optional segmentation mode for output.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.26 [Commit#9fe955c1](https://github.com/RVC-Boss/GPT-SoVITS/commit/9fe955c1bf5f94546c9f699141281f2661c8a180)
- Content: Fixed multiple newline issues causing inference errors.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.26 [Commit#84ee4719](https://github.com/RVC-Boss/GPT-SoVITS/commit/84ee471936b332bc2ccee024d6dfdedab4f0dc7b)
- Content: Automatically forced single precision for GPUs that do not support half precision; enforced single precision under CPU inference.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.28 [PR#238](https://github.com/RVC-Boss/GPT-SoVITS/pull/238)
- Content: Completed model downloading process in the Dockerfile.
- Type: Fix
- Contributor: breakstring
- 2024.01.28 [PR#257](https://github.com/RVC-Boss/GPT-SoVITS/pull/257)
- Content: Fixed the issue with the pronunciation of numbers converting to Chinese characters.
- Type: Fix
- Contributor: duliangang
- 2024.01.28 [Commit#f0cfe397](https://github.com/RVC-Boss/GPT-SoVITS/commit/f0cfe397089a6fd507d678c71adeaab5e7ed0683)
- Content: Fixed the issue where GPT training did not save checkpoints.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.28 [Commit#b8ae5a27](https://github.com/RVC-Boss/GPT-SoVITS/commit/b8ae5a2761e2654fc0c905498009d3de9de745a8)
- Content: Excluded unreasonable reference audio lengths by setting restrictions.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.28 [Commit#698e9655](https://github.com/RVC-Boss/GPT-SoVITS/commit/698e9655132d194b25b86fbbc99d53c8d2cea2a3)
- Content: Fixed the issue where a few characters at the beginning of sentences were swallowed.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.29 [Commit#ff977a5f](https://github.com/RVC-Boss/GPT-SoVITS/commit/ff977a5f5dc547e0ad82b9e0f1cd95fbc830b2b0)
- Content: Changed training configurations to single precision for GPUs like the 16 series, which have issues with half precision training.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.29 [Commit#172e139f](https://github.com/RVC-Boss/GPT-SoVITS/commit/172e139f45ac26723bc2cf7fac0112f69d6b46ec)
- Content: Tested and updated the available Colab version.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.29 [PR#135](https://github.com/RVC-Boss/GPT-SoVITS/pull/135)
- Content: Updated FunASR to Version 1.0 and fixed errors caused by interface misalignment.
- Type: Fix
- Contributor: LauraGPT
- 2024.01.30 [Commit#1c2fa98c](https://github.com/RVC-Boss/GPT-SoVITS/commit/1c2fa98ca8c325dcfb32797d22ff1c2a726d1cb4)
- Content: Fixed issues with splitting Chinese and English punctuation and added punctuation at the beginning and end of sentences.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.30 [Commit#74409f35](https://github.com/RVC-Boss/GPT-SoVITS/commit/74409f3570fa1c0ff28d4c65c288a6ce58ca00d2)
- Content: Added splitting by punctuation.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.30 [Commit#c42eeccf](https://github.com/RVC-Boss/GPT-SoVITS/commit/c42eeccfdd2d0a0d714ecc8bfc22a12373aca6b7)
- Content: Automatically removed double quotes from all path-related entries to prevent errors from novice users copying paths with double quotes.
- Type: Fix
- Contributor: RVC-Boss
## 202402
- 2024.02.01 [Commit#45f73519](https://github.com/RVC-Boss/GPT-SoVITS/commit/45f73519cc41cd17cf816d8b997a9dcb0bee04b6)
- Content: Fixed the issue where an ASR path ending with `/` caused an error in saving the filename.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.03 [Commit#dba1a74c](https://github.com/RVC-Boss/GPT-SoVITS/commit/dba1a74ccb0cf19a1b4eb93faf11d4ec2b1fc5d7)
- Content: Fixed the UVR5 format reading error causing separation failures.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.03 [Commit#3ebff70b](https://github.com/RVC-Boss/GPT-SoVITS/commit/3ebff70b71580ee1f97b3238c9442cbc5aef47c7)
- Content: Supported automatic segmentation and language recognition for mixed Chinese-Japanese-English texts.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.02.03 [PR#377](https://github.com/RVC-Boss/GPT-SoVITS/pull/377)
- Content: Introduced PaddleSpeech's Normalizer to fix issues such as misreading "xx.xx%" (percent symbols) and reading "元/吨" as "元吨" instead of "元每吨", and fixed underscore errors.
- Type: Optimization
- Contributor: KamioRinn
- 2024.02.05 [PR#395](https://github.com/RVC-Boss/GPT-SoVITS/pull/395)
- Content: Optimized English text frontend.
- Type: Optimization
- Contributor: KamioRinn
- 2024.02.06 [Commit#](https://github.com/RVC-Boss/GPT-SoVITS/commit/65b463a787f31637b4768cc9a47cab59541d3927)
- Content: Corrected language parameter confusion causing decreased Chinese inference quality.
- Type: Fix
- Contributor: RVC-Boss
- Related: [Issue#391](https://github.com/RVC-Boss/GPT-SoVITS/issues/391)
- 2024.02.06 [PR#403](https://github.com/RVC-Boss/GPT-SoVITS/pull/403)
- Content: Adapted UVR5 to higher versions of librosa.
- Type: Fix
- Contributor: StaryLan
- 2024.02.07 [Commit#14a28510](https://github.com/RVC-Boss/GPT-SoVITS/commit/14a285109a521679f8846589c22da8f656a46ad8)
- Content: Fixed the UVR5 error where `inf` appeared everywhere. The `is_half` parameter was not converted to a boolean, so inference always ran in half precision, which produced `inf` on 16 series GPUs.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.07 [Commit#d74f888e](https://github.com/RVC-Boss/GPT-SoVITS/commit/d74f888e7ac86063bfeacef95d0e6ddafe42b3b2)
- Content: Fixed Gradio dependencies.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.07 [PR#400](https://github.com/RVC-Boss/GPT-SoVITS/pull/400)
- Content: Integrated Faster Whisper ASR for Japanese and English.
- Type: Feature
- Contributor: Shadow
- 2024.02.07 [Commit#6469048d](https://github.com/RVC-Boss/GPT-SoVITS/commit/6469048de12a8d6f0bd05d07f031309e61575a38)~[Commit#94ee71d9](https://github.com/RVC-Boss/GPT-SoVITS/commit/94ee71d9d562d10c9a1b96e745c6a6575aa66a10)
- Content: Supported automatic reading of `.list` full paths if the root directory is left blank during dataset preparation.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.02.08 [Commit#59f35ada](https://github.com/RVC-Boss/GPT-SoVITS/commit/59f35adad85815df27e9c6b33d420f5ebfd8376b)
- Content: Attempted to fix GPT training hang on Windows 10 1909 and Traditional Chinese System Language.
- Type: Fix
- Contributor: RVC-Boss
- Related: [Issue#232](https://github.com/RVC-Boss/GPT-SoVITS/issues/232)
- 2024.02.12 [PR#457](https://github.com/RVC-Boss/GPT-SoVITS/pull/457)
- Content: Enabled experimental DPO Loss training option to mitigate GPT repetition and missing characters by constructing negative samples during training and made several inference parameters available in the inference WebUI.
- Type: Feature
- Contributor: liufenghua
- 2024.02.12 [Commit#](https://github.com/RVC-Boss/GPT-SoVITS/commit/2fa74ecb941db27d9015583a9be6962898d66730), [Commit#](https://github.com/RVC-Boss/GPT-SoVITS/commit/d82f6bbb98ba725e6725dcee99b80ce71fb0bf28)
- Content: Optimized logic for Faster Whisper and FunASR, switching Faster Whisper to mirror downloads to avoid issues with Hugging Face connections.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.02.15 [Commit#](https://github.com/RVC-Boss/GPT-SoVITS/commit/dd2c4d6d7121bf82d29d0f0e4d788f3b231997c8)
- Content: Supported Chinese experiment names in training (previously caused errors).
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.15 [Commit#ccb9b08b](https://github.com/RVC-Boss/GPT-SoVITS/commit/ccb9b08be3c58e102defcc94ff4fd609da9e27ee)~[Commit#895fde46](https://github.com/RVC-Boss/GPT-SoVITS/commit/895fde46e420040ed26aaf0c5b7e99359d9b199b)
- Content: Made DPO training an optional feature instead of mandatory. If selected, the batch size is automatically halved. Fixed issues with new parameters not being passed in the inference WebUI.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.02.15 [Commit#7b0c3c67](https://github.com/RVC-Boss/GPT-SoVITS/commit/7b0c3c676495c64b2064aa472bff14b5c06206a5)
- Content: Fixed bugs in Chinese frontend.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.16 [PR#499](https://github.com/RVC-Boss/GPT-SoVITS/pull/499)
- Content: Supported input without reference text.
- Type: Feature
- Contributor: Watchtower-Liu
- Related: [Issue#475](https://github.com/RVC-Boss/GPT-SoVITS/issues/475)
- 2024.02.17 [PR#509](https://github.com/RVC-Boss/GPT-SoVITS/pull/509), [PR#507](https://github.com/RVC-Boss/GPT-SoVITS/pull/507), [PR#532](https://github.com/RVC-Boss/GPT-SoVITS/pull/532), [PR#556](https://github.com/RVC-Boss/GPT-SoVITS/pull/556), [PR#559](https://github.com/RVC-Boss/GPT-SoVITS/pull/559)
- Content: Optimized Chinese and Japanese frontend processing.
- Type: Optimization
- Contributor: KamioRinn, v3cun
- 2024.02.17 [PR#510](https://github.com/RVC-Boss/GPT-SoVITS/pull/511), [PR#511](https://github.com/RVC-Boss/GPT-SoVITS/pull/511)
- Content: Fixed Colab public URL issue.
- Type: Fix
- Contributor: ChanningWang2018, RVC-Boss
- 2024.02.21 [PR#557](https://github.com/RVC-Boss/GPT-SoVITS/pull/557)
- Content: Switched Mac CPU inference to use CPU instead of MPS for faster performance.
- Type: Optimization
- Contributor: XXXXRT666
- 2024.02.21 [Commit#6da486c1](https://github.com/RVC-Boss/GPT-SoVITS/commit/6da486c15d09e3d99fa42c5e560aaac56b6b4ce1), [Commit#](https://github.com/RVC-Boss/GPT-SoVITS/commit/5a17177342d2df1e11369f2f4f58d34a3feb1a35)
- Content: Added a noise reduction option during data processing (noise reduction leaves only 16kHz sampling rate; use only if the background noise is significant).
- Type: Feature
- Contributor: RVC-Boss
- 2024.02.28 [PR#573](https://github.com/RVC-Boss/GPT-SoVITS/pull/573)
- Content: Modified `is_half` check to ensure proper CPU inference on Mac.
- Type: Fix
- Contributor: XXXXRT666
- 2024.02.28 [PR#610](https://github.com/RVC-Boss/GPT-SoVITS/pull/610)
- Content: Fixed UVR5 reverb removal model where the setting was reversed.
- Type: Fix
- Contributor: Yuze Wang
## 202403
- 2024.03.06 [PR#675](https://github.com/RVC-Boss/GPT-SoVITS/pull/675)
- Content: Enabled automatic CPU inference for Faster Whisper if no CUDA is available.
- Type: Optimization
- Contributor: ShiroDoMain
- 2024.03.06 [Commit#616be20d](https://github.com/RVC-Boss/GPT-SoVITS/commit/616be20db3cf94f1cd663782fea61b2370704193)
- Content: No longer requires downloading the Chinese FunASR model first when using Faster Whisper non-Chinese ASR.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.03.09 [PR#672](https://github.com/RVC-Boss/GPT-SoVITS/pull/672)
- Content: Accelerated inference by 50% (tested on RTX3090 + PyTorch 2.2.1 + CU11.8 + Win10 + Py39).
- Type: Optimization
- Contributor: GoHomeToMacDonal
- 2024.03.10 [PR#721](https://github.com/RVC-Boss/GPT-SoVITS/pull/721)
- Content: Added a quick inference branch `fast_inference_`.
- Type: Feature
- Contributor: ChasonJiang
- 2024.03.13 [PR#761](https://github.com/RVC-Boss/GPT-SoVITS/pull/761)
- Content: Supported CPU training, using CPU for training on macOS.
- Type: Feature
- Contributor: Lion-Wu
- 2024.03.19 [PR#804](https://github.com/RVC-Boss/GPT-SoVITS/pull/804), [PR#812](https://github.com/RVC-Boss/GPT-SoVITS/pull/812), [PR#821](https://github.com/RVC-Boss/GPT-SoVITS/pull/821)
- Content: Optimized the English text frontend.
- Type: Optimization
- Contributor: KamioRinn
- 2024.03.30 [PR#894](https://github.com/RVC-Boss/GPT-SoVITS/pull/894)
- Content: Improved API format.
- Type: Optimization
- Contributor: KamioRinn
## 202404
- 2024.04.03 [PR#917](https://github.com/RVC-Boss/GPT-SoVITS/pull/917)
- Content: Corrected FFmpeg command string formatting in UVR5 WebUI.
- Type: Fix
- Contributor: StaryLan
## 202405
- 2024.05.02 [PR#953](https://github.com/RVC-Boss/GPT-SoVITS/pull/953)
- Content: Fixed the issue of SoVITS training without freezing VQ (which could cause quality degradation).
- Type: Fix
- Contributor: hcwu1993
- Related: [Issue#747](https://github.com/RVC-Boss/GPT-SoVITS/issues/747)
- 2024.05.19 [PR#1102](https://github.com/RVC-Boss/GPT-SoVITS/pull/1102)
- Content: Added error prompts for unsupported languages during training data processing.
- Type: Optimization
- Contributor: StaryLan
- 2024.05.27 [PR#1132](https://github.com/RVC-Boss/GPT-SoVITS/pull/1132)
- Content: Fixed the bug in Hubert extraction.
- Type: Fix
- Contributor: XXXXRT666
## 202406
- 2024.06.06 [Commit#](https://github.com/RVC-Boss/GPT-SoVITS/commit/99f09c8bdc155c1f4272b511940717705509582a)
- Content: Fixed the issue of WebUI's GPT fine-tuning not reading BERT feature of Chinese input texts, causing inconsistency with inference and potential quality degradation.
**Caution: If you have previously fine-tuned with a large amount of data, it is recommended to retune the model to improve quality.**
- Type: Fix
- Contributor: RVC-Boss
- 2024.06.07 [PR#1159](https://github.com/RVC-Boss/GPT-SoVITS/pull/1159)
- Content: Fixed progress bar logic for SoVITS training in `s2_train.py`.
- Type: Fix
- Contributor: pengzhendong
- 2024.06.10 [Commit#501a74ae](https://github.com/RVC-Boss/GPT-SoVITS/commit/501a74ae96789a26b48932babed5eb4e9483a232)
- Content: Fixed string formatting when UVR5 MDXNet calls FFmpeg, ensuring compatibility with paths containing spaces.
- Type: Fix
- Contributor: RVC-Boss
- 2024.06.10 [PR#1168](https://github.com/RVC-Boss/GPT-SoVITS/pull/1168), [PR#1169](https://github.com/RVC-Boss/GPT-SoVITS/pull/1169)
- Content: Improved the logic for pure punctuation and multi-punctuation text input.
- Type: Fix
- Contributor: XXXXRT666
- Related: [Issue#1165](https://github.com/RVC-Boss/GPT-SoVITS/issues/1165)
- 2024.06.13 [Commit#db506705](https://github.com/RVC-Boss/GPT-SoVITS/commit/db50670598f0236613eefa6f2d5a23a271d82041)
- Content: Fixed default batch size decimal issue in CPU inference.
- Type: Fix
- Contributor: RVC-Boss
- 2024.06.28 [PR#1258](https://github.com/RVC-Boss/GPT-SoVITS/pull/1258), [PR#1265](https://github.com/RVC-Boss/GPT-SoVITS/pull/1265), [PR#1267](https://github.com/RVC-Boss/GPT-SoVITS/pull/1267)
- Content: Fixed the issue where an exception during denoising or ASR aborted processing of all pending audio files.
- Type: Fix
- Contributor: XXXXRT666
- 2024.06.29 [Commit#a208698e](https://github.com/RVC-Boss/GPT-SoVITS/commit/a208698e775155efc95b187b746d153d0f2847ca)
- Content: Fixed multi-process save logic for multi-GPU training.
- Type: Fix
- Contributor: RVC-Boss
- 2024.06.29 [PR#1251](https://github.com/RVC-Boss/GPT-SoVITS/pull/1251)
- Content: Removed redundant `my_utils.py`.
- Type: Optimization
- Contributor: aoguai
- Related: [Issue#1189](https://github.com/RVC-Boss/GPT-SoVITS/issues/1189)
## 202407
- 2024.07.06 [PR#1253](https://github.com/RVC-Boss/GPT-SoVITS/pull/1253)
- Content: Fixed the issue of splitting decimals when splitting by punctuation.
- Type: Fix
- Contributor: aoguai
- 2024.07.06 [Commit#](https://github.com/RVC-Boss/GPT-SoVITS/commit/b0786f2998f1b2fce6678434524b4e0e8cc716f5)
- Content: The accelerated inference code has been validated and merged into the main branch, ensuring consistent inference effects with the base. It also supports accelerated inference in no-reference text mode.
- Type: Optimization
- Contributor: RVC-Boss, GoHomeToMacDonal
- Related: [PR#672](https://github.com/RVC-Boss/GPT-SoVITS/pull/672)
- Future updates will continue to verify the consistency of changes in the `fast_inference` branch.
- 2024.07.13 [PR#1294](https://github.com/RVC-Boss/GPT-SoVITS/pull/1294), [PR#1298](https://github.com/RVC-Boss/GPT-SoVITS/pull/1298)
- Content: Refactored i18n scanning and updated multi-language configuration files.
- Type: Documentation
- Contributor: StaryLan
- 2024.07.13 [PR#1299](https://github.com/RVC-Boss/GPT-SoVITS/pull/1299)
- Content: Fixed issues where trailing slashes in user file paths caused command line errors.
- Type: Fix
- Contributor: XXXXRT666
- 2024.07.19 [PR#756](https://github.com/RVC-Boss/GPT-SoVITS/pull/756)
- Content: Fixed the inconsistency in training steps when using a custom `bucket_sampler` during GPT training.
- Type: Fix
- Contributor: huangxu1991
- 2024.07.23 [Commit#9588a3c5](https://github.com/RVC-Boss/GPT-SoVITS/commit/9588a3c52d9ebdb20b3c5d74f647d12e7c1171c2), [PR#1340](https://github.com/RVC-Boss/GPT-SoVITS/pull/1340)
- Content: Supported adjusting speech speed during synthesis, including an option to freeze randomness and only control speed. This feature has also been added to `api.py`.
- Type: Feature
- Contributor: RVC-Boss, 红血球AE3803
- 2024.07.27 [PR#1306](https://github.com/RVC-Boss/GPT-SoVITS/pull/1306), [PR#1356](https://github.com/RVC-Boss/GPT-SoVITS/pull/1356)
- Content: Added support for the BS-RoFormer vocal accompaniment separation model.
- Type: Feature
- Contributor: KamioRinn
- 2024.07.27 [PR#1351](https://github.com/RVC-Boss/GPT-SoVITS/pull/1351)
- Content: Improved Chinese text frontend.
- Type: Feature
- Contributor: KamioRinn
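
The decimal-splitting fix in the 2024.07.06 entry (PR#1253) concerns text such as `3.14`, where a period between two digits is not a sentence boundary. A minimal sketch of that idea follows; it is not the code from the PR, and the function name `split_by_punctuation` and the punctuation set are assumptions chosen for illustration.

```python
# Punctuation treated as a split point in this sketch (an assumed set).
PUNCTUATION = set(",.;?!、，。？！；：…")


def split_by_punctuation(text: str) -> list:
    """Split text after each punctuation mark, keeping decimals such as 3.14 whole."""
    pieces = []
    current = []
    for i, ch in enumerate(text):
        current.append(ch)
        if ch not in PUNCTUATION:
            continue
        prev_is_digit = i > 0 and text[i - 1].isdigit()
        next_is_digit = i + 1 < len(text) and text[i + 1].isdigit()
        if ch == "." and prev_is_digit and next_is_digit:
            continue  # a period between two digits is a decimal point
        pieces.append("".join(current))
        current = []
    if current:
        pieces.append("".join(current))  # trailing text without punctuation
    return pieces
```

Each piece keeps its closing punctuation mark, so joining the pieces reproduces the input exactly.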
## 202408 (V2 Version)
- 2024.08.01 [PR#1355](https://github.com/RVC-Boss/GPT-SoVITS/pull/1355)
- Content: Automatically fill in the paths when processing files in the WebUI.
- Type: Chore
- Contributor: XXXXRT666
- 2024.08.01 [Commit#e62e9653](https://github.com/RVC-Boss/GPT-SoVITS/commit/e62e965323a60a76a025bcaa45268c1ddcbcf05c)
- Content: Enabled FP16 inference support for BS-Roformer.
- Type: Performance Optimization
- Contributor: RVC-Boss
- 2024.08.01 [Commit#bce451a2](https://github.com/RVC-Boss/GPT-SoVITS/commit/bce451a2d1641e581e200297d01f219aeaaf7299), [Commit#4c8b7612](https://github.com/RVC-Boss/GPT-SoVITS/commit/4c8b7612206536b8b4435997acb69b25d93acb78)
- Content: Optimized GPU recognition logic, added user-friendly logic to handle arbitrary GPU indices entered by users.
- Type: Chore
- Contributor: RVC-Boss
- 2024.08.02 [Commit#ff6c193f](https://github.com/RVC-Boss/GPT-SoVITS/commit/ff6c193f6fb99d44eea3648d82ebcee895860a22)~[Commit#de7ee7c7](https://github.com/RVC-Boss/GPT-SoVITS/commit/de7ee7c7c15a2ec137feb0693b4ff3db61fad758)
- Content: **Added GPT-SoVITS V2 model.**
- Type: Feature
- Contributor: RVC-Boss
- 2024.08.03 [Commit#8a101474](https://github.com/RVC-Boss/GPT-SoVITS/commit/8a101474b5a4f913b4c94fca2e3ca87d0771bae3)
- Content: Added support for Cantonese ASR by using FunASR.
- Type: Feature
- Contributor: RVC-Boss
- 2024.08.03 [PR#1387](https://github.com/RVC-Boss/GPT-SoVITS/pull/1387), [PR#1388](https://github.com/RVC-Boss/GPT-SoVITS/pull/1388)
- Content: Optimized UI and timing logic.
- Type: Chore
- Contributor: XXXXRT666
- 2024.08.06 [PR#1404](https://github.com/RVC-Boss/GPT-SoVITS/pull/1404), [PR#987](https://github.com/RVC-Boss/GPT-SoVITS/pull/987), [PR#488](https://github.com/RVC-Boss/GPT-SoVITS/pull/488)
- Content: Optimized polyphonic character handling logic (V2 Only).
- Type: Fix, Feature
- Contributor: KamioRinn, RVC-Boss
- 2024.08.13 [PR#1422](https://github.com/RVC-Boss/GPT-SoVITS/pull/1422)
- Content: Fixed bug where only one reference audio could be uploaded; added dataset validation with warning popups for missing files.
- Type: Fix, Chore
- Contributor: XXXXRT666
- 2024.08.20 [Issue#1508](https://github.com/RVC-Boss/GPT-SoVITS/issues/1508)
- Content: Upstream LangSegment library now supports optimizing numbers, phone numbers, dates, and times using SSML tags.
- Type: Feature
- Contributor: juntaosun
- 2024.08.20 [PR#1503](https://github.com/RVC-Boss/GPT-SoVITS/pull/1503)
- Content: Fixed and optimized API.
- Type: Fix
- Contributor: KamioRinn
- 2024.08.20 [PR#1490](https://github.com/RVC-Boss/GPT-SoVITS/pull/1490)
- Content: Merged `fast_inference` branch into the main branch.
- Type: Refactor
- Contributor: ChasonJiang
- 2024.08.21 **Officially released GPT-SoVITS V2 version.**
## 202502 (V3 Version)
- 2025.02.11 [Commit#ed207c4b](https://github.com/RVC-Boss/GPT-SoVITS/commit/ed207c4b879d5296e9be3ae5f7b876729a2c43b8)~[Commit#6e2b4918](https://github.com/RVC-Boss/GPT-SoVITS/commit/6e2b49186c5b961f0de41ea485d398dffa9787b4)
- Content: **Added GPT-SoVITS V3 model, which requires 14GB VRAM for fine-tuning.**
- Type: Feature (Refer to [Wiki](https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v3%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)))
- Contributor: RVC-Boss
- 2025.02.12 [PR#2032](https://github.com/RVC-Boss/GPT-SoVITS/pull/2032)
- Content: Updated multilingual project documentation.
- Type: Documentation
- Contributor: StaryLan
- 2025.02.12 [PR#2033](https://github.com/RVC-Boss/GPT-SoVITS/pull/2033)
- Content: Updated Japanese documentation.
- Type: Documentation
- Contributor: Fyphen
- 2025.02.12 [PR#2010](https://github.com/RVC-Boss/GPT-SoVITS/pull/2010)
- Content: Optimized attention calculation logic.
- Type: Performance Optimization
- Contributor: wzy3650
- 2025.02.12 [PR#2040](https://github.com/RVC-Boss/GPT-SoVITS/pull/2040)
- Content: Added gradient checkpointing support for fine-tuning, requiring 12GB VRAM.
- Type: Feature
- Contributor: Kakaru Hayate
- 2025.02.14 [PR#2047](https://github.com/RVC-Boss/GPT-SoVITS/pull/2047), [PR#2062](https://github.com/RVC-Boss/GPT-SoVITS/pull/2062), [PR#2073](https://github.com/RVC-Boss/GPT-SoVITS/pull/2073)
- Content: Switched to a new language segmentation tool, improved multilingual mixed-text splitting strategy, and optimized number and English processing logic.
- Type: Feature
- Contributor: KamioRinn
- 2025.02.23 [Commit#56509a17](https://github.com/RVC-Boss/GPT-SoVITS/commit/56509a17c918c8d149c48413a672b8ddf437495b)~[Commit#514fb692](https://github.com/RVC-Boss/GPT-SoVITS/commit/514fb692db056a06ed012bc3a5bca2a5b455703e)
- Content: **The GPT-SoVITS V3 model now supports LoRA training, requiring 8GB of GPU memory for fine-tuning.**
- Type: Feature
- Contributor: RVC-Boss
- 2025.02.23 [PR#2078](https://github.com/RVC-Boss/GPT-SoVITS/pull/2078)
- Content: Added Mel Band Roformer model support for vocal and instrument separation.
- Type: Feature
- Contributor: Sucial
- 2025.02.26 [PR#2112](https://github.com/RVC-Boss/GPT-SoVITS/pull/2112), [PR#2114](https://github.com/RVC-Boss/GPT-SoVITS/pull/2114)
- Content: Fixed MeCab error under Chinese paths (specifically for Japanese/Korean or multilingual text splitting).
- Type: Fix
- Contributor: KamioRinn
- 2025.02.27 [Commit#92961c3f](https://github.com/RVC-Boss/GPT-SoVITS/commit/92961c3f68b96009ff2cd00ce614a11b6c4d026f)~[Commit#250b1c73](https://github.com/RVC-Boss/GPT-SoVITS/commit/250b1c73cba60db18148b21ec5fbce01fd9d19bc)
- Content: **Added 24kHz to 48kHz audio super-resolution models** to alleviate the "muffled" audio issue when generating 24K audio with V3 model.
- Type: Feature
- Contributor: RVC-Boss
- Related: [Issue#2085](https://github.com/RVC-Boss/GPT-SoVITS/issues/2085), [Issue#2117](https://github.com/RVC-Boss/GPT-SoVITS/issues/2117)
- 2025.02.28 [PR#2123](https://github.com/RVC-Boss/GPT-SoVITS/pull/2123)
- Content: Updated multilingual project documentation.
- Type: Documentation
- Contributor: StaryLan
- 2025.02.28 [PR#2122](https://github.com/RVC-Boss/GPT-SoVITS/pull/2122)
- Content: Applied rule-based detection for short CJK text when the model cannot identify it.
- Type: Fix
- Contributor: KamioRinn
- Related: [Issue#2116](https://github.com/RVC-Boss/GPT-SoVITS/issues/2116)
- 2025.02.28 [Commit#c38b1690](https://github.com/RVC-Boss/GPT-SoVITS/commit/c38b16901978c1db79491e16905ea3a37a7cf686), [Commit#a32a2b89](https://github.com/RVC-Boss/GPT-SoVITS/commit/a32a2b893436fad56cc82409121c7fa36a1815d5)
- Content: Added speech rate parameter to control synthesis speed.
- Type: Fix
- Contributor: RVC-Boss
- 2025.02.28 **Officially released GPT-SoVITS V3**.
## 202503
- 2025.03.31 [PR#2236](https://github.com/RVC-Boss/GPT-SoVITS/pull/2236)
- Content: Fixed issues caused by incorrect versions of dependencies.
- Type: Fix
- Contributor: XXXXRT666
- Related:
- PyOpenJTalk: [Issue#1131](https://github.com/RVC-Boss/GPT-SoVITS/issues/1131), [Issue#2231](https://github.com/RVC-Boss/GPT-SoVITS/issues/2231), [Issue#2233](https://github.com/RVC-Boss/GPT-SoVITS/issues/2233).
- ONNX: [Issue#492](https://github.com/RVC-Boss/GPT-SoVITS/issues/492), [Issue#671](https://github.com/RVC-Boss/GPT-SoVITS/issues/671), [Issue#1192](https://github.com/RVC-Boss/GPT-SoVITS/issues/1192), [Issue#1819](https://github.com/RVC-Boss/GPT-SoVITS/issues/1819), [Issue#1841](https://github.com/RVC-Boss/GPT-SoVITS/issues/1841).
- Pydantic: [Issue#2230](https://github.com/RVC-Boss/GPT-SoVITS/issues/2230), [Issue#2239](https://github.com/RVC-Boss/GPT-SoVITS/issues/2239).
- PyTorch-Lightning: [Issue#2174](https://github.com/RVC-Boss/GPT-SoVITS/issues/2174).
- 2025.03.31 [PR#2241](https://github.com/RVC-Boss/GPT-SoVITS/pull/2241)
- Content: **Enabled parallel inference for SoVITS v3.**
- Type: Feature
- Contributor: ChasonJiang
- Fixed other minor bugs.
- Integrated package fixes for ONNX runtime GPU inference support:
- Type: Fix
- Details:
- ONNX models within G2PW switched from CPU to GPU inference, significantly reducing CPU bottleneck;
- foxjoy dereverberation model now supports GPU inference.
## 202504 (V4 Version)
- 2025.04.01 [Commit#6a60e5ed](https://github.com/RVC-Boss/GPT-SoVITS/commit/6a60e5edb1817af4a61c7a5b196c0d0f1407668f)
- Content: Unlocked SoVITS v3 parallel inference; fixed asynchronous model loading logic.
- Type: Fix
- Contributor: RVC-Boss
- 2025.04.07 [PR#2255](https://github.com/RVC-Boss/GPT-SoVITS/pull/2255)
- Content: Code formatting using Ruff; updated G2PW link.
- Type: Style
- Contributor: XXXXRT666
- 2025.04.15 [PR#2290](https://github.com/RVC-Boss/GPT-SoVITS/pull/2290)
- Content: Cleaned up documentation; added Python 3.11 support; updated installers.
- Type: Chore
- Contributor: XXXXRT666
- 2025.04.20 [PR#2300](https://github.com/RVC-Boss/GPT-SoVITS/pull/2300)
- Content: Updated Colab, installation files, and model downloads.
- Type: Chore
- Contributor: XXXXRT666
- 2025.04.20 [Commit#e0c452f0](https://github.com/RVC-Boss/GPT-SoVITS/commit/e0c452f0078e8f7eb560b79a54d75573fefa8355)~[Commit#9d481da6](https://github.com/RVC-Boss/GPT-SoVITS/commit/9d481da610aa4b0ef8abf5651fd62800d2b4e8bf)
- Content: **Added GPT-SoVITS V4 model.**
- Type: Feature
- Contributor: RVC-Boss
- 2025.04.21 [Commit#8b394a15](https://github.com/RVC-Boss/GPT-SoVITS/commit/8b394a15bce8e1d85c0b11172442dbe7a6017ca2)~[Commit#bc2fe5ec](https://github.com/RVC-Boss/GPT-SoVITS/commit/bc2fe5ec86536c77bb3794b4be263ac87e4fdae6), [PR#2307](https://github.com/RVC-Boss/GPT-SoVITS/pull/2307)
- Content: Enabled parallel inference for V4.
- Type: Feature
- Contributor: RVC-Boss, ChasonJiang
- 2025.04.22 [Commit#7405427a](https://github.com/RVC-Boss/GPT-SoVITS/commit/7405427a0ab2a43af63205df401fd6607a408d87)~[Commit#590c83d7](https://github.com/RVC-Boss/GPT-SoVITS/commit/590c83d7667c8d4908f5bdaf2f4c1ba8959d29ff), [PR#2309](https://github.com/RVC-Boss/GPT-SoVITS/pull/2309)
- Content: Fixed model version parameter passing.
- Type: Fix
- Contributor: RVC-Boss, ChasonJiang
- 2025.04.22 [Commit#fbdab94e](https://github.com/RVC-Boss/GPT-SoVITS/commit/fbdab94e17d605d85841af6f94f40a45976dd1d9), [PR#2310](https://github.com/RVC-Boss/GPT-SoVITS/pull/2310)
- Content: Fixed Numpy and Numba version mismatch issue; updated librosa version.
- Type: Fix
- Contributor: RVC-Boss, XXXXRT666
- Related: [Issue#2308](https://github.com/RVC-Boss/GPT-SoVITS/issues/2308)
- 2025.04.22 **Officially released GPT-SoVITS V4**.
- 2025.04.22 [PR#2311](https://github.com/RVC-Boss/GPT-SoVITS/pull/2311)
- Content: Updated Gradio parameters.
- Type: Chore
- Contributor: XXXXRT666
- 2025.04.25 [PR#2322](https://github.com/RVC-Boss/GPT-SoVITS/pull/2322)
- Content: Improved Colab/Kaggle notebook scripts.
- Type: Chore
- Contributor: XXXXRT666
## 202505
- 2025.05.26 [PR#2351](https://github.com/RVC-Boss/GPT-SoVITS/pull/2351)
- Content: Improved Docker and Windows auto-build scripts; added pre-commit formatting.
- Type: Chore
- Contributor: XXXXRT666
- 2025.05.26 [PR#2408](https://github.com/RVC-Boss/GPT-SoVITS/pull/2408)
- Content: Optimized multilingual text splitting and recognition logic.
- Type: Fix
- Contributor: KamioRinn
- Related: [Issue#2404](https://github.com/RVC-Boss/GPT-SoVITS/issues/2404)
- 2025.05.26 [PR#2377](https://github.com/RVC-Boss/GPT-SoVITS/pull/2377)
- Content: Implemented caching strategies to improve SoVITS V3/V4 inference speed by 10%.
- Type: Performance Optimization
- Contributor: Kakaru Hayate
- 2025.05.26 [Commit#4d9d56b1](https://github.com/RVC-Boss/GPT-SoVITS/commit/4d9d56b19638dc434d6eefd9545e4d8639a3e072), [Commit#8c705784](https://github.com/RVC-Boss/GPT-SoVITS/commit/8c705784c50bf438c7b6d0be33a9e5e3cb90e6b2), [Commit#fafe4e7f](https://github.com/RVC-Boss/GPT-SoVITS/commit/fafe4e7f120fba56c5f053c6db30aa675d5951ba)
- Content: Updated the annotation interface with a reminder: click Submit Text after completing each page, or changes will not be saved.
- Type: Fix
- Contributor: RVC-Boss
- 2025.05.29 [Commit#1934fc1e](https://github.com/RVC-Boss/GPT-SoVITS/commit/1934fc1e1b22c4c162bba1bbe7d7ebb132944cdc)
- Content: Fixed UVR5 and ONNX dereverberation model errors when FFmpeg encodes MP3/M4A files with spaces in original paths.
- Type: Fix
- Contributor: RVC-Boss
## 202506 (V2Pro Series)
- 2025.06.03 [PR#2420](https://github.com/RVC-Boss/GPT-SoVITS/pull/2420)
- Content: Updated multilingual project documentation.
- Type: Documentation
- Contributor: StaryLan
- 2025.06.04 [PR#2417](https://github.com/RVC-Boss/GPT-SoVITS/pull/2417)
- Content: Added support for exporting V4 with TorchScript.
- Type: Feature
- Contributor: L-jasmine
- 2025.06.04 [Commit#b7c0c5ca](https://github.com/RVC-Boss/GPT-SoVITS/commit/b7c0c5ca878bcdd419fd86bf80dba431a6653356)~[Commit#298ebb03](https://github.com/RVC-Boss/GPT-SoVITS/commit/298ebb03c5a719388527ae6a586c7ea960344e70)
- Content: **Added GPT-SoVITS V2Pro Series models (V2Pro, V2ProPlus).**
- Type: Feature
- Contributor: RVC-Boss
- 2025.06.05 [PR#2426](https://github.com/RVC-Boss/GPT-SoVITS/pull/2426)
- Content: Fixed initialization error in `config/inference_webui`.
- Type: Fix
- Contributor: StaryLan
- 2025.06.05 [PR#2427](https://github.com/RVC-Boss/GPT-SoVITS/pull/2427), [Commit#7d70852a](https://github.com/RVC-Boss/GPT-SoVITS/commit/7d70852a3f67c3b52e3a62857f8663d529efc8cd), [PR#2434](https://github.com/RVC-Boss/GPT-SoVITS/pull/2434)
- Content: Optimized automatic precision detection logic; added collapsible functionality to WebUI frontend modules.
- Type: Feature
- Contributor: XXXXRT666, RVC-Boss

@@ -0,0 +1,580 @@
# Changelog
## 202401
- 2024.01.21 [PR#108](https://github.com/RVC-Boss/GPT-SoVITS/pull/108)
- Content: Added English system translation support to the WebUI.
- Type: Documentation
- Contributor: D3lik
- 2024.01.21 [Commit#7b89c9ed](https://github.com/RVC-Boss/GPT-SoVITS/commit/7b89c9ed5669f63c4ed6ae791408969640bdcf3e)
- Content: Attempted to fix the ZeroDivisionError raised during SoVITS training.
- Type: Fix
- Contributor: RVC-Boss, Tybost
- Related: [Issue#79](https://github.com/RVC-Boss/GPT-SoVITS/issues/79)
- 2024.01.21 [Commit#ea62d6e0](https://github.com/RVC-Boss/GPT-SoVITS/commit/ea62d6e0cf1efd75287766ea2b55d1c3b69b4fd3)
- Content: Greatly reduced the problem of synthesized audio containing the end of the reference audio.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.21 [Commit#a87ad522](https://github.com/RVC-Boss/GPT-SoVITS/commit/a87ad5228ed2d729da42019ae1b93171f6a745ef)
- Content: `cmd-asr.py` now checks whether the default directory contains the FunASR model and downloads it from ModelScope if it does not.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.21 [Commit#f6147116](https://github.com/RVC-Boss/GPT-SoVITS/commit/f61471166c107ba56ccb7a5137fa9d7c09b2830d)
- Content: Added the `is_share` parameter to `Config.py`; setting it to `True` exposes the WebUI to the public network.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.21 [Commit#102d5081](https://github.com/RVC-Boss/GPT-SoVITS/commit/102d50819e5d24580d6e96085b636b25533ecc7f)
- Content: Cleaned up cached audio files and other files in the `TEMP` folder.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.22 [Commit#872134c8](https://github.com/RVC-Boss/GPT-SoVITS/commit/872134c846bcb8f1909a3f5aff68a6aa67643f68)
- Content: Fixed an issue where extremely short output files repeated the reference audio.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.22 Tested native support for English and Japanese training (Japanese training requires a root directory free of non-English special characters).
- 2024.01.22 [PR#124](https://github.com/RVC-Boss/GPT-SoVITS/pull/124)
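- Content: Improved audio path checking. Attempting to read from an invalid input path now reports that the path does not exist instead of raising an FFmpeg error.
- Type: Optimization
- Contributor: xmimu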
- 2024.01.23 [Commit#93c47cd9](https://github.com/RVC-Boss/GPT-SoVITS/commit/93c47cd9f0c53439536eada18879b4ec5a812ae1)
- Content: Fixed an issue where HuBERT extraction produced NaN errors and caused a ZeroDivisionError in SoVITS/GPT training.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.23 [Commit#80fffb0a](https://github.com/RVC-Boss/GPT-SoVITS/commit/80fffb0ad46e4e7f27948d5a57c88cf342088d50)
- Content: Replaced the Chinese word segmentation tool `jieba` with `jieba_fast`.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.23 [Commit#63625758](https://github.com/RVC-Boss/GPT-SoVITS/commit/63625758a99e645f3218dd167924e01a0e3cf0dc)
- Content: Optimized the model file sorting logic.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.23 [Commit#0c691191](https://github.com/RVC-Boss/GPT-SoVITS/commit/0c691191e894c15686e88279745712b3c6dc232f)
- Content: Added support for quick model switching in the inference WebUI.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.25 [Commit#249561e5](https://github.com/RVC-Boss/GPT-SoVITS/commit/249561e5a18576010df6587c274d38cbd9e18b4b)
- Content: Removed redundant logs from the inference WebUI.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.25 [PR#183](https://github.com/RVC-Boss/GPT-SoVITS/pull/183), [PR#200](https://github.com/RVC-Boss/GPT-SoVITS/pull/200)
- Content: Added support for training and inference on Mac.
- Type: Feature
- Contributor: Lion-Wu
- 2024.01.26 [Commit#813cf96e](https://github.com/RVC-Boss/GPT-SoVITS/commit/813cf96e508ba1bb2c658f38c7cc77b797fb4082), [Commit#2d1ddeca](https://github.com/RVC-Boss/GPT-SoVITS/commit/2d1ddeca42db90c3fe2d0cd79480fd544d87f02b)
- Content: Fixed an issue where UVR5 automatically jumped out of directories while reading them.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.26 [PR#204](https://github.com/RVC-Boss/GPT-SoVITS/pull/204)
- Content: Added support for mixed Chinese-Japanese and Japanese-English output text.
- Type: Feature
- Contributor: Kakaru Hayate
- 2024.01.26 [Commit#f4148cf7](https://github.com/RVC-Boss/GPT-SoVITS/commit/f4148cf77fb899c22bcdd4e773d2f24ab34a73e7)
- Content: Added an optional segmentation mode for output.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.26 [Commit#9fe955c1](https://github.com/RVC-Boss/GPT-SoVITS/commit/9fe955c1bf5f94546c9f699141281f2661c8a180)
- Content: Fixed inference errors caused by multiple line breaks.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.26 [Commit#84ee4719](https://github.com/RVC-Boss/GPT-SoVITS/commit/84ee471936b332bc2ccee024d6dfdedab4f0dc7b)
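- Content: Automatically forced single precision on GPUs that do not support half precision; CPU inference is also forced to single precision.
- Type: Optimization
- Contributor: RVC-Boss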
- 2024.01.28 [PR#238](https://github.com/RVC-Boss/GPT-SoVITS/pull/238)
- Content: Completed the model download process in the Dockerfile.
- Type: Fix
- Contributor: breakstring
- 2024.01.28 [PR#257](https://github.com/RVC-Boss/GPT-SoVITS/pull/257)
- Content: Fixed an issue where the pronunciation of numbers was converted to Chinese characters.
- Type: Fix
- Contributor: duliangang
- 2024.01.28 [Commit#f0cfe397](https://github.com/RVC-Boss/GPT-SoVITS/commit/f0cfe397089a6fd507d678c71adeaab5e7ed0683)
- Content: Fixed an issue where GPT training did not save checkpoints.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.28 [Commit#b8ae5a27](https://github.com/RVC-Boss/GPT-SoVITS/commit/b8ae5a2761e2654fc0c905498009d3de9de745a8)
- Content: Set limits to exclude unreasonable reference audio lengths.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.28 [Commit#698e9655](https://github.com/RVC-Boss/GPT-SoVITS/commit/698e9655132d194b25b86fbbc99d53c8d2cea2a3)
- Content: Fixed an issue where the first few characters of a sentence were swallowed.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.29 [Commit#ff977a5f](https://github.com/RVC-Boss/GPT-SoVITS/commit/ff977a5f5dc547e0ad82b9e0f1cd95fbc830b2b0)
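- Content: Changed the training configuration to single precision for GPUs that have problems with half-precision training, such as the 16 series.
- Type: Fix
- Contributor: RVC-Boss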
- 2024.01.29 [Commit#172e139f](https://github.com/RVC-Boss/GPT-SoVITS/commit/172e139f45ac26723bc2cf7fac0112f69d6b46ec)
- Content: Tested and updated a working Colab version.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.29 [PR#135](https://github.com/RVC-Boss/GPT-SoVITS/pull/135)
- Content: Updated FunASR to version 1.0 and fixed errors caused by interface mismatches.
- Type: Fix
- Contributor: LauraGPT
- 2024.01.30 [Commit#1c2fa98c](https://github.com/RVC-Boss/GPT-SoVITS/commit/1c2fa98ca8c325dcfb32797d22ff1c2a726d1cb4)
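- Content: Fixed punctuation splitting issues for Chinese, Japanese, and English, and added punctuation at the beginning and end of sentences.
- Type: Fix
- Contributor: RVC-Boss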
- 2024.01.30 [Commit#74409f35](https://github.com/RVC-Boss/GPT-SoVITS/commit/74409f3570fa1c0ff28d4c65c288a6ce58ca00d2)
- Content: Added splitting by punctuation.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.30 [Commit#c42eeccf](https://github.com/RVC-Boss/GPT-SoVITS/commit/c42eeccfdd2d0a0d714ecc8bfc22a12373aca6b7)
- Content: Paths and related strings are now parsed to strip double quotes automatically, so copying a path that includes double quotes no longer causes an error.
- Type: Fix
- Contributor: RVC-Boss
## 202402
- 2024.02.01 [Commit#45f73519](https://github.com/RVC-Boss/GPT-SoVITS/commit/45f73519cc41cd17cf816d8b997a9dcb0bee04b6)
- Content: Fixed a filename saving error caused by a trailing slash in the ASR path.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.03 [Commit#dba1a74c](https://github.com/RVC-Boss/GPT-SoVITS/commit/dba1a74ccb0cf19a1b4eb93faf11d4ec2b1fc5d7)
- Content: Fixed vocal separation failures caused by UVR5 format reading errors.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.03 [Commit#3ebff70b](https://github.com/RVC-Boss/GPT-SoVITS/commit/3ebff70b71580ee1f97b3238c9442cbc5aef47c7)
- Content: Added support for automatic language detection and splitting of mixed Chinese-Japanese-English text.
- Type: Improvement
- Contributor: RVC-Boss
- 2024.02.03 [PR#377](https://github.com/RVC-Boss/GPT-SoVITS/pull/377)
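- Content: Introduced PaddleSpeech text normalization (e.g., handling the xx.xx% notation, reading "元/吨" correctly as "元每吨", and resolving underscore issues).
- Type: Improvement
- Contributor: KamioRinn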
- 2024.02.05 [PR#395](https://github.com/RVC-Boss/GPT-SoVITS/pull/395)
- Content: Optimized English text preprocessing.
- Type: Improvement
- Contributor: KamioRinn
- 2024.02.06 [Commit#65b463a7](https://github.com/RVC-Boss/GPT-SoVITS/commit/65b463a787f31637b4768cc9a47cab59541d3927)
- Content: Fixed reduced Chinese inference accuracy caused by mixed-up language parameters.
- Type: Fix
- Contributor: RVC-Boss
- Related: [Issue#391](https://github.com/RVC-Boss/GPT-SoVITS/issues/391)
- 2024.02.06 [PR#403](https://github.com/RVC-Boss/GPT-SoVITS/pull/403)
- Content: Adapted UVR5 to newer versions of Librosa.
- Type: Fix
- Contributor: StaryLan
- 2024.02.07 [Commit#14a28510](https://github.com/RVC-Boss/GPT-SoVITS/commit/14a285109a521679f8846589c22da8f656a46ad8)
- Content: Fixed the UVR5 "inf everywhere" error (a half-precision inference problem caused by a missing boolean conversion, occurring on 16-series GPUs).
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.07 [Commit#d74f888e](https://github.com/RVC-Boss/GPT-SoVITS/commit/d74f888e7ac86063bfeacef95d0e6ddafe42b3b2)
- Content: Fixed Gradio dependencies.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.07 [PR#400](https://github.com/RVC-Boss/GPT-SoVITS/pull/400)
- Content: Integrated Faster Whisper to add Japanese and English speech recognition.
- Type: Feature
- Contributor: Shadow
- 2024.02.07 [Commit#6469048d](https://github.com/RVC-Boss/GPT-SoVITS/commit/6469048de12a8d6f0bd05d07f031309e61575a38)[Commit#94ee71d9](https://github.com/RVC-Boss/GPT-SoVITS/commit/94ee71d9d562d10c9a1b96e745c6a6575aa66a10)
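- Content: Added automatic reading of paths from the `.list` file when the root directory is left blank in the three-step (3連) formatting.
- Type: Improvement
- Contributor: RVC-Boss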
- 2024.02.08 [Commit#59f35ada](https://github.com/RVC-Boss/GPT-SoVITS/commit/59f35adad85815df27e9c6b33d420f5ebfd8376b)
- Content: Fixed GPT training freezes (on Windows 10 1909) and errors that occurred when the system language was Traditional Chinese.
- Type: Fix
- Contributor: RVC-Boss
- Related: [Issue#232](https://github.com/RVC-Boss/GPT-SoVITS/issues/232)
- 2024.02.12 [PR#457](https://github.com/RVC-Boss/GPT-SoVITS/pull/457)
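- Content: Added an experimental DPO loss training option (mitigates repetition and dropped characters in GPT output by constructing negative samples) and exposed several parameters in the inference interface.
- Type: Feature
- Contributor: liufenghua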
- 2024.02.12 [Commit#2fa74ecb](https://github.com/RVC-Boss/GPT-SoVITS/commit/2fa74ecb941db27d9015583a9be6962898d66730), [Commit#d82f6bbb](https://github.com/RVC-Boss/GPT-SoVITS/commit/d82f6bbb98ba725e6725dcee99b80ce71fb0bf28)
- Content: Optimized speech recognition logic; added mirror-site downloads for Faster Whisper (to avoid HuggingFace connection problems).
- Type: Improvement
- Contributor: RVC-Boss
- 2024.02.15 [Commit#dd2c4d6d](https://github.com/RVC-Boss/GPT-SoVITS/commit/dd2c4d6d7121bf82d29d0f0e4d788f3b231997c8)
- Content: Added support for training with Chinese experiment names.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.15 [Commit#ccb9b08b](https://github.com/RVC-Boss/GPT-SoVITS/commit/ccb9b08be3c58e102defcc94ff4fd609da9e27ee)[Commit#895fde46](https://github.com/RVC-Boss/GPT-SoVITS/commit/895fde46e420040ed26aaf0c5b7e99359d9b199b)
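- Content: Made DPO training optional instead of mandatory (the batch size is halved automatically when it is selected); fixed an issue where new parameters were not passed to the inference interface.
- Type: Improvement
- Contributor: RVC-Boss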
- 2024.02.15 [Commit#7b0c3c67](https://github.com/RVC-Boss/GPT-SoVITS/commit/7b0c3c676495c64b2064aa472bff14b5c06206a5)
- Content: Fixed Chinese text preprocessing errors.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.16 [PR#499](https://github.com/RVC-Boss/GPT-SoVITS/pull/499)
- Content: Added support for input without reference text.
- Type: Feature
- Contributor: Watchtower-Liu
- Related: [Issue#475](https://github.com/RVC-Boss/GPT-SoVITS/issues/475)
- 2024.02.17 [PR#509](https://github.com/RVC-Boss/GPT-SoVITS/pull/509), [PR#507](https://github.com/RVC-Boss/GPT-SoVITS/pull/507), [PR#532](https://github.com/RVC-Boss/GPT-SoVITS/pull/532), [PR#556](https://github.com/RVC-Boss/GPT-SoVITS/pull/556), [PR#559](https://github.com/RVC-Boss/GPT-SoVITS/pull/559)
- Content: Improved Chinese and Japanese preprocessing.
- Type: Improvement
- Contributor: KamioRinn, v3cun
- 2024.02.17 [PR#510](https://github.com/RVC-Boss/GPT-SoVITS/pull/510), [PR#511](https://github.com/RVC-Boss/GPT-SoVITS/pull/511)
- Content: Fixed an issue where the Colab public URL did not start.
- Type: Fix
- Contributor: ChanningWang2018, RVC-Boss
- 2024.02.21 [PR#557](https://github.com/RVC-Boss/GPT-SoVITS/pull/557)
- Content: Changed the macOS inference device from MPS to CPU (CPU inference is faster).
- Type: Improvement
- Contributor: XXXXRT666
- 2024.02.21 [Commit#6da486c1](https://github.com/RVC-Boss/GPT-SoVITS/commit/6da486c15d09e3d99fa42c5e560aaac56b6b4ce1), [Commit#5a171773](https://github.com/RVC-Boss/GPT-SoVITS/commit/5a17177342d2df1e11369f2f4f58d34a3feb1a35)
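- Content: Added an audio denoising option to data preprocessing (it downsamples audio to a 16 kHz sample rate and is not recommended unless the audio is very noisy).
- Type: Feature
- Contributor: RVC-Boss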
- 2024.02.28 [PR#573](https://github.com/RVC-Boss/GPT-SoVITS/pull/573)
- Content: Fixed the `is_half` check so that CPU inference works correctly on macOS.
- Type: Fix
- Contributor: XXXXRT666
- 2024.02.28 [PR#610](https://github.com/RVC-Boss/GPT-SoVITS/pull/610)
- Content: Fixed swapped output folders caused by an incorrect parameter order in UVR5 MDXNet.
- Type: Fix
- Contributor: Yuze Wang
## 202403
- 2024.03.06 [PR#675](https://github.com/RVC-Boss/GPT-SoVITS/pull/675)
- Content: Enabled automatic CPU inference for Faster Whisper when CUDA is unavailable.
- Type: Improvement
- Contributor: ShiroDoMain
- 2024.03.06 [Commit#616be20d](https://github.com/RVC-Boss/GPT-SoVITS/commit/616be20db3cf94f1cd663782fea61b2370704193)
- Content: When Faster Whisper is used for non-Chinese ASR, the Chinese FunASR model no longer needs to be downloaded beforehand.
- Type: Improvement
- Contributor: RVC-Boss
- 2024.03.09 [PR#672](https://github.com/RVC-Boss/GPT-SoVITS/pull/672)
- Content: Improved inference speed by 50% (verified on RTX3090 + PyTorch 2.2.1 + CU11.8 + Win10 + Py39).
- Type: Improvement
- Contributor: GoHomeToMacDonal
- 2024.03.10 [PR#721](https://github.com/RVC-Boss/GPT-SoVITS/pull/721)
- Content: Added the fast inference branch `fast_inference_`.
- Type: Feature
- Contributor: ChasonJiang
- 2024.03.13 [PR#761](https://github.com/RVC-Boss/GPT-SoVITS/pull/761)
- Content: Added support for CPU training (training with the CPU is now possible on macOS).
- Type: Feature
- Contributor: Lion-Wu
- 2024.03.19 [PR#804](https://github.com/RVC-Boss/GPT-SoVITS/pull/804), [PR#812](https://github.com/RVC-Boss/GPT-SoVITS/pull/812), [PR#821](https://github.com/RVC-Boss/GPT-SoVITS/pull/821)
- Content: Optimized the English text frontend.
- Type: Improvement
- Contributor: KamioRinn
- 2024.03.30 [PR#894](https://github.com/RVC-Boss/GPT-SoVITS/pull/894)
- Content: Improved the API format.
- Type: Improvement
- Contributor: KamioRinn
## 202404
- 2024.04.03 [PR#917](https://github.com/RVC-Boss/GPT-SoVITS/pull/917)
- Content: Fixed FFmpeg command string formatting in the UVR5 WebUI.
- Type: Fix
- Contributor: StaryLan
## 202405
- 2024.05.02 [PR#953](https://github.com/RVC-Boss/GPT-SoVITS/pull/953)
- Content: Fixed an issue where VQ was not frozen during SoVITS training (which could degrade quality).
- Type: Fix
- Contributor: hcwu1993
- Related: [Issue#747](https://github.com/RVC-Boss/GPT-SoVITS/issues/747)
- 2024.05.19 [PR#1102](https://github.com/RVC-Boss/GPT-SoVITS/pull/1102)
- Content: Added an error prompt for unsupported languages during training data processing.
- Type: Improvement
- Contributor: StaryLan
- 2024.05.27 [PR#1132](https://github.com/RVC-Boss/GPT-SoVITS/pull/1132)
- Content: Fixed a bug in HuBERT extraction.
- Type: Fix
- Contributor: XXXXRT666
## 202406
- 2024.06.06 [Commit#](https://github.com/RVC-Boss/GPT-SoVITS/commit/99f09c8bdc155c1f4272b511940717705509582a)
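- Content: Fixed an issue where BERT features of Chinese input text could not be read during GPT fine-tuning in the WebUI (which caused a mismatch with inference and could degrade quality).
**Note: If you have already fine-tuned with a large amount of data, re-tuning the model is recommended to improve quality.**
- Type: Fix
- Contributor: RVC-Boss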
- 2024.06.07 [PR#1159](https://github.com/RVC-Boss/GPT-SoVITS/pull/1159)
- Content: Fixed the progress bar handling for SoVITS training in `s2_train.py`.
- Type: Fix
- Contributor: pengzhendong
- 2024.06.10 [Commit#501a74ae](https://github.com/RVC-Boss/GPT-SoVITS/commit/501a74ae96789a26b48932babed5eb4e9483a232)
- Content: Fixed the string formatting used when UVR5 MDXNet calls FFmpeg (paths containing spaces are now supported).
- Type: Fix
- Contributor: RVC-Boss
- 2024.06.10 [PR#1168](https://github.com/RVC-Boss/GPT-SoVITS/pull/1168), [PR#1169](https://github.com/RVC-Boss/GPT-SoVITS/pull/1169)
- Content: Improved the handling logic for text input consisting only of punctuation or of multiple punctuation marks.
- Type: Fix
- Contributor: XXXXRT666
- Related: [Issue#1165](https://github.com/RVC-Boss/GPT-SoVITS/issues/1165)
- 2024.06.13 [Commit#db506705](https://github.com/RVC-Boss/GPT-SoVITS/commit/db50670598f0236613eefa6f2d5a23a271d82041)
- Content: Fixed an issue where the default batch size was a decimal (non-integer) value in CPU inference.
- Type: Fix
- Contributor: RVC-Boss
- 2024.06.28 [PR#1258](https://github.com/RVC-Boss/GPT-SoVITS/pull/1258), [PR#1265](https://github.com/RVC-Boss/GPT-SoVITS/pull/1265), [PR#1267](https://github.com/RVC-Boss/GPT-SoVITS/pull/1267)
- Content: Fixed an issue where an exception during denoising or ASR terminated processing of all pending audio files.
- Type: Fix
- Contributor: XXXXRT666
- 2024.06.29 [Commit#a208698e](https://github.com/RVC-Boss/GPT-SoVITS/commit/a208698e775155efc95b187b746d153d0f2847ca)
- Content: Fixed the multi-process save logic in multi-GPU training.
- Type: Fix
- Contributor: RVC-Boss
- 2024.06.29 [PR#1251](https://github.com/RVC-Boss/GPT-SoVITS/pull/1251)
- Content: Removed the redundant `my_utils.py`.
- Type: Optimization
- Contributor: aoguai
- Related: [Issue#1189](https://github.com/RVC-Boss/GPT-SoVITS/issues/1189)
## 202407
- 2024.07.06 [PR#1253](https://github.com/RVC-Boss/GPT-SoVITS/pull/1253)
- Content: Fixed an issue where decimal points were split during punctuation-based splitting.
- Type: Fix
- Contributor: aoguai
- 2024.07.06 [Commit#](https://github.com/RVC-Boss/GPT-SoVITS/commit/b0786f2998f1b2fce6678434524b4e0e8cc716f5)
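- Content: The accelerated inference code has been verified and merged into the main branch, ensuring inference results equivalent to the base. Accelerated inference is also supported in no-reference-text mode.
- Type: Optimization
- Contributor: RVC-Boss, GoHomeToMacDonal
- Related: [PR#672](https://github.com/RVC-Boss/GPT-SoVITS/pull/672)
- Consistency of changes on the `fast_inference` branch will continue to be verified.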
- 2024.07.13 [PR#1294](https://github.com/RVC-Boss/GPT-SoVITS/pull/1294), [PR#1298](https://github.com/RVC-Boss/GPT-SoVITS/pull/1298)
- Content: Refactored i18n scanning and updated the multilingual configuration files.
- Type: Documentation
- Contributor: StaryLan
- 2024.07.13 [PR#1299](https://github.com/RVC-Boss/GPT-SoVITS/pull/1299)
- Content: Fixed an issue where trailing slashes in user file paths caused command line errors.
- Type: Fix
- Contributor: XXXXRT666
- 2024.07.19 [PR#756](https://github.com/RVC-Boss/GPT-SoVITS/pull/756)
- Content: Fixed the inconsistency in training steps when using a custom `bucket_sampler` during GPT training.
- Type: Fix
- Contributor: huangxu1991
- 2024.07.23 [Commit#9588a3c5](https://github.com/RVC-Boss/GPT-SoVITS/commit/9588a3c52d9ebdb20b3c5d74f647d12e7c1171c2), [PR#1340](https://github.com/RVC-Boss/GPT-SoVITS/pull/1340)
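- Content: Added support for adjusting speech speed during synthesis (including an option to freeze randomness and control only the speed). This feature has also been added to `api.py`.
- Type: Feature
- Contributor: RVC-Boss, 红血球AE3803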
- 2024.07.27 [PR#1306](https://github.com/RVC-Boss/GPT-SoVITS/pull/1306), [PR#1356](https://github.com/RVC-Boss/GPT-SoVITS/pull/1356)
- Content: Added support for the BS-RoFormer vocal/accompaniment separation model.
- Type: Feature
- Contributor: KamioRinn
- 2024.07.27 [PR#1351](https://github.com/RVC-Boss/GPT-SoVITS/pull/1351)
- Content: Improved the Chinese text frontend.
- Type: Feature
- Contributor: KamioRinn
## 202408 (V2 Version)
- 2024.08.01 [PR#1355](https://github.com/RVC-Boss/GPT-SoVITS/pull/1355)
- Content: Paths are now filled in automatically when processing files in the WebUI.
- Type: Chore
- Contributor: XXXXRT666
- 2024.08.01 [Commit#e62e9653](https://github.com/RVC-Boss/GPT-SoVITS/commit/e62e965323a60a76a025bcaa45268c1ddcbcf05c)
- Content: Enabled FP16 inference support for BS-RoFormer.
- Type: Performance Optimization
- Contributor: RVC-Boss
- 2024.08.01 [Commit#bce451a2](https://github.com/RVC-Boss/GPT-SoVITS/commit/bce451a2d1641e581e200297d01f219aeaaf7299), [Commit#4c8b7612](https://github.com/RVC-Boss/GPT-SoVITS/commit/4c8b7612206536b8b4435997acb69b25d93acb78)
- Content: Optimized GPU recognition logic and added user-friendly handling of arbitrary GPU indices entered by users.
- Type: Chore
- Contributor: RVC-Boss
- 2024.08.02 [Commit#ff6c193f](https://github.com/RVC-Boss/GPT-SoVITS/commit/ff6c193f6fb99d44eea3648d82ebcee895860a22)~[Commit#de7ee7c7](https://github.com/RVC-Boss/GPT-SoVITS/commit/de7ee7c7c15a2ec137feb0693b4ff3db61fad758)
- Content: **Added the GPT-SoVITS V2 model.**
- Type: Feature
- Contributor: RVC-Boss
- 2024.08.03 [Commit#8a101474](https://github.com/RVC-Boss/GPT-SoVITS/commit/8a101474b5a4f913b4c94fca2e3ca87d0771bae3)
- Content: Added Cantonese ASR support using FunASR.
- Type: Feature
- Contributor: RVC-Boss
- 2024.08.03 [PR#1387](https://github.com/RVC-Boss/GPT-SoVITS/pull/1387), [PR#1388](https://github.com/RVC-Boss/GPT-SoVITS/pull/1388)
- Content: Optimized UI and timing logic.
- Type: Chore
- Contributor: XXXXRT666
- 2024.08.06 [PR#1404](https://github.com/RVC-Boss/GPT-SoVITS/pull/1404), [PR#987](https://github.com/RVC-Boss/GPT-SoVITS/pull/987), [PR#488](https://github.com/RVC-Boss/GPT-SoVITS/pull/488)
- Content: Optimized polyphonic character handling logic (V2 only).
- Type: Fix, Feature
- Contributor: KamioRinn, RVC-Boss
- 2024.08.13 [PR#1422](https://github.com/RVC-Boss/GPT-SoVITS/pull/1422)
- Content: Fixed a bug where only one reference audio could be uploaded; added dataset validation that shows warning popups for missing files.
- Type: Fix, Chore
- Contributor: XXXXRT666
- 2024.08.20 [Issue#1508](https://github.com/RVC-Boss/GPT-SoVITS/issues/1508)
- Content: The upstream LangSegment library now supports optimizing numbers, phone numbers, dates, and times using SSML tags.
- Type: Feature
- Contributor: juntaosun
- 2024.08.20 [PR#1503](https://github.com/RVC-Boss/GPT-SoVITS/pull/1503)
- Content: Fixed and optimized the API.
- Type: Fix
- Contributor: KamioRinn
- 2024.08.20 [PR#1490](https://github.com/RVC-Boss/GPT-SoVITS/pull/1490)
- Content: Merged the `fast_inference` branch into the main branch.
- Type: Refactor
- Contributor: ChasonJiang
- 2024.08.21 **Officially released GPT-SoVITS V2.**
## 202502 (V3 Version)
- 2025.02.11 [Commit#ed207c4b](https://github.com/RVC-Boss/GPT-SoVITS/commit/ed207c4b879d5296e9be3ae5f7b876729a2c43b8)~[Commit#6e2b4918](https://github.com/RVC-Boss/GPT-SoVITS/commit/6e2b49186c5b961f0de41ea485d398dffa9787b4)
- Content: **Added the GPT-SoVITS V3 model, which requires 14GB of VRAM for fine-tuning.**
- Type: Feature (see the [Wiki](https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v3%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)))
- Contributor: RVC-Boss
- 2025.02.12 [PR#2032](https://github.com/RVC-Boss/GPT-SoVITS/pull/2032)
- Content: Updated multilingual project documentation.
- Type: Documentation
- Contributor: StaryLan
- 2025.02.12 [PR#2033](https://github.com/RVC-Boss/GPT-SoVITS/pull/2033)
- Content: Updated Japanese documentation.
- Type: Documentation
- Contributor: Fyphen
- 2025.02.12 [PR#2010](https://github.com/RVC-Boss/GPT-SoVITS/pull/2010)
- Content: Optimized attention calculation logic.
- Type: Performance Optimization
- Contributor: wzy3650
- 2025.02.12 [PR#2040](https://github.com/RVC-Boss/GPT-SoVITS/pull/2040)
- Content: Added gradient checkpointing support for fine-tuning, which requires 12GB of VRAM.
- Type: Feature
- Contributor: Kakaru Hayate
- 2025.02.14 [PR#2047](https://github.com/RVC-Boss/GPT-SoVITS/pull/2047), [PR#2062](https://github.com/RVC-Boss/GPT-SoVITS/pull/2062), [PR#2073](https://github.com/RVC-Boss/GPT-SoVITS/pull/2073)
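- Content: Switched to a new language segmentation tool, improved the splitting strategy for mixed multilingual text, and optimized number and English processing logic.
- Type: Feature
- Contributor: KamioRinn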
- 2025.02.23 [Commit#56509a17](https://github.com/RVC-Boss/GPT-SoVITS/commit/56509a17c918c8d149c48413a672b8ddf437495b)~[Commit#514fb692](https://github.com/RVC-Boss/GPT-SoVITS/commit/514fb692db056a06ed012bc3a5bca2a5b455703e)
- Description: **The GPT-SoVITS V3 model now supports LoRA training. Fine-tuning requires 8 GB of GPU memory.**
- Type: Feature
- Contributor: RVC-Boss
- 2025.02.23 [PR#2078](https://github.com/RVC-Boss/GPT-SoVITS/pull/2078)
- Description: Added support for the Mel Band Roformer model for vocal and instrument separation.
- Type: Feature
- Contributor: Sucial
- 2025.02.26 [PR#2112](https://github.com/RVC-Boss/GPT-SoVITS/pull/2112), [PR#2114](https://github.com/RVC-Boss/GPT-SoVITS/pull/2114)
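- Description: Fixed a MeCab error that occurred under paths containing Chinese characters (MeCab is used for Japanese/Korean or multilingual text splitting).
- Type: Fix
- Contributor: KamioRinn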
- 2025.02.27 [Commit#92961c3f](https://github.com/RVC-Boss/GPT-SoVITS/commit/92961c3f68b96009ff2cd00ce614a11b6c4d026f)~[Commit#250b1c73](https://github.com/RVC-Boss/GPT-SoVITS/commit/250b1c73cba60db18148b21ec5fbce01fd9d19bc)
- Description: **Added a 24 kHz to 48 kHz audio super-resolution model** to alleviate the "muffled" sound of 24 kHz audio generated by the V3 model.
- Type: Feature
- Contributor: RVC-Boss
- Related: [Issue#2085](https://github.com/RVC-Boss/GPT-SoVITS/issues/2085), [Issue#2117](https://github.com/RVC-Boss/GPT-SoVITS/issues/2117)
- 2025.02.28 [PR#2123](https://github.com/RVC-Boss/GPT-SoVITS/pull/2123)
- Description: Updated the multilingual project documentation.
- Type: Documentation
- Contributor: StaryLan
- 2025.02.28 [PR#2122](https://github.com/RVC-Boss/GPT-SoVITS/pull/2122)
- Description: Applied rule-based detection to short CJK strings that the model cannot identify.
- Type: Fix
- Contributor: KamioRinn
- Related: [Issue#2116](https://github.com/RVC-Boss/GPT-SoVITS/issues/2116)
- 2025.02.28 [Commit#c38b1690](https://github.com/RVC-Boss/GPT-SoVITS/commit/c38b16901978c1db79491e16905ea3a37a7cf686), [Commit#a32a2b89](https://github.com/RVC-Boss/GPT-SoVITS/commit/a32a2b893436fad56cc82409121c7fa36a1815d5)
- Description: Added a speech rate parameter to control synthesis speed.
- Type: Fix
- Contributor: RVC-Boss
- 2025.02.28 **GPT-SoVITS V3 officially released.**
## 202503
- 2025.03.31 [PR#2236](https://github.com/RVC-Boss/GPT-SoVITS/pull/2236)
- Description: Fixed issues caused by incorrect dependency versions.
- Type: Fix
- Contributor: XXXXRT666
- Related:
- PyOpenJTalk: [Issue#1131](https://github.com/RVC-Boss/GPT-SoVITS/issues/1131), [Issue#2231](https://github.com/RVC-Boss/GPT-SoVITS/issues/2231), [Issue#2233](https://github.com/RVC-Boss/GPT-SoVITS/issues/2233).
- ONNX: [Issue#492](https://github.com/RVC-Boss/GPT-SoVITS/issues/492), [Issue#671](https://github.com/RVC-Boss/GPT-SoVITS/issues/671), [Issue#1192](https://github.com/RVC-Boss/GPT-SoVITS/issues/1192), [Issue#1819](https://github.com/RVC-Boss/GPT-SoVITS/issues/1819), [Issue#1841](https://github.com/RVC-Boss/GPT-SoVITS/issues/1841).
- Pydantic: [Issue#2230](https://github.com/RVC-Boss/GPT-SoVITS/issues/2230), [Issue#2239](https://github.com/RVC-Boss/GPT-SoVITS/issues/2239).
- PyTorch-Lightning: [Issue#2174](https://github.com/RVC-Boss/GPT-SoVITS/issues/2174).
- 2025.03.31 [PR#2241](https://github.com/RVC-Boss/GPT-SoVITS/pull/2241)
- Description: **Enabled parallel inference for SoVITS v3.**
- Type: Feature
- Contributor: ChasonJiang
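- Fixed other minor bugs.
- Integrated-package fixes for ONNX Runtime GPU inference support:
- Type: Fix
- Details:
- Switched the ONNX model inside G2PW from CPU to GPU inference, greatly reducing the CPU bottleneck;
- The foxjoy dereverberation model now supports GPU inference.
## 202504 (V4 Version)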
- 2025.04.01 [Commit#6a60e5ed](https://github.com/RVC-Boss/GPT-SoVITS/commit/6a60e5edb1817af4a61c7a5b196c0d0f1407668f)
- Description: Unlocked SoVITS v3 parallel inference. Fixed the asynchronous model loading logic.
- Type: Fix
- Contributor: RVC-Boss
- 2025.04.07 [PR#2255](https://github.com/RVC-Boss/GPT-SoVITS/pull/2255)
- Description: Formatted the code with Ruff. Updated the G2PW link.
- Type: Style
- Contributor: XXXXRT666
- 2025.04.15 [PR#2290](https://github.com/RVC-Boss/GPT-SoVITS/pull/2290)
- Description: Cleaned up the documentation. Added Python 3.11 support. Updated the installer.
- Type: Chore
- Contributor: XXXXRT666
- 2025.04.20 [PR#2300](https://github.com/RVC-Boss/GPT-SoVITS/pull/2300)
- Description: Updated Colab, the installation files, and model downloads.
- Type: Chore
- Contributor: XXXXRT666
- 2025.04.20 [Commit#e0c452f0](https://github.com/RVC-Boss/GPT-SoVITS/commit/e0c452f0078e8f7eb560b79a54d75573fefa8355)~[Commit#9d481da6](https://github.com/RVC-Boss/GPT-SoVITS/commit/9d481da610aa4b0ef8abf5651fd62800d2b4e8bf)
- Description: **Added the GPT-SoVITS V4 model.**
- Type: Feature
- Contributor: RVC-Boss
- 2025.04.21 [Commit#8b394a15](https://github.com/RVC-Boss/GPT-SoVITS/commit/8b394a15bce8e1d85c0b11172442dbe7a6017ca2)~[Commit#bc2fe5ec](https://github.com/RVC-Boss/GPT-SoVITS/commit/bc2fe5ec86536c77bb3794b4be263ac87e4fdae6), [PR#2307](https://github.com/RVC-Boss/GPT-SoVITS/pull/2307)
- Description: Enabled parallel inference for V4.
- Type: Feature
- Contributor: RVC-Boss, ChasonJiang
- 2025.04.22 [Commit#7405427a](https://github.com/RVC-Boss/GPT-SoVITS/commit/7405427a0ab2a43af63205df401fd6607a408d87)~[Commit#590c83d7](https://github.com/RVC-Boss/GPT-SoVITS/commit/590c83d7667c8d4908f5bdaf2f4c1ba8959d29ff), [PR#2309](https://github.com/RVC-Boss/GPT-SoVITS/pull/2309)
- Description: Fixed the passing of the model version parameter.
- Type: Fix
- Contributor: RVC-Boss, ChasonJiang
- 2025.04.22 [Commit#fbdab94e](https://github.com/RVC-Boss/GPT-SoVITS/commit/fbdab94e17d605d85841af6f94f40a45976dd1d9), [PR#2310](https://github.com/RVC-Boss/GPT-SoVITS/pull/2310)
- Description: Fixed a version mismatch between Numpy and Numba. Updated the librosa version.
- Type: Fix
- Contributor: RVC-Boss, XXXXRT666
- Related: [Issue#2308](https://github.com/RVC-Boss/GPT-SoVITS/issues/2308)
- **2025.04.22 GPT-SoVITS V4 officially released.**
- 2025.04.22 [PR#2311](https://github.com/RVC-Boss/GPT-SoVITS/pull/2311)
- Description: Updated Gradio parameters.
- Type: Chore
- Contributor: XXXXRT666
- 2025.04.25 [PR#2322](https://github.com/RVC-Boss/GPT-SoVITS/pull/2322)
- Description: Improved the Colab/Kaggle notebook scripts.
- Type: Chore
- Contributor: XXXXRT666
## 202505
- 2025.05.26 [PR#2351](https://github.com/RVC-Boss/GPT-SoVITS/pull/2351)
- Description: Improved the Docker and Windows automated build scripts. Added pre-commit formatting.
- Type: Chore
- Contributor: XXXXRT666
- 2025.05.26 [PR#2408](https://github.com/RVC-Boss/GPT-SoVITS/pull/2408)
- Description: Optimized the multilingual text splitting and recognition logic.
- Type: Fix
- Contributor: KamioRinn
- Related: [Issue#2404](https://github.com/RVC-Boss/GPT-SoVITS/issues/2404)
- 2025.05.26 [PR#2377](https://github.com/RVC-Boss/GPT-SoVITS/pull/2377)
- Description: Implemented a caching strategy that speeds up SoVITS V3/V4 inference by 10%.
- Type: Performance Optimization
- Contributor: Kakaru Hayate
- 2025.05.26 [Commit#4d9d56b1](https://github.com/RVC-Boss/GPT-SoVITS/commit/4d9d56b19638dc434d6eefd9545e4d8639a3e072), [Commit#8c705784](https://github.com/RVC-Boss/GPT-SoVITS/commit/8c705784c50bf438c7b6d0be33a9e5e3cb90e6b2), [Commit#fafe4e7f](https://github.com/RVC-Boss/GPT-SoVITS/commit/fafe4e7f120fba56c5f053c6db30aa675d5951ba)
- Description: Updated the annotation interface and added the following notice: after editing each page, be sure to click "Submit Text"; otherwise the changes will not be saved.
- Type: Fix
- Contributor: RVC-Boss
- 2025.05.29 [Commit#1934fc1e](https://github.com/RVC-Boss/GPT-SoVITS/commit/1934fc1e1b22c4c162bba1bbe7d7ebb132944cdc)
- Description: Fixed errors in the UVR5 and ONNX dereverberation models. Resolved an issue that occurred when FFmpeg encodes MP3/M4A files whose original path contains spaces.
- Type: Fix
- Contributor: RVC-Boss
## 202506 (V2Pro Series)
- 2025.06.03 [PR#2420](https://github.com/RVC-Boss/GPT-SoVITS/pull/2420)
- Description: Updated the project's multilingual documentation.
- Type: Documentation
- Contributor: StaryLan
- 2025.06.04 [PR#2417](https://github.com/RVC-Boss/GPT-SoVITS/pull/2417)
- Description: Added V4 model export using TorchScript.
- Type: Feature
- Contributor: L-jasmine
- 2025.06.04 [Commit#b7c0c5ca](https://github.com/RVC-Boss/GPT-SoVITS/commit/b7c0c5ca878bcdd419fd86bf80dba431a6653356)〜[Commit#298ebb03](https://github.com/RVC-Boss/GPT-SoVITS/commit/298ebb03c5a719388527ae6a586c7ea960344e70)
- Description: Officially introduced the GPT-SoVITS V2Pro series models (V2Pro, V2ProPlus).
- Type: Feature
- Contributor: RVC-Boss
- 2025.06.05 [PR#2426](https://github.com/RVC-Boss/GPT-SoVITS/pull/2426)
- Description: Fixed an error when initializing `config/inference_webui`.
- Type: Fix
- Contributor: StaryLan
- 2025.06.05 [PR#2427](https://github.com/RVC-Boss/GPT-SoVITS/pull/2427), [Commit#7d70852a](https://github.com/RVC-Boss/GPT-SoVITS/commit/7d70852a3f67c3b52e3a62857f8663d529efc8cd), [PR#2434](https://github.com/RVC-Boss/GPT-SoVITS/pull/2434)
- Description: Optimized the automatic precision detection logic and added collapsible sections to the WebUI frontend modules.
- Type: Feature
- Contributor: XXXXRT666, RVC-Boss

View File

@@ -0,0 +1,452 @@
<div align="center">
<h1>GPT-SoVITS-WebUI</h1>
A powerful few-shot voice conversion and text-to-speech WebUI.<br><br>
[![madewithlove](https://img.shields.io/badge/made_with-%E2%9D%A4-red?style=for-the-badge&labelColor=orange)](https://github.com/RVC-Boss/GPT-SoVITS)
<a href="https://trendshift.io/repositories/7033" target="_blank"><img src="https://trendshift.io/api/badge/repositories/7033" alt="RVC-Boss%2FGPT-SoVITS | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
[![Python](https://img.shields.io/badge/python-3.10--3.12-blue?style=for-the-badge&logo=python)](https://www.python.org)
[![GitHub release](https://img.shields.io/github/v/release/RVC-Boss/gpt-sovits?style=for-the-badge&logo=github)](https://github.com/RVC-Boss/gpt-sovits/releases)
[![Train In Colab](https://img.shields.io/badge/Colab-Training-F9AB00?style=for-the-badge&logo=googlecolab)](https://colab.research.google.com/github/RVC-Boss/GPT-SoVITS/blob/main/Colab-WebUI.ipynb)
[![Huggingface](https://img.shields.io/badge/免费在线体验-free_online_demo-yellow.svg?style=for-the-badge&logo=huggingface)](https://lj1995-gpt-sovits-proplus.hf.space/)
[![Image Size](https://img.shields.io/docker/image-size/xxxxrt666/gpt-sovits/latest?style=for-the-badge&logo=docker)](https://hub.docker.com/r/xxxxrt666/gpt-sovits)
[![简体中文](https://img.shields.io/badge/简体中文-阅读文档-blue?style=for-the-badge&logo=googledocs&logoColor=white)](https://www.yuque.com/baicaigongchang1145haoyuangong/ib3g1e)
[![English](https://img.shields.io/badge/English-Read%20Docs-blue?style=for-the-badge&logo=googledocs&logoColor=white)](https://rentry.co/GPT-SoVITS-guide#/)
[![Change Log](https://img.shields.io/badge/Change%20Log-View%20Updates-blue?style=for-the-badge&logo=googledocs&logoColor=white)](https://github.com/RVC-Boss/GPT-SoVITS/blob/main/docs/en/Changelog_EN.md)
[![License](https://img.shields.io/badge/LICENSE-MIT-green.svg?style=for-the-badge&logo=opensourceinitiative)](https://github.com/RVC-Boss/GPT-SoVITS/blob/main/LICENSE)
[**English**](../../README.md) | [**中文简体**](../cn/README.md) | **日本語** | [**한국어**](../ko/README.md) | [**Türkçe**](../tr/README.md)
</div>
---
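## Features:
1. **Zero-Shot TTS:** Provide a voice sample of only 5 seconds and get instant text-to-speech in that voice.
2. **Few-Shot TTS:** Fine-tune the model with as little as 1 minute of training data to improve voice quality.
3. **Multilingual Support:** Currently supports English, Japanese, Korean, Cantonese, and Chinese.
4. **WebUI Tools:** The integrated tools include vocal/accompaniment (e.g. BGM) separation, automatic training-set segmentation, Chinese ASR, and text labeling, so even beginners can easily build training datasets and train GPT/SoVITS models.
**Check out the [demo video](https://www.bilibili.com/video/BV12g4y1m7Uw)!**
Demo of few-shot fine-tuning on unseen speakers: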
https://github.com/RVC-Boss/GPT-SoVITS/assets/129054828/05bee1fa-bdd8-4d85-9350-80c060ab47fb
**User manual: [简体中文](https://www.yuque.com/baicaigongchang1145haoyuangong/ib3g1e) | [English](https://rentry.co/GPT-SoVITS-guide#/)**
## Installation
### Tested Environments
| Python Version | PyTorch Version | Device |
| -------------- | ---------------- | ------------- |
| Python 3.10 | PyTorch 2.5.1 | CUDA 12.4 |
| Python 3.11 | PyTorch 2.5.1 | CUDA 12.4 |
| Python 3.11 | PyTorch 2.7.0 | CUDA 12.8 |
| Python 3.9 | PyTorch 2.8.0dev | CUDA 12.8 |
| Python 3.9 | PyTorch 2.5.1 | Apple silicon |
| Python 3.11 | PyTorch 2.7.0 | Apple silicon |
| Python 3.9 | PyTorch 2.2.2 | CPU |
### Windows
Windows users (tested on Windows 10 and later): [download the integrated package](https://huggingface.co/lj1995/GPT-SoVITS-windows-package/resolve/main/GPT-SoVITS-v3lora-20250228.7z?download=true), unzip it, and double-click _go-webui.bat_ to start GPT-SoVITS-WebUI.
### Linux
```bash
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
bash install.sh --device <CU126|CU128|ROCM|CPU> --source <HF|HF-Mirror|ModelScope> [--download-uvr5]
```
### macOS
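**Note: Models trained with the GPU on a Mac are of significantly lower quality than models trained on other devices, so for now we strongly recommend training on the CPU.**
Run the following commands to install this project: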
```bash
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
bash install.sh --device <MPS|CPU> --source <HF|HF-Mirror|ModelScope> [--download-uvr5]
```
### Manual Installation
#### Install Dependencies
```bash
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
pip install -r extra-req.txt --no-deps
pip install -r requirements.txt
```
#### Install FFmpeg
##### Conda Users
```bash
conda activate GPTSoVits
conda install ffmpeg
```
##### Ubuntu/Debian Users
```bash
sudo apt install ffmpeg
sudo apt install libsox-dev
```
##### Windows Users
Download [ffmpeg.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffmpeg.exe) and [ffprobe.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffprobe.exe) and place them in the GPT-SoVITS root folder.
Install the [Visual Studio 2017](https://aka.ms/vs/17/release/vc_redist.x86.exe) environment.
##### MacOS Users
```bash
brew install ffmpeg
```
### Running GPT-SoVITS with Docker
#### Docker Image Selection
The codebase is updated frequently while Docker images are released more slowly, so please:
- Check [Docker Hub](https://hub.docker.com/r/xxxxrt666/gpt-sovits) for the latest image tags
- Choose an image tag that suits your environment
- `Lite` means the Docker image does **not include** the ASR models or the UVR5 models. Download the UVR5 models manually; the program downloads the ASR models automatically as needed
- When Docker Compose runs, the image for the matching architecture (amd64 or arm64) is pulled automatically
- Docker Compose mounts **all files** in the current directory. Before using the Docker image, switch to the project root directory and **update the code to the latest version**
- Optional: to pick up the latest changes, you can build the image locally with the provided Dockerfile
#### Environment Variables
- `is_half`: controls whether half precision (fp16) is used. If your GPU supports it, setting this to `true` reduces memory usage
#### Shared Memory Configuration
On Windows (Docker Desktop), the default shared memory size is small and may cause unexpected behavior. We recommend increasing `shm_size` in the Docker Compose file (for example, to `16g`)
#### Choosing a Service
The `docker-compose.yaml` file defines two types of services:
- `GPT-SoVITS-CU126` and `GPT-SoVITS-CU128`: the full version with all features
- `GPT-SoVITS-CU126-Lite` and `GPT-SoVITS-CU128-Lite`: a lightweight version with fewer dependencies
To run a specific service with Docker Compose, use the following command:
```bash
docker compose run --service-ports <GPT-SoVITS-CU126-Lite|GPT-SoVITS-CU128-Lite|GPT-SoVITS-CU126|GPT-SoVITS-CU128>
```
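The service names above follow a regular pattern (CUDA tag plus an optional `-Lite` suffix), so a launcher script can derive the name instead of hard-coding it. The following is a minimal sketch; the helper names `compose_service` and `compose_run_command` are hypothetical and are not part of the project.

```python
from typing import List

# Map the CUDA versions accepted by `docker_build.sh --cuda` to the tags
# used in the service names listed above.
_CUDA_TAGS = {"12.6": "CU126", "12.8": "CU128"}


def compose_service(cuda: str, lite: bool = False) -> str:
    """Return the docker-compose service name for a CUDA version (hypothetical helper)."""
    if cuda not in _CUDA_TAGS:
        raise ValueError(f"unsupported CUDA version {cuda!r}; expected one of {sorted(_CUDA_TAGS)}")
    name = f"GPT-SoVITS-{_CUDA_TAGS[cuda]}"
    return name + "-Lite" if lite else name


def compose_run_command(cuda: str, lite: bool = False) -> List[str]:
    """Build the argument list for `docker compose run --service-ports <service>`."""
    return ["docker", "compose", "run", "--service-ports", compose_service(cuda, lite)]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(compose_run_command("12.8", lite=True)))
```

The returned list can be passed to `subprocess.run` as is, which avoids shell quoting issues.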
#### Building the Docker Image Locally
To build the image yourself, use the following command:
```bash
bash docker_build.sh --cuda <12.6|12.8> [--lite]
```
#### Accessing the Running Container (Bash Shell)
If the container is running in the background, you can open a shell in it with the following command:
```bash
docker exec -it <GPT-SoVITS-CU126-Lite|GPT-SoVITS-CU128-Lite|GPT-SoVITS-CU126|GPT-SoVITS-CU128> bash
```
## Pretrained Models
**If `install.sh` ran successfully, you may skip No.1,2,3.**
1. Download the pretrained models from [GPT-SoVITS Models](https://huggingface.co/lj1995/GPT-SoVITS) and place them in the `GPT_SoVITS/pretrained_models` directory.
2. Download the model from [G2PWModel.zip(HF)](https://huggingface.co/XXXXRT/GPT-SoVITS-Pretrained/resolve/main/G2PWModel.zip)| [G2PWModel.zip(ModelScope)](https://www.modelscope.cn/models/XXXXRT/GPT-SoVITS-Pretrained/resolve/master/G2PWModel.zip), unzip it, rename it to `G2PWModel`, and place it in the `GPT_SoVITS/text` directory. (Chinese TTS only)
3. For UVR5 (vocal/accompaniment (e.g. BGM) separation and reverb removal, an additional feature), download the models from [UVR5 Weights](https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main/uvr5_weights) and place them in the `tools/uvr5/uvr5_weights` directory.
- To use bs_roformer or mel_band_roformer models with UVR5, you can manually download the model and its configuration file and place them in the `tools/uvr5/uvr5_weights` folder. **Make sure the model file and the configuration file have the same name apart from the extension.** In addition, the model and configuration file names **must contain "roformer"** so that the model is recognized as a roformer-class model.
- We recommend **stating the model type directly** in the model and configuration file names, for example mel_band_roformer or bs_roformer. If it is not stated, the model type is determined by matching features in the configuration file. For example, the model `bs_roformer_ep_368_sdr_12.9628.ckpt` and its configuration file `bs_roformer_ep_368_sdr_12.9628.yaml` are a pair. Likewise, `kim_mel_band_roformer.ckpt` and `kim_mel_band_roformer.yaml` are a pair.
4. For Chinese ASR (an additional feature), download the models from [Damo ASR Model](https://modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/files), [Damo VAD Model](https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/files), and [Damo Punc Model](https://modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/files) and place them in the `tools/asr/models` directory.
5. For English or Japanese ASR (an additional feature), download the model from [Faster Whisper Large V3](https://huggingface.co/Systran/faster-whisper-large-v3) and place it in the `tools/asr/models` directory. [Other models](https://huggingface.co/Systran) may give comparable quality at a smaller size.
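The roformer naming rules in item 3 can be checked before launching UVR5. The sketch below is only an illustration of those rules; the function names are hypothetical, and the project's own detection logic (including its fallback of inspecting the configuration file) is not reproduced here.

```python
from pathlib import Path
from typing import Optional


def roformer_type_from_name(filename: str) -> Optional[str]:
    """Guess the roformer model type from a file name, or None if the name does not state it."""
    name = Path(filename).stem.lower()
    # Check the longer name first so "mel_band_roformer" is not shadowed.
    if "mel_band_roformer" in name:
        return "mel_band_roformer"
    if "bs_roformer" in name:
        return "bs_roformer"
    return None


def is_valid_roformer_pair(model_file: str, config_file: str) -> bool:
    """Check both naming rules: identical names apart from the extension, and 'roformer' in the name."""
    model_stem = Path(model_file).stem
    config_stem = Path(config_file).stem
    return model_stem == config_stem and "roformer" in model_stem.lower()


if __name__ == "__main__":
    model = "bs_roformer_ep_368_sdr_12.9628.ckpt"
    config = "bs_roformer_ep_368_sdr_12.9628.yaml"
    print(is_valid_roformer_pair(model, config), roformer_type_from_name(model))
```

Note that `Path.stem` removes only the final suffix, so a name such as `bs_roformer_ep_368_sdr_12.9628.ckpt` keeps the `.9628` part of its stem.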
## Dataset Format
Format of the TTS annotation .list file:
```
vocal_path|speaker_name|language|text
```
Language dictionary:
- 'zh': Chinese
- 'ja': Japanese
- 'en': English
Example:
```
D:\GPT-SoVITS\xxx/xxx.wav|xxx|en|I like playing Genshin.
```
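A `.list` line can be read with a few lines of Python. The sketch below assumes that only the first three `|` characters are separators, so that a `|` inside the text field is preserved; that choice, and the names `ListEntry`, `parse_list_line`, and `uses_listed_language`, are assumptions made for illustration and are not taken from the project's code.

```python
from dataclasses import dataclass

# Codes from the language dictionary above. Other codes may be accepted
# by the project, so an unlisted code is reported rather than rejected.
LISTED_LANGUAGES = {"zh", "ja", "en"}


@dataclass
class ListEntry:
    vocal_path: str
    speaker_name: str
    language: str
    text: str


def parse_list_line(line: str) -> ListEntry:
    """Parse one `vocal_path|speaker_name|language|text` line."""
    # Split on the first three separators only, so '|' inside the text survives.
    parts = line.rstrip("\r\n").split("|", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got {len(parts)}: {line!r}")
    vocal_path, speaker_name, language, text = (p.strip() for p in parts)
    return ListEntry(vocal_path, speaker_name, language.lower(), text)


def uses_listed_language(entry: ListEntry) -> bool:
    """True if the entry uses one of the language codes listed above."""
    return entry.language in LISTED_LANGUAGES


if __name__ == "__main__":
    entry = parse_list_line(r"D:\GPT-SoVITS\xxx/xxx.wav|xxx|en|I like playing Genshin.")
    print(entry.speaker_name, entry.language, entry.text)
```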
## Fine-Tuning and Inference
### Open the WebUI
#### Integrated Package Users
Double-click `go-webui.bat` or use `go-webui.ps1`.
To switch to V1, double-click `go-webui-v1.bat` or use `go-webui-v1.ps1`.
#### Others
```bash
python webui.py <language(optional)>
```
To switch to V1:
```bash
python webui.py v1 <language(optional)>
```
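Alternatively, switch the version manually in the WebUI.
### Fine-Tuning
#### Path Auto-Completion Is Supported
1. Enter the audio path
2. Slice the audio into small chunks
3. Denoise (optional)
4. ASR
5. Proofread the ASR transcriptions
6. Go to the next tab and fine-tune the model
### Open the Inference WebUI
#### Integrated Package Users
Double-click `go-webui-v2.bat` or use `go-webui-v2.ps1`, then open the inference webui at `1-GPT-SoVITS-TTS/1C-inference`.
#### Others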
```bash
python GPT_SoVITS/inference_webui.py <language(optional)>
```
OR
```bash
python webui.py
```
Then open the inference webui at `1-GPT-SoVITS-TTS/1C-inference`.
## V2 Release Notes
New features:
1. Support for Korean and Cantonese
2. An optimized text frontend
3. Pretrained models extended from 2k hours to 5k hours of data
4. Improved synthesis quality for low-quality reference audio
[More details](<https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v2%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)>)
To use V2 from a V1 environment:
1. Run `pip install -r requirements.txt` to update some packages
2. Clone the latest code from GitHub
3. Download the V2 pretrained models from [huggingface](https://huggingface.co/lj1995/GPT-SoVITS/tree/main/gsv-v2final-pretrained) and place them in `GPT_SoVITS/pretrained_models/gsv-v2final-pretrained`
Additional step for Chinese V2: [G2PWModel.zip(HF)](https://huggingface.co/XXXXRT/GPT-SoVITS-Pretrained/resolve/main/G2PWModel.zip)| [G2PWModel.zip(ModelScope)](https://www.modelscope.cn/models/XXXXRT/GPT-SoVITS-Pretrained/resolve/master/G2PWModel.zip) (download the G2PW model, unzip it, rename it to `G2PWModel`, and place it in `GPT_SoVITS/text`)
## V3 Release Notes
New features:
1. Timbre similarity is higher, so less training data is needed to approximate the target speaker (timbre similarity improves noticeably even when the base model is used directly without fine-tuning).
2. The GPT model is more stable, with fewer repetitions and omissions, and it is easier to generate speech with richer emotional expression.
[More details](<https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v3%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)>)
To use v3 from a v2 environment:
1. Run `pip install -r requirements.txt` to update some packages.
2. Clone the latest code from GitHub.
3. Download the v3 pretrained models (s1v3.ckpt, s2Gv3.pth, and the models--nvidia--bigvgan_v2_24khz_100band_256x folder) from [Huggingface](https://huggingface.co/lj1995/GPT-SoVITS/tree/main) and place them in the GPT_SoVITS/pretrained_models folder.
Additional: for the audio super-resolution model, see [how to download it](../../tools/AP_BWE_main/24kto48k/readme.txt).
## V4 Release Notes
New features:
1. **V4 fixes the metallic artifacts that V3 produced because of non-integer-multiple upsampling, and it outputs 48 kHz audio natively to avoid muffled sound (V3 natively outputs only 24 kHz audio).** The author recommends V4 as a direct replacement for V3, although further testing is still needed.
[More details](<https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v3v4%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)>)
To migrate from a V1/V2/V3 environment to V4:
1. Run `pip install -r requirements.txt` to update some dependency packages.
2. Clone the latest code from GitHub.
3. Download the V4 pretrained models (`gsv-v4-pretrained/s2v4.ckpt` and `gsv-v4-pretrained/vocoder.pth`) from [huggingface](https://huggingface.co/lj1995/GPT-SoVITS/tree/main) and place them in the `GPT_SoVITS/pretrained_models` directory.
## V2Pro Release Notes
New features:
1. **Memory usage is slightly higher than V2, but hardware cost and inference speed are maintained while performance and audio quality exceed V4.**
[More details](<https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90features-(%E5%90%84%E7%89%88%E6%9C%AC%E7%89%B9%E6%80%A7)>)
2. V1/V2 and the V2Pro series share similar characteristics, as do V3/V4. For training sets with low average audio quality, V1/V2/V2Pro can give good results, but V3/V4 cannot. In addition, the sound of audio synthesized by V3/V4 leans toward the reference audio rather than the training set as a whole.
To migrate from a V1/V2/V3/V4 environment to V2Pro:
1. Run `pip install -r requirements.txt` to update some dependency packages.
2. Clone the latest code from GitHub.
3. Download the V2Pro pretrained models (`v2Pro/s2Dv2Pro.pth`, `v2Pro/s2Gv2Pro.pth`, `v2Pro/s2Dv2ProPlus.pth`, `v2Pro/s2Gv2ProPlus.pth`, and `sv/pretrained_eres2netv2w24s4ep4.ckpt`) from [huggingface](https://huggingface.co/lj1995/GPT-SoVITS/tree/main) and place them in the `GPT_SoVITS/pretrained_models` directory.
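After downloading, it can help to confirm that every V2Pro file from step 3 is present before starting the WebUI. The sketch below assumes the relative paths from step 3 are kept as subfolders of `GPT_SoVITS/pretrained_models`; that layout and the helper name `missing_pretrained` are assumptions, so adjust them if your checkout differs.

```python
from pathlib import Path
from typing import Iterable, List

# Relative paths copied from step 3 above.
V2PRO_FILES = [
    "v2Pro/s2Dv2Pro.pth",
    "v2Pro/s2Gv2Pro.pth",
    "v2Pro/s2Dv2ProPlus.pth",
    "v2Pro/s2Gv2ProPlus.pth",
    "sv/pretrained_eres2netv2w24s4ep4.ckpt",
]


def missing_pretrained(root: str, expected: Iterable[str] = V2PRO_FILES) -> List[str]:
    """Return the expected files that are not found under `root` (hypothetical helper)."""
    base = Path(root)
    return [rel for rel in expected if not (base / rel).is_file()]


if __name__ == "__main__":
    missing = missing_pretrained("GPT_SoVITS/pretrained_models")
    if missing:
        print("Missing V2Pro files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All V2Pro files found.")
```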
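## Todo List
- [x] **High priority:**
- [x] Localization in Japanese and English.
- [x] User guide.
- [x] Fine-tune training on Japanese and English datasets.
- [ ] **Features:**
- [x] Zero-shot voice conversion (5 s) / few-shot voice conversion (1 min).
- [x] TTS speaking speed control.
- [ ] ~~Enhanced TTS emotion control.~~
- [ ] Experiment with changing the SoVITS token inputs to a probability distribution over the vocabulary.
- [x] Improve the English and Japanese text frontend.
- [ ] Develop small and large TTS models.
- [x] Colab scripts.
- [ ] Expand the training dataset (2k → 10k).
- [x] Better sovits base model (improved audio quality)
- [ ] Model mix
## (Additional) Running from the Command Line
Open the UVR5 WebUI from the command line: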
```bash
python tools/uvr5/webui.py "<infer_device>" <is_half> <webui_port_uvr5>
```
<!-- If you cannot open a browser, follow the format below for UVR processing. This uses mdxnet for audio processing.
```
python mdxnet.py --model --input_root --output_vocal --output_ins --agg_level --format --device --is_half_precision
``` -->
Here is how to segment the dataset audio from the command line:
```bash
python audio_slicer.py \
--input_path "<path_to_original_audio_file_or_directory>" \
--output_root "<directory_where_subdivided_audio_clips_will_be_saved>" \
--threshold <volume_threshold> \
--min_length <minimum_duration_of_each_subclip> \
--min_interval <shortest_time_gap_between_adjacent_subclips> \
--hop_size <step_size_for_computing_volume_curve>
```
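To show how `--threshold`, `--min_interval`, and `--hop_size` interact, here is a simplified, pure-Python sketch of silence detection. It is not the code of `audio_slicer.py`: it assumes the threshold is in dB relative to full scale and the durations are in milliseconds, and it omits `--min_length` handling entirely. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def rms_db(frame: Sequence[float]) -> float:
    """RMS level of one frame in dB relative to full scale (amplitude 1.0)."""
    if not frame:
        return -math.inf
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    return 20 * math.log10(rms) if rms > 0 else -math.inf


def find_silences(
    samples: Sequence[float],
    sample_rate: int,
    threshold_db: float = -40.0,
    min_interval_ms: int = 300,
    hop_size_ms: int = 10,
) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) spans that stay below the threshold for at least min_interval_ms."""
    hop = max(1, sample_rate * hop_size_ms // 1000)  # samples per volume-curve step
    n_frames = math.ceil(len(samples) / hop)
    silences: List[Tuple[int, int]] = []
    run_start = None
    # The extra iteration (i == n_frames) is a non-silent sentinel that closes a trailing run.
    for i in range(n_frames + 1):
        quiet = i < n_frames and rms_db(samples[i * hop:(i + 1) * hop]) < threshold_db
        if quiet and run_start is None:
            run_start = i
        elif not quiet and run_start is not None:
            start_ms = run_start * hop * 1000 // sample_rate
            end_ms = i * hop * 1000 // sample_rate
            if end_ms - start_ms >= min_interval_ms:
                silences.append((start_ms, end_ms))
            run_start = None
    return silences


if __name__ == "__main__":
    # 0.5 s of signal, 0.4 s of silence, 0.3 s of signal at a toy 1 kHz sample rate.
    audio = [0.5] * 500 + [0.0] * 400 + [0.5] * 300
    print(find_silences(audio, sample_rate=1000))
```

A smaller hop size gives a finer volume curve at the cost of more frames to evaluate, and a larger minimum interval ignores short pauses inside a sentence.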
Here is how to run ASR on the dataset from the command line (Chinese only):
```bash
python tools/asr/funasr_asr.py -i <input> -o <output>
```
ASR is performed through Faster_Whisper (ASR labeling for languages other than Chinese).
(No progress bar is shown. Depending on GPU performance, there may be a delay.)
```bash
python ./tools/asr/fasterwhisper_asr.py -i <input> -o <output> -l <language> -p <precision>
```
A custom save path for the list file is supported.
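The two ASR commands above can be selected from a script according to the language. The sketch below only builds the argument list from the flags shown above (`-i`, `-o`, `-l`, `-p`); the helper name `build_asr_command` and the example values `"en"` and `"float16"` are assumptions, so check each script's `--help` output for the values it accepts.

```python
import sys
from typing import List, Optional


def build_asr_command(
    input_dir: str,
    output_dir: str,
    language: str,
    precision: Optional[str] = None,
) -> List[str]:
    """Build the ASR command line for a language code (hypothetical helper)."""
    if language == "zh":
        # Chinese goes through the FunASR script, as described above.
        return [sys.executable, "tools/asr/funasr_asr.py", "-i", input_dir, "-o", output_dir]
    cmd = [
        sys.executable, "tools/asr/fasterwhisper_asr.py",
        "-i", input_dir, "-o", output_dir, "-l", language,
    ]
    if precision is not None:
        cmd += ["-p", precision]
    return cmd


if __name__ == "__main__":
    # Print the command only; pass the list to subprocess.run to execute it.
    print(build_asr_command("output/slicer_opt", "output/asr_opt", "en", precision="float16"))
```

Passing the list to `subprocess.run` rather than a single shell string keeps paths that contain spaces intact.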
## Credits
Special thanks to the following projects and contributors:
### Theoretical Research
- [ar-vits](https://github.com/innnky/ar-vits)
- [SoundStorm](https://github.com/yangdongchao/SoundStorm/tree/master/soundstorm/s1/AR)
- [vits](https://github.com/jaywalnut310/vits)
- [TransferTTS](https://github.com/hcy71o/TransferTTS/blob/master/models.py#L556)
- [contentvec](https://github.com/auspicious3000/contentvec/)
- [hifi-gan](https://github.com/jik876/hifi-gan)
- [fish-speech](https://github.com/fishaudio/fish-speech/blob/main/tools/llama/generate.py#L41)
- [f5-TTS](https://github.com/SWivid/F5-TTS/blob/main/src/f5_tts/model/backbones/dit.py)
- [shortcut flow matching](https://github.com/kvfrans/shortcut-models/blob/main/targets_shortcut.py)
### Pretrained Models
- [Chinese Speech Pretrain](https://github.com/TencentGameMate/chinese_speech_pretrain)
- [Chinese-Roberta-WWM-Ext-Large](https://huggingface.co/hfl/chinese-roberta-wwm-ext-large)
- [BigVGAN](https://github.com/NVIDIA/BigVGAN)
- [eresnetv2](https://modelscope.cn/models/iic/speech_eres2netv2w24s4ep4_sv_zh-cn_16k-common)
### Text Frontend for Inference
- [paddlespeech zh_normalization](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/paddlespeech/t2s/frontend/zh_normalization)
- [split-lang](https://github.com/DoodleBears/split-lang)
- [g2pW](https://github.com/GitYCC/g2pW)
- [pypinyin-g2pW](https://github.com/mozillazg/pypinyin-g2pW)
- [paddlespeech g2pw](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/paddlespeech/t2s/frontend/g2pw)
### WebUI Tools
- [ultimatevocalremovergui](https://github.com/Anjok07/ultimatevocalremovergui)
- [audio-slicer](https://github.com/openvpi/audio-slicer)
- [SubFix](https://github.com/cronrpc/SubFix)
- [FFmpeg](https://github.com/FFmpeg/FFmpeg)
- [gradio](https://github.com/gradio-app/gradio)
- [faster-whisper](https://github.com/SYSTRAN/faster-whisper)
- [FunASR](https://github.com/alibaba-damo-academy/FunASR)
- [AP-BWE](https://github.com/yxlu-0102/AP-BWE)
Thanks to @Naozumi520 for providing the Cantonese training set and for guidance on Cantonese-related knowledge.
## Thanks to All Contributors
<a href="https://github.com/RVC-Boss/GPT-SoVITS/graphs/contributors" target="_blank">
<img src="https://contrib.rocks/image?repo=RVC-Boss/GPT-SoVITS" />
</a>

View File

@@ -0,0 +1,580 @@
# Changelog
## 202401
- 2024.01.21 [PR#108](https://github.com/RVC-Boss/GPT-SoVITS/pull/108)
- Description: Added English system translation support to the WebUI.
- Type: Documentation
- Contributor: D3lik
- 2024.01.21 [Commit#7b89c9ed](https://github.com/RVC-Boss/GPT-SoVITS/commit/7b89c9ed5669f63c4ed6ae791408969640bdcf3e)
- Description: Attempted to fix the ZeroDivisionError in SoVITS training.
- Type: Fix
- Contributor: RVC-Boss, Tybost
- Related: [Issue#79](https://github.com/RVC-Boss/GPT-SoVITS/issues/79)
- 2024.01.21 [Commit#ea62d6e0](https://github.com/RVC-Boss/GPT-SoVITS/commit/ea62d6e0cf1efd75287766ea2b55d1c3b69b4fd3)
- Description: Greatly reduced the problem of synthesized audio containing the end of the reference audio.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.21 [Commit#a87ad522](https://github.com/RVC-Boss/GPT-SoVITS/commit/a87ad5228ed2d729da42019ae1b93171f6a745ef)
- Description: `cmd-asr.py` now checks whether the FunASR model exists in the default directory and, if not, downloads it from ModelScope.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.21 [Commit#f6147116](https://github.com/RVC-Boss/GPT-SoVITS/commit/f61471166c107ba56ccb7a5137fa9d7c09b2830d)
- Description: Added an `is_share` parameter to `Config.py`; setting it to `True` maps the WebUI to the public network.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.21 [Commit#102d5081](https://github.com/RVC-Boss/GPT-SoVITS/commit/102d50819e5d24580d6e96085b636b25533ecc7f)
- Description: Cleaned up cached audio files and other files in the `TEMP` folder.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.22 [Commit#872134c8](https://github.com/RVC-Boss/GPT-SoVITS/commit/872134c846bcb8f1909a3f5aff68a6aa67643f68)
- Description: Fixed an issue where overly short output files caused the reference audio to be repeated.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.22 Tested native support for English and Japanese training (Japanese training requires the root directory path to be free of non-English special characters).
- 2024.01.22 [PR#124](https://github.com/RVC-Boss/GPT-SoVITS/pull/124)
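- Description: Improved audio path checking. Reading from a mistyped input path now reports that the path does not exist instead of raising an FFmpeg error.
- Type: Optimization
- Contributor: xmimu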
- 2024.01.23 [Commit#93c47cd9](https://github.com/RVC-Boss/GPT-SoVITS/commit/93c47cd9f0c53439536eada18879b4ec5a812ae1)
- Description: Resolved the issue where HuBERT extraction produced NaN values, causing a ZeroDivisionError during SoVITS/GPT training.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.23 [Commit#80fffb0a](https://github.com/RVC-Boss/GPT-SoVITS/commit/80fffb0ad46e4e7f27948d5a57c88cf342088d50)
- Description: Replaced `jieba` with `jieba_fast` for Chinese word segmentation.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.23 [Commit#63625758](https://github.com/RVC-Boss/GPT-SoVITS/commit/63625758a99e645f3218dd167924e01a0e3cf0dc)
- Description: Optimized the model file sorting logic.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.23 [Commit#0c691191](https://github.com/RVC-Boss/GPT-SoVITS/commit/0c691191e894c15686e88279745712b3c6dc232f)
- Description: Added support for quick model switching in the inference WebUI.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.25 [Commit#249561e5](https://github.com/RVC-Boss/GPT-SoVITS/commit/249561e5a18576010df6587c274d38cbd9e18b4b)
- Description: Removed unnecessary logs from the inference WebUI.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.25 [PR#183](https://github.com/RVC-Boss/GPT-SoVITS/pull/183), [PR#200](https://github.com/RVC-Boss/GPT-SoVITS/pull/200)
- Description: Added support for training and inference on Mac.
- Type: Feature
- Contributor: Lion-Wu
- 2024.01.26 [Commit#813cf96e](https://github.com/RVC-Boss/GPT-SoVITS/commit/813cf96e508ba1bb2c658f38c7cc77b797fb4082), [Commit#2d1ddeca](https://github.com/RVC-Boss/GPT-SoVITS/commit/2d1ddeca42db90c3fe2d0cd79480fd544d87f02b)
- Description: Fixed the issue of UVR5 reading a directory and then exiting automatically.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.26 [PR#204](https://github.com/RVC-Boss/GPT-SoVITS/pull/204)
- Description: Added support for mixed Chinese-English and Japanese-English output text.
- Type: Feature
- Contributor: Kakaru Hayate
- 2024.01.26 [Commit#f4148cf7](https://github.com/RVC-Boss/GPT-SoVITS/commit/f4148cf77fb899c22bcdd4e773d2f24ab34a73e7)
- Description: Added an optional segmentation mode for output.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.26 [Commit#9fe955c1](https://github.com/RVC-Boss/GPT-SoVITS/commit/9fe955c1bf5f94546c9f699141281f2661c8a180)
- Description: Fixed inference errors caused by multiple consecutive line breaks.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.26 [Commit#84ee4719](https://github.com/RVC-Boss/GPT-SoVITS/commit/84ee471936b332bc2ccee024d6dfdedab4f0dc7b)
- Description: Automatically forces single precision on GPUs that do not support half precision; forces single precision for CPU inference.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.28 [PR#238](https://github.com/RVC-Boss/GPT-SoVITS/pull/238)
- Description: Completed the model download process in the Dockerfile.
- Type: Fix
- Contributor: breakstring
- 2024.01.28 [PR#257](https://github.com/RVC-Boss/GPT-SoVITS/pull/257)
- Description: Fixed the issue of number pronunciations being converted to Chinese characters.
- Type: Fix
- Contributor: duliangang
- 2024.01.28 [Commit#f0cfe397](https://github.com/RVC-Boss/GPT-SoVITS/commit/f0cfe397089a6fd507d678c71adeaab5e7ed0683)
- Description: Fixed the issue of checkpoints not being saved during GPT training.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.28 [Commit#b8ae5a27](https://github.com/RVC-Boss/GPT-SoVITS/commit/b8ae5a2761e2654fc0c905498009d3de9de745a8)
- Description: Excluded unreasonable reference audio lengths by setting limits.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.28 [Commit#698e9655](https://github.com/RVC-Boss/GPT-SoVITS/commit/698e9655132d194b25b86fbbc99d53c8d2cea2a3)
- Description: Fixed the issue of a few characters being dropped at the beginning of sentences.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.29 [Commit#ff977a5f](https://github.com/RVC-Boss/GPT-SoVITS/commit/ff977a5f5dc547e0ad82b9e0f1cd95fbc830b2b0)
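- Description: Changed the training configuration to single precision for GPUs that have problems with half-precision training, such as the 16 series.
- Type: Fix
- Contributor: RVC-Boss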
- 2024.01.29 [Commit#172e139f](https://github.com/RVC-Boss/GPT-SoVITS/commit/172e139f45ac26723bc2cf7fac0112f69d6b46ec)
- Description: Tested and updated the usable Colab version.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.29 [PR#135](https://github.com/RVC-Boss/GPT-SoVITS/pull/135)
- Description: Updated FunASR to version 1.0 and fixed errors caused by interface mismatches.
- Type: Fix
- Contributor: LauraGPT
- 2024.01.30 [Commit#1c2fa98c](https://github.com/RVC-Boss/GPT-SoVITS/commit/1c2fa98ca8c325dcfb32797d22ff1c2a726d1cb4)
- Description: Fixed Chinese and English punctuation splitting issues and added punctuation at the beginning and end of sentences.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.30 [Commit#74409f35](https://github.com/RVC-Boss/GPT-SoVITS/commit/74409f3570fa1c0ff28d4c65c288a6ce58ca00d2)
- Description: Added support for splitting by punctuation.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.30 [Commit#c42eeccf](https://github.com/RVC-Boss/GPT-SoVITS/commit/c42eeccfdd2d0a0d714ecc8bfc22a12373aca6b7)
- Description: Automatically removes double quotes from all path-related entries, to prevent errors when novice users copy paths that include double quotes.
- Type: Fix
- Contributor: RVC-Boss
## 202402
- 2024.02.01 [Commit#45f73519](https://github.com/RVC-Boss/GPT-SoVITS/commit/45f73519cc41cd17cf816d8b997a9dcb0bee04b6)
- Description: Fixed a filename saving error that occurred when the ASR path ended with `/`.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.03 [Commit#dba1a74c](https://github.com/RVC-Boss/GPT-SoVITS/commit/dba1a74ccb0cf19a1b4eb93faf11d4ec2b1fc5d7)
- Description: Resolved separation failures caused by UVR5 format reading errors.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.03 [Commit#3ebff70b](https://github.com/RVC-Boss/GPT-SoVITS/commit/3ebff70b71580ee1f97b3238c9442cbc5aef47c7)
- Description: Added automatic segmentation and language recognition for mixed Chinese-Japanese-English text.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.02.03 [PR#377](https://github.com/RVC-Boss/GPT-SoVITS/pull/377)
- Description: Introduced the PaddleSpeech Normalizer to fix the reading of "xx.xx%" (percent symbols), to read "元/吨" as "元每吨" instead of "元吨", and to fix underscore errors.
- Type: Optimization
- Contributor: KamioRinn
- 2024.02.05 [PR#395](https://github.com/RVC-Boss/GPT-SoVITS/pull/395)
- Description: Optimized the English text frontend.
- Type: Optimization
- Contributor: KamioRinn
- 2024.02.06 [Commit#65b463a7](https://github.com/RVC-Boss/GPT-SoVITS/commit/65b463a787f31637b4768cc9a47cab59541d3927)
- Description: Fixed degraded Chinese inference quality caused by mixed-up language parameters.
- Type: Fix
- Contributor: RVC-Boss
- Related: [Issue#391](https://github.com/RVC-Boss/GPT-SoVITS/issues/391)
- 2024.02.06 [PR#403](https://github.com/RVC-Boss/GPT-SoVITS/pull/403)
- Description: Adapted UVR5 to newer versions of librosa.
- Type: Fix
- Contributor: StaryLan
- 2024.02.07 [Commit#14a28510](https://github.com/RVC-Boss/GPT-SoVITS/commit/14a285109a521679f8846589c22da8f656a46ad8)
- Description: Fixed the UVR5 `inf` error caused by the `is_half` parameter not being converted to a boolean (the `inf` problem occurred on 16-series GPUs).
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.07 [Commit#d74f888e](https://github.com/RVC-Boss/GPT-SoVITS/commit/d74f888e7ac86063bfeacef95d0e6ddafe42b3b2)
- Description: Fixed Gradio dependency issues.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.07 [PR#400](https://github.com/RVC-Boss/GPT-SoVITS/pull/400)
- Description: Integrated Faster Whisper ASR for Japanese and English.
- Type: Feature
- Contributor: Shadow
- 2024.02.07 [Commit#6469048d](https://github.com/RVC-Boss/GPT-SoVITS/commit/6469048de12a8d6f0bd05d07f031309e61575a38)~[Commit#94ee71d9](https://github.com/RVC-Boss/GPT-SoVITS/commit/94ee71d9d562d10c9a1b96e745c6a6575aa66a10)
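- Description: Added support for automatically reading the full paths in the `.list` file when the root directory is left empty during dataset preparation.
- Type: Optimization
- Contributor: RVC-Boss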
- 2024.02.08 [Commit#59f35ada](https://github.com/RVC-Boss/GPT-SoVITS/commit/59f35adad85815df27e9c6b33d420f5ebfd8376b)
- Description: Attempted to fix GPT training hanging on Windows 10 1909 and on systems with Traditional Chinese as the system language.
- Type: Fix
- Contributor: RVC-Boss
- Related: [Issue#232](https://github.com/RVC-Boss/GPT-SoVITS/issues/232)
- 2024.02.12 [PR#457](https://github.com/RVC-Boss/GPT-SoVITS/pull/457)
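- Description: Added a DPO Loss training option (mitigates GPT repetition and dropped characters) and exposed several parameters in the inference WebUI.
- Type: Feature
- Contributor: liufenghua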
- 2024.02.12 [Commit#2fa74ecb](https://github.com/RVC-Boss/GPT-SoVITS/commit/2fa74ecb941db27d9015583a9be6962898d66730), [Commit#d82f6bbb](https://github.com/RVC-Boss/GPT-SoVITS/commit/d82f6bbb98ba725e6725dcee99b80ce71fb0bf28)
- Description: Optimized the Faster Whisper and FunASR logic, and switched to mirror downloads to avoid Hugging Face connection issues.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.02.15 [Commit#dd2c4d6d](https://github.com/RVC-Boss/GPT-SoVITS/commit/dd2c4d6d7121bf82d29d0f0e4d788f3b231997c8)
- Description: Added support for Chinese experiment names in training (previous versions raised an error).
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.15 [Commit#ccb9b08b](https://github.com/RVC-Boss/GPT-SoVITS/commit/ccb9b08be3c58e102defcc94ff4fd609da9e27ee)~[Commit#895fde46](https://github.com/RVC-Boss/GPT-SoVITS/commit/895fde46e420040ed26aaf0c5b7e99359d9b199b)
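- Description: Made DPO training optional instead of mandatory. When it is selected, the batch size is automatically halved. Fixed the passing of new parameters in the inference WebUI.
- Type: Optimization
- Contributor: RVC-Boss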
- 2024.02.15 [Commit#7b0c3c67](https://github.com/RVC-Boss/GPT-SoVITS/commit/7b0c3c676495c64b2064aa472bff14b5c06206a5)
  - Content: Fixed a bug in the Chinese frontend.
  - Type: Fix
  - Contributor: RVC-Boss
- 2024.02.16 [PR#499](https://github.com/RVC-Boss/GPT-SoVITS/pull/499)
  - Content: Support input without reference text.
  - Type: Feature
  - Contributor: Watchtower-Liu
  - Related: [Issue#475](https://github.com/RVC-Boss/GPT-SoVITS/issues/475)
- 2024.02.17 [PR#509](https://github.com/RVC-Boss/GPT-SoVITS/pull/509), [PR#507](https://github.com/RVC-Boss/GPT-SoVITS/pull/507), [PR#532](https://github.com/RVC-Boss/GPT-SoVITS/pull/532), [PR#556](https://github.com/RVC-Boss/GPT-SoVITS/pull/556), [PR#559](https://github.com/RVC-Boss/GPT-SoVITS/pull/559)
  - Content: Optimized Chinese and Japanese frontend processing.
  - Type: Optimization
  - Contributor: KamioRinn, v3cun
- 2024.02.17 [PR#510](https://github.com/RVC-Boss/GPT-SoVITS/pull/510), [PR#511](https://github.com/RVC-Boss/GPT-SoVITS/pull/511)
  - Content: Fixed the Colab public URL issue.
  - Type: Fix
  - Contributor: ChanningWang2018, RVC-Boss
- 2024.02.21 [PR#557](https://github.com/RVC-Boss/GPT-SoVITS/pull/557)
  - Content: Improved the performance of CPU inference on Mac by using the CPU instead of MPS.
  - Type: Optimization
  - Contributor: XXXXRT666
- 2024.02.21 [Commit#6da486c1](https://github.com/RVC-Boss/GPT-SoVITS/commit/6da486c15d09e3d99fa42c5e560aaac56b6b4ce1), [Commit#5a171773](https://github.com/RVC-Boss/GPT-SoVITS/commit/5a17177342d2df1e11369f2f4f58d34a3feb1a35)
  - Content: Added a noise reduction option to data processing (it keeps only the 16 kHz sample rate; recommended only when background noise is severe).
  - Type: Feature
  - Contributor: RVC-Boss
- 2024.02.28 [PR#573](https://github.com/RVC-Boss/GPT-SoVITS/pull/573)
  - Content: Fixed the `is_half` check so that CPU inference works correctly on Mac.
  - Type: Fix
  - Contributor: XXXXRT666
- 2024.02.28 [PR#610](https://github.com/RVC-Boss/GPT-SoVITS/pull/610)
  - Content: Fixed the reversed settings of the UVR5 de-reverb models.
  - Type: Fix
  - Contributor: Yuze Wang
## 202403
- 2024.03.06 [PR#675](https://github.com/RVC-Boss/GPT-SoVITS/pull/675)
  - Content: Enabled automatic CPU inference for Faster Whisper when CUDA is unavailable.
  - Type: Optimization
  - Contributor: ShiroDoMain
- 2024.03.06 [Commit#616be20d](https://github.com/RVC-Boss/GPT-SoVITS/commit/616be20db3cf94f1cd663782fea61b2370704193)
  - Content: When Faster Whisper is used for non-Chinese ASR, the Chinese FunASR model no longer needs to be downloaded first.
  - Type: Optimization
  - Contributor: RVC-Boss
- 2024.03.09 [PR#672](https://github.com/RVC-Boss/GPT-SoVITS/pull/672)
  - Content: Improved inference speed by 50% (tested on RTX3090 + PyTorch 2.2.1 + CU11.8 + Win10 + Py39).
  - Type: Optimization
  - Contributor: GoHomeToMacDonal
- 2024.03.10 [PR#721](https://github.com/RVC-Boss/GPT-SoVITS/pull/721)
  - Content: Added the fast inference branch `fast_inference_`.
  - Type: Feature
  - Contributor: ChasonJiang
- 2024.03.13 [PR#761](https://github.com/RVC-Boss/GPT-SoVITS/pull/761)
  - Content: Added CPU training support, which allows training on the CPU under macOS.
  - Type: Feature
  - Contributor: Lion-Wu
- 2024.03.19 [PR#804](https://github.com/RVC-Boss/GPT-SoVITS/pull/804), [PR#812](https://github.com/RVC-Boss/GPT-SoVITS/pull/812), [PR#821](https://github.com/RVC-Boss/GPT-SoVITS/pull/821)
  - Content: Optimized the English text frontend.
  - Type: Optimization
  - Contributor: KamioRinn
- 2024.03.30 [PR#894](https://github.com/RVC-Boss/GPT-SoVITS/pull/894)
  - Content: Improved the API format.
  - Type: Optimization
  - Contributor: KamioRinn
## 202404
- 2024.04.03 [PR#917](https://github.com/RVC-Boss/GPT-SoVITS/pull/917)
  - Content: Fixed the FFmpeg command string formatting in the UVR5 WebUI.
  - Type: Fix
  - Contributor: StaryLan
## 202405
- 2024.05.02 [PR#953](https://github.com/RVC-Boss/GPT-SoVITS/pull/953)
  - Content: Fixed the quality degradation caused by not freezing VQ during SoVITS training.
  - Type: Fix
  - Contributor: hcwu1993
  - Related: [Issue#747](https://github.com/RVC-Boss/GPT-SoVITS/issues/747)
- 2024.05.19 [PR#1102](https://github.com/RVC-Boss/GPT-SoVITS/pull/1102)
  - Content: Added an error message for unsupported languages during training data processing.
  - Type: Optimization
  - Contributor: StaryLan
- 2024.05.27 [PR#1132](https://github.com/RVC-Boss/GPT-SoVITS/pull/1132)
  - Content: Fixed a HuBERT extraction bug.
  - Type: Fix
  - Contributor: XXXXRT666
## 202406
- 2024.06.06 [Commit#99f09c8b](https://github.com/RVC-Boss/GPT-SoVITS/commit/99f09c8bdc155c1f4272b511940717705509582a)
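  - Content: Fixed WebUI GPT fine-tuning not reading the BERT features of Chinese input text, which caused inconsistency with inference and degraded quality.
    **Note: if you previously fine-tuned with a large amount of data, retraining the model is recommended to improve quality.**
  - Type: Fix
  - Contributor: RVC-Boss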
- 2024.06.07 [PR#1159](https://github.com/RVC-Boss/GPT-SoVITS/pull/1159)
  - Content: Fixed the SoVITS training progress display logic in `s2_train.py`.
  - Type: Fix
  - Contributor: pengzhendong
- 2024.06.10 [Commit#501a74ae](https://github.com/RVC-Boss/GPT-SoVITS/commit/501a74ae96789a26b48932babed5eb4e9483a232)
  - Content: Fixed string formatting so that UVR5 MDXNet calls to FFmpeg work with paths that contain spaces.
  - Type: Fix
  - Contributor: RVC-Boss
- 2024.06.10 [PR#1168](https://github.com/RVC-Boss/GPT-SoVITS/pull/1168), [PR#1169](https://github.com/RVC-Boss/GPT-SoVITS/pull/1169)
  - Content: Improved the handling of punctuation-only and multi-punctuation text input.
  - Type: Fix
  - Contributor: XXXXRT666
  - Related: [Issue#1165](https://github.com/RVC-Boss/GPT-SoVITS/issues/1165)
- 2024.06.13 [Commit#db506705](https://github.com/RVC-Boss/GPT-SoVITS/commit/db50670598f0236613eefa6f2d5a23a271d82041)
  - Content: Fixed the default batch size being a decimal (non-integer) value during CPU inference.
  - Type: Fix
  - Contributor: RVC-Boss
- 2024.06.28 [PR#1258](https://github.com/RVC-Boss/GPT-SoVITS/pull/1258), [PR#1265](https://github.com/RVC-Boss/GPT-SoVITS/pull/1265), [PR#1267](https://github.com/RVC-Boss/GPT-SoVITS/pull/1267)
  - Content: Fixed an exception during denoising or ASR causing all pending audio files to be abandoned.
  - Type: Fix
  - Contributor: XXXXRT666
- 2024.06.29 [Commit#a208698e](https://github.com/RVC-Boss/GPT-SoVITS/commit/a208698e775155efc95b187b746d153d0f2847ca)
  - Content: Fixed the multi-process save logic in multi-GPU training.
  - Type: Fix
  - Contributor: RVC-Boss
- 2024.06.29 [PR#1251](https://github.com/RVC-Boss/GPT-SoVITS/pull/1251)
  - Content: Removed the redundant `my_utils.py`.
  - Type: Optimization
  - Contributor: aoguai
  - Related: [Issue#1189](https://github.com/RVC-Boss/GPT-SoVITS/issues/1189)
## 202407
- 2024.07.06 [PR#1253](https://github.com/RVC-Boss/GPT-SoVITS/pull/1253)
  - Content: Fixed decimal points being split when text is cut at punctuation.
  - Type: Fix
  - Contributor: aoguai
- 2024.07.06 [Commit#b0786f29](https://github.com/RVC-Boss/GPT-SoVITS/commit/b0786f2998f1b2fce6678434524b4e0e8cc716f5)
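  - Content: The accelerated inference code has been verified and merged into the main branch. It produces the same inference results as the base version and also supports accelerated inference in no-reference-text mode.
  - Type: Optimization
  - Contributor: RVC-Boss, GoHomeToMacDonal
  - Related: [PR#672](https://github.com/RVC-Boss/GPT-SoVITS/pull/672)
  - Future updates will continue to verify the consistency of changes in the `fast_inference` branch.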
- 2024.07.13 [PR#1294](https://github.com/RVC-Boss/GPT-SoVITS/pull/1294), [PR#1298](https://github.com/RVC-Boss/GPT-SoVITS/pull/1298)
  - Content: Refactored i18n scanning and updated the multilingual configuration files.
  - Type: Documentation
  - Contributor: StaryLan
- 2024.07.13 [PR#1299](https://github.com/RVC-Boss/GPT-SoVITS/pull/1299)
  - Content: Fixed command-line errors caused by trailing slashes in user file paths.
  - Type: Fix
  - Contributor: XXXXRT666
- 2024.07.19 [PR#756](https://github.com/RVC-Boss/GPT-SoVITS/pull/756)
  - Content: Fixed the training step mismatch when a custom `bucket_sampler` is used in GPT training.
  - Type: Fix
  - Contributor: huangxu1991
- 2024.07.23 [Commit#9588a3c5](https://github.com/RVC-Boss/GPT-SoVITS/commit/9588a3c52d9ebdb20b3c5d74f647d12e7c1171c2), [PR#1340](https://github.com/RVC-Boss/GPT-SoVITS/pull/1340)
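  - Content: Added speech rate adjustment during synthesis (with an option to fix randomness and control only the speed). This feature has also been added to `api.py`.
  - Type: Feature
  - Contributor: RVC-Boss, 红血球AE3803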
- 2024.07.27 [PR#1306](https://github.com/RVC-Boss/GPT-SoVITS/pull/1306), [PR#1356](https://github.com/RVC-Boss/GPT-SoVITS/pull/1356)
  - Content: Added support for the BS-RoFormer vocal separation model.
  - Type: Feature
  - Contributor: KamioRinn
- 2024.07.27 [PR#1351](https://github.com/RVC-Boss/GPT-SoVITS/pull/1351)
  - Content: Improved the Chinese text frontend.
  - Type: Feature
  - Contributor: KamioRinn
## 202408 (V2 Version)
- 2024.08.01 [PR#1355](https://github.com/RVC-Boss/GPT-SoVITS/pull/1355)
  - Content: Added automatic path filling for file handling in the WebUI.
  - Type: Chore
  - Contributor: XXXXRT666
- 2024.08.01 [Commit#e62e9653](https://github.com/RVC-Boss/GPT-SoVITS/commit/e62e965323a60a76a025bcaa45268c1ddcbcf05c)
  - Content: Enabled FP16 inference for BS-Roformer.
  - Type: Performance
  - Contributor: RVC-Boss
- 2024.08.01 [Commit#bce451a2](https://github.com/RVC-Boss/GPT-SoVITS/commit/bce451a2d1641e581e200297d01f219aeaaf7299), [Commit#4c8b7612](https://github.com/RVC-Boss/GPT-SoVITS/commit/4c8b7612206536b8b4435997acb69b25d93acb78)
  - Content: Optimized the GPU detection logic and added handling for user-entered GPU indices.
  - Type: Chore
  - Contributor: RVC-Boss
- 2024.08.02 [Commit#ff6c193f](https://github.com/RVC-Boss/GPT-SoVITS/commit/ff6c193f6fb99d44eea3648d82ebcee895860a22)~[Commit#de7ee7c7](https://github.com/RVC-Boss/GPT-SoVITS/commit/de7ee7c7c15a2ec137feb0693b4ff3db61fad758)
  - Content: **Added the GPT-SoVITS V2 model.**
  - Type: Feature
  - Contributor: RVC-Boss
- 2024.08.03 [Commit#8a101474](https://github.com/RVC-Boss/GPT-SoVITS/commit/8a101474b5a4f913b4c94fca2e3ca87d0771bae3)
  - Content: Added Cantonese ASR support using FunASR.
  - Type: Feature
  - Contributor: RVC-Boss
- 2024.08.03 [PR#1387](https://github.com/RVC-Boss/GPT-SoVITS/pull/1387), [PR#1388](https://github.com/RVC-Boss/GPT-SoVITS/pull/1388)
  - Content: Optimized the UI and timing logic.
  - Type: Chore
  - Contributor: XXXXRT666
- 2024.08.06 [PR#1404](https://github.com/RVC-Boss/GPT-SoVITS/pull/1404), [PR#987](https://github.com/RVC-Boss/GPT-SoVITS/pull/987), [PR#488](https://github.com/RVC-Boss/GPT-SoVITS/pull/488)
  - Content: Optimized the handling of polyphonic characters (V2 only).
  - Type: Fix, Feature
  - Contributor: KamioRinn, RVC-Boss
- 2024.08.13 [PR#1422](https://github.com/RVC-Boss/GPT-SoVITS/pull/1422)
  - Content: Fixed a bug where only one reference audio file could be uploaded; added a warning popup for missing files.
  - Type: Fix, Chore
  - Contributor: XXXXRT666
- 2024.08.20 [Issue#1508](https://github.com/RVC-Boss/GPT-SoVITS/issues/1508)
  - Content: The upstream LangSegment library now supports optimized handling of numbers, phone numbers, dates, and times through SSML tags.
  - Type: Feature
  - Contributor: juntaosun
- 2024.08.20 [PR#1503](https://github.com/RVC-Boss/GPT-SoVITS/pull/1503)
  - Content: Fixed and optimized the API.
  - Type: Fix
  - Contributor: KamioRinn
- 2024.08.20 [PR#1490](https://github.com/RVC-Boss/GPT-SoVITS/pull/1490)
  - Content: Merged the `fast_inference` branch into the main branch.
  - Type: Refactor
  - Contributor: ChasonJiang
- 2024.08.21 **GPT-SoVITS V2 officially released.**
## 202502 (V3 Version)
- 2025.02.11 [Commit#ed207c4b](https://github.com/RVC-Boss/GPT-SoVITS/commit/ed207c4b879d5296e9be3ae5f7b876729a2c43b8)~[Commit#6e2b4918](https://github.com/RVC-Boss/GPT-SoVITS/commit/6e2b49186c5b961f0de41ea485d398dffa9787b4)
  - Content: **Added the GPT-SoVITS V3 model; fine-tuning requires 14GB of VRAM.**
  - Type: Feature ([see the wiki](https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v3%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)))
  - Contributor: RVC-Boss
- 2025.02.12 [PR#2032](https://github.com/RVC-Boss/GPT-SoVITS/pull/2032)
  - Content: Updated the multilingual project documentation.
  - Type: Documentation
  - Contributor: StaryLan
- 2025.02.12 [PR#2033](https://github.com/RVC-Boss/GPT-SoVITS/pull/2033)
  - Content: Updated the Japanese documentation.
  - Type: Documentation
  - Contributor: Fyphen
- 2025.02.12 [PR#2010](https://github.com/RVC-Boss/GPT-SoVITS/pull/2010)
  - Content: Optimized the attention computation logic.
  - Type: Performance
  - Contributor: wzy3650
- 2025.02.12 [PR#2040](https://github.com/RVC-Boss/GPT-SoVITS/pull/2040)
  - Content: Added gradient checkpointing support for fine-tuning; requires 12GB of VRAM.
  - Type: Feature
  - Contributor: Kakaru Hayate
- 2025.02.14 [PR#2047](https://github.com/RVC-Boss/GPT-SoVITS/pull/2047), [PR#2062](https://github.com/RVC-Boss/GPT-SoVITS/pull/2062), [PR#2073](https://github.com/RVC-Boss/GPT-SoVITS/pull/2073)
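  - Content: Switched to a new language segmentation tool, improved the splitting strategy for mixed-language text, and optimized number and English handling.
  - Type: Feature
  - Contributor: KamioRinn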
- 2025.02.23 [Commit#56509a17](https://github.com/RVC-Boss/GPT-SoVITS/commit/56509a17c918c8d149c48413a672b8ddf437495b)~[Commit#514fb692](https://github.com/RVC-Boss/GPT-SoVITS/commit/514fb692db056a06ed012bc3a5bca2a5b455703e)
  - Content: **Added LoRA training support for the GPT-SoVITS V3 model; fine-tuning requires 8GB of GPU memory.**
  - Type: Feature
  - Contributor: RVC-Boss
- 2025.02.23 [PR#2078](https://github.com/RVC-Boss/GPT-SoVITS/pull/2078)
  - Content: Added support for the Mel Band Roformer model for vocal and instrument separation.
  - Type: Feature
  - Contributor: Sucial
- 2025.02.26 [PR#2112](https://github.com/RVC-Boss/GPT-SoVITS/pull/2112), [PR#2114](https://github.com/RVC-Boss/GPT-SoVITS/pull/2114)
  - Content: Fixed MeCab errors with Chinese paths (specific to Japanese/Korean or multilingual text splitting).
  - Type: Fix
  - Contributor: KamioRinn
- 2025.02.27 [Commit#92961c3f](https://github.com/RVC-Boss/GPT-SoVITS/commit/92961c3f68b96009ff2cd00ce614a11b6c4d026f)~[Commit#250b1c73](https://github.com/RVC-Boss/GPT-SoVITS/commit/250b1c73cba60db18148b21ec5fbce01fd9d19bc)
  - Content: **Added a 24kHz-to-48kHz audio super-resolution model** (mitigates the "muffled" sound when the V3 model generates 24K audio).
  - Type: Feature
  - Contributor: RVC-Boss
  - Related: [Issue#2085](https://github.com/RVC-Boss/GPT-SoVITS/issues/2085), [Issue#2117](https://github.com/RVC-Boss/GPT-SoVITS/issues/2117)
- 2025.02.28 [PR#2123](https://github.com/RVC-Boss/GPT-SoVITS/pull/2123)
  - Content: Updated the multilingual project documentation.
  - Type: Documentation
  - Contributor: StaryLan
- 2025.02.28 [PR#2122](https://github.com/RVC-Boss/GPT-SoVITS/pull/2122)
  - Content: Applied rule-based detection to short CJK text that the model fails to identify.
  - Type: Fix
  - Contributor: KamioRinn
  - Related: [Issue#2116](https://github.com/RVC-Boss/GPT-SoVITS/issues/2116)
- 2025.02.28 [Commit#c38b1690](https://github.com/RVC-Boss/GPT-SoVITS/commit/c38b16901978c1db79491e16905ea3a37a7cf686), [Commit#a32a2b89](https://github.com/RVC-Boss/GPT-SoVITS/commit/a32a2b893436fad56cc82409121c7fa36a1815d5)
  - Content: Added a speech rate control parameter.
  - Type: Fix
  - Contributor: RVC-Boss
- 2025.02.28 **GPT-SoVITS V3 officially released**.
## 202503
- 2025.03.31 [PR#2236](https://github.com/RVC-Boss/GPT-SoVITS/pull/2236)
  - Content: Fixed issues caused by incorrect dependency versions.
  - Type: Fix
  - Contributor: XXXXRT666
  - Related:
    - PyOpenJTalk: [Issue#1131](https://github.com/RVC-Boss/GPT-SoVITS/issues/1131), [Issue#2231](https://github.com/RVC-Boss/GPT-SoVITS/issues/2231), [Issue#2233](https://github.com/RVC-Boss/GPT-SoVITS/issues/2233).
    - ONNX: [Issue#492](https://github.com/RVC-Boss/GPT-SoVITS/issues/492), [Issue#671](https://github.com/RVC-Boss/GPT-SoVITS/issues/671), [Issue#1192](https://github.com/RVC-Boss/GPT-SoVITS/issues/1192), [Issue#1819](https://github.com/RVC-Boss/GPT-SoVITS/issues/1819), [Issue#1841](https://github.com/RVC-Boss/GPT-SoVITS/issues/1841).
    - Pydantic: [Issue#2230](https://github.com/RVC-Boss/GPT-SoVITS/issues/2230), [Issue#2239](https://github.com/RVC-Boss/GPT-SoVITS/issues/2239).
    - PyTorch-Lightning: [Issue#2174](https://github.com/RVC-Boss/GPT-SoVITS/issues/2174).
- 2025.03.31 [PR#2241](https://github.com/RVC-Boss/GPT-SoVITS/pull/2241)
  - Content: **Enabled parallel inference support for SoVITS v3.**
  - Type: Feature
  - Contributor: ChasonJiang
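- Other minor bug fixes.
- Integrated package fixes for ONNX runtime GPU inference support:
  - Type: Fix
  - Details:
    - The ONNX models in G2PW were switched from CPU to GPU inference, which greatly reduces the CPU bottleneck;
    - The foxjoy dereverberation model now supports GPU inference.
## 202504 (V4 Version)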
- 2025.04.01 [Commit#6a60e5ed](https://github.com/RVC-Boss/GPT-SoVITS/commit/6a60e5edb1817af4a61c7a5b196c0d0f1407668f)
  - Content: Unlocked SoVITS v3 parallel inference; fixed the asynchronous model loading logic.
  - Type: Fix
  - Contributor: RVC-Boss
- 2025.04.07 [PR#2255](https://github.com/RVC-Boss/GPT-SoVITS/pull/2255)
  - Content: Formatted the code with Ruff; updated the G2PW link.
  - Type: Style
  - Contributor: XXXXRT666
- 2025.04.15 [PR#2290](https://github.com/RVC-Boss/GPT-SoVITS/pull/2290)
  - Content: Cleaned up the documentation; added Python 3.11 support; updated the installers.
  - Type: Chore
  - Contributor: XXXXRT666
- 2025.04.20 [PR#2300](https://github.com/RVC-Boss/GPT-SoVITS/pull/2300)
  - Content: Updated Colab, the installation files, and the model downloads.
  - Type: Chore
  - Contributor: XXXXRT666
- 2025.04.20 [Commit#e0c452f0](https://github.com/RVC-Boss/GPT-SoVITS/commit/e0c452f0078e8f7eb560b79a54d75573fefa8355)~[Commit#9d481da6](https://github.com/RVC-Boss/GPT-SoVITS/commit/9d481da610aa4b0ef8abf5651fd62800d2b4e8bf)
  - Content: **Added the GPT-SoVITS V4 model.**
  - Type: Feature
  - Contributor: RVC-Boss
- 2025.04.21 [Commit#8b394a15](https://github.com/RVC-Boss/GPT-SoVITS/commit/8b394a15bce8e1d85c0b11172442dbe7a6017ca2)~[Commit#bc2fe5ec](https://github.com/RVC-Boss/GPT-SoVITS/commit/bc2fe5ec86536c77bb3794b4be263ac87e4fdae6), [PR#2307](https://github.com/RVC-Boss/GPT-SoVITS/pull/2307)
  - Content: Enabled parallel inference support for V4.
  - Type: Feature
  - Contributor: RVC-Boss, ChasonJiang
- 2025.04.22 [Commit#7405427a](https://github.com/RVC-Boss/GPT-SoVITS/commit/7405427a0ab2a43af63205df401fd6607a408d87)~[Commit#590c83d7](https://github.com/RVC-Boss/GPT-SoVITS/commit/590c83d7667c8d4908f5bdaf2f4c1ba8959d29ff), [PR#2309](https://github.com/RVC-Boss/GPT-SoVITS/pull/2309)
  - Content: Fixed an error in passing the model version parameter.
  - Type: Fix
  - Contributor: RVC-Boss, ChasonJiang
- 2025.04.22 [Commit#fbdab94e](https://github.com/RVC-Boss/GPT-SoVITS/commit/fbdab94e17d605d85841af6f94f40a45976dd1d9), [PR#2310](https://github.com/RVC-Boss/GPT-SoVITS/pull/2310)
  - Content: Fixed the version mismatch between Numpy and Numba; updated the librosa version.
  - Type: Fix
  - Contributor: RVC-Boss, XXXXRT666
  - Related: [Issue#2308](https://github.com/RVC-Boss/GPT-SoVITS/issues/2308)
- 2025.04.22 **GPT-SoVITS V4 officially released**.
- 2025.04.22 [PR#2311](https://github.com/RVC-Boss/GPT-SoVITS/pull/2311)
  - Content: Updated the Gradio parameters.
  - Type: Chore
  - Contributor: XXXXRT666
- 2025.04.25 [PR#2322](https://github.com/RVC-Boss/GPT-SoVITS/pull/2322)
  - Content: Improved the Colab/Kaggle notebook scripts.
  - Type: Chore
  - Contributor: XXXXRT666
## 202505
- 2025.05.26 [PR#2351](https://github.com/RVC-Boss/GPT-SoVITS/pull/2351)
  - Content: Improved the Docker and Windows automated build scripts; added pre-commit formatting.
  - Type: Chore
  - Contributor: XXXXRT666
- 2025.05.26 [PR#2408](https://github.com/RVC-Boss/GPT-SoVITS/pull/2408)
  - Content: Optimized the multilingual text splitting and recognition logic.
  - Type: Fix
  - Contributor: KamioRinn
  - Related: [Issue#2404](https://github.com/RVC-Boss/GPT-SoVITS/issues/2404)
- 2025.05.26 [PR#2377](https://github.com/RVC-Boss/GPT-SoVITS/pull/2377)
  - Content: Implemented a caching strategy that speeds up SoVITS V3/V4 inference by 10%.
  - Type: Performance
  - Contributor: Kakaru Hayate
- 2025.05.26 [Commit#4d9d56b1](https://github.com/RVC-Boss/GPT-SoVITS/commit/4d9d56b19638dc434d6eefd9545e4d8639a3e072), [Commit#8c705784](https://github.com/RVC-Boss/GPT-SoVITS/commit/8c705784c50bf438c7b6d0be33a9e5e3cb90e6b2), [Commit#fafe4e7f](https://github.com/RVC-Boss/GPT-SoVITS/commit/fafe4e7f120fba56c5f053c6db30aa675d5951ba)
  - Content: Updated the annotation interface with a reminder: after editing each page, be sure to click 'Submit Text'; otherwise the changes are not saved.
  - Type: Fix
  - Contributor: RVC-Boss
- 2025.05.29 [Commit#1934fc1e](https://github.com/RVC-Boss/GPT-SoVITS/commit/1934fc1e1b22c4c162bba1bbe7d7ebb132944cdc)
  - Content: Fixed errors in the UVR5 and ONNX dereverberation models when FFmpeg encodes MP3/M4A files whose source paths contain spaces.
  - Type: Fix
  - Contributor: RVC-Boss
## 202506 (V2Pro Series)
- 2025.06.03 [PR#2420](https://github.com/RVC-Boss/GPT-SoVITS/pull/2420)
  - Content: Updated the multilingual project documentation.
  - Type: Documentation
  - Contributor: StaryLan
- 2025.06.04 [PR#2417](https://github.com/RVC-Boss/GPT-SoVITS/pull/2417)
  - Content: Added support for exporting V4 with TorchScript.
  - Type: Feature
  - Contributor: L-jasmine
- 2025.06.04 [Commit#b7c0c5ca](https://github.com/RVC-Boss/GPT-SoVITS/commit/b7c0c5ca878bcdd419fd86bf80dba431a6653356)~[Commit#298ebb03](https://github.com/RVC-Boss/GPT-SoVITS/commit/298ebb03c5a719388527ae6a586c7ea960344e70)
  - Content: **Added the GPT-SoVITS V2Pro series models (V2Pro, V2ProPlus).**
  - Type: Feature
  - Contributor: RVC-Boss
- 2025.06.05 [PR#2426](https://github.com/RVC-Boss/GPT-SoVITS/pull/2426)
  - Content: Fixed the `config/inference_webui` initialization error.
  - Type: Fix
  - Contributor: StaryLan
- 2025.06.05 [PR#2427](https://github.com/RVC-Boss/GPT-SoVITS/pull/2427), [Commit#7d70852a](https://github.com/RVC-Boss/GPT-SoVITS/commit/7d70852a3f67c3b52e3a62857f8663d529efc8cd), [PR#2434](https://github.com/RVC-Boss/GPT-SoVITS/pull/2434)
  - Content: Optimized the automatic precision detection logic; added collapsible sections to the WebUI frontend modules.
  - Type: Feature
  - Contributor: XXXXRT666, RVC-Boss


@@ -0,0 +1,459 @@
<div align="center">
<h1>GPT-SoVITS-WebUI</h1>
A powerful WebUI for few-shot voice conversion and text-to-speech.<br><br>
[![madewithlove](https://img.shields.io/badge/made_with-%E2%9D%A4-red?style=for-the-badge&labelColor=orange)](https://github.com/RVC-Boss/GPT-SoVITS)
<a href="https://trendshift.io/repositories/7033" target="_blank"><img src="https://trendshift.io/api/badge/repositories/7033" alt="RVC-Boss%2FGPT-SoVITS | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
[![Python](https://img.shields.io/badge/python-3.10--3.12-blue?style=for-the-badge&logo=python)](https://www.python.org)
[![GitHub release](https://img.shields.io/github/v/release/RVC-Boss/gpt-sovits?style=for-the-badge&logo=github)](https://github.com/RVC-Boss/gpt-sovits/releases)
[![Train In Colab](https://img.shields.io/badge/Colab-Training-F9AB00?style=for-the-badge&logo=googlecolab)](https://colab.research.google.com/github/RVC-Boss/GPT-SoVITS/blob/main/Colab-WebUI.ipynb)
[![Huggingface](https://img.shields.io/badge/免费在线体验-free_online_demo-yellow.svg?style=for-the-badge&logo=huggingface)](https://lj1995-gpt-sovits-proplus.hf.space/)
[![Image Size](https://img.shields.io/docker/image-size/xxxxrt666/gpt-sovits/latest?style=for-the-badge&logo=docker)](https://hub.docker.com/r/xxxxrt666/gpt-sovits)
[![简体中文](https://img.shields.io/badge/简体中文-阅读文档-blue?style=for-the-badge&logo=googledocs&logoColor=white)](https://www.yuque.com/baicaigongchang1145haoyuangong/ib3g1e)
[![English](https://img.shields.io/badge/English-Read%20Docs-blue?style=for-the-badge&logo=googledocs&logoColor=white)](https://rentry.co/GPT-SoVITS-guide#/)
[![Change Log](https://img.shields.io/badge/Change%20Log-View%20Updates-blue?style=for-the-badge&logo=googledocs&logoColor=white)](https://github.com/RVC-Boss/GPT-SoVITS/blob/main/docs/en/Changelog_EN.md)
[![License](https://img.shields.io/badge/LICENSE-MIT-green.svg?style=for-the-badge&logo=opensourceinitiative)](https://github.com/RVC-Boss/GPT-SoVITS/blob/main/LICENSE)
[**English**](../../README.md) | [**中文简体**](../cn/README.md) | [**日本語**](../ja/README.md) | **한국어** | [**Türkçe**](../tr/README.md)
</div>
---
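## Features:
1. **Zero-shot TTS:** Input a 5-second voice sample and get instant text-to-speech conversion.
2. **Few-shot TTS:** Fine-tune the model with just 1 minute of training data to improve voice similarity and realism.
3. **Cross-lingual support:** Inference in languages different from the training dataset; English, Japanese, Chinese, Cantonese, and Korean are currently supported.
4. **WebUI tools:** Integrated tools include vocal/accompaniment separation, automatic training set segmentation, Chinese automatic speech recognition (ASR), and text labeling, which help beginners create training datasets and GPT/SoVITS models.
**Check out our [demo video](https://www.bilibili.com/video/BV12g4y1m7Uw)!**
Few-shot fine-tuning demo with unseen speakers: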
https://github.com/RVC-Boss/GPT-SoVITS/assets/129054828/05bee1fa-bdd8-4d85-9350-80c060ab47fb
**User guide: [简体中文](https://www.yuque.com/baicaigongchang1145haoyuangong/ib3g1e) | [English](https://rentry.co/GPT-SoVITS-guide#/)**
## Installation
### Tested Environments
| Python Version | PyTorch Version | Device |
| -------------- | ---------------- | ------------- |
| Python 3.10 | PyTorch 2.5.1 | CUDA 12.4 |
| Python 3.11 | PyTorch 2.5.1 | CUDA 12.4 |
| Python 3.11 | PyTorch 2.7.0 | CUDA 12.8 |
| Python 3.9 | PyTorch 2.8.0dev | CUDA 12.8 |
| Python 3.9 | PyTorch 2.5.1 | Apple silicon |
| Python 3.11 | PyTorch 2.7.0 | Apple silicon |
| Python 3.9 | PyTorch 2.2.2 | CPU |
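To compare a local setup against the table above, a small script can print the same three columns. This is a minimal sketch, not part of GPT-SoVITS: `describe_environment` is a hypothetical helper, and it assumes only that PyTorch, if installed, is importable as `torch`.

```python
import platform


def describe_environment() -> dict:
    """Collect the values the table lists: Python version, PyTorch version, device."""
    info = {"python": platform.python_version(), "pytorch": None, "device": "CPU"}
    try:
        import torch  # optional: only available once the dependencies are installed
    except ImportError:
        return info
    info["pytorch"] = torch.__version__
    if torch.cuda.is_available():
        # torch.version.cuda is the CUDA version PyTorch was built against.
        info["device"] = f"CUDA {torch.version.cuda or 'unknown'}"
    elif getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available():
        info["device"] = "Apple silicon (MPS)"
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If PyTorch is not installed yet, the `pytorch` entry stays `None` and the device is reported as `CPU`.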
### Windows
If you are a Windows user (tested with win>=10), you can [download the integrated package](https://huggingface.co/lj1995/GPT-SoVITS-windows-package/resolve/main/GPT-SoVITS-v3lora-20250228.7z?download=true), extract it, and double-click _go-webui.bat_ to start GPT-SoVITS-WebUI.
```pwsh
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
pwsh -F install.ps1 --Device <CU126|CU128|CPU> --Source <HF|HF-Mirror|ModelScope> [--DownloadUVR5]
```
### Linux
```bash
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
bash install.sh --device <CU126|CU128|ROCM|CPU> --source <HF|HF-Mirror|ModelScope> [--download-uvr5]
```
### macOS
**Note: Models trained on a Mac GPU are lower in quality than models trained on other platforms. Until this is resolved, training on macOS uses the CPU.**
Run the following commands to install this project:
```bash
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
bash install.sh --device <MPS|CPU> --source <HF|HF-Mirror|ModelScope> [--download-uvr5]
```
### Manual Installation
#### Install Dependencies
```bash
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
pip install -r extra-req.txt --no-deps
pip install -r requirements.txt
```
#### Install FFmpeg
##### Conda Users
```bash
conda activate GPTSoVits
conda install ffmpeg
```
##### Ubuntu/Debian Users
```bash
sudo apt install ffmpeg
sudo apt install libsox-dev
```
##### Windows Users
Place [ffmpeg.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffmpeg.exe) and [ffprobe.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffprobe.exe) in the GPT-SoVITS root directory.
Install [Visual Studio 2017](https://aka.ms/vs/17/release/vc_redist.x86.exe).
##### macOS Users
```bash
brew install ffmpeg
```
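### Running GPT-SoVITS with Docker
#### Docker Image Selection
Because the codebase is updated quickly while Docker images are released more slowly, keep the following in mind:
- Check [Docker Hub](https://hub.docker.com/r/xxxxrt666/gpt-sovits) for the latest image tags.
- Choose the image tag that fits your environment.
- `Lite` means the Docker image does **not include** the ASR models and UVR5 models. You need to download the UVR5 models yourself, while the program downloads the ASR models automatically when they are needed.
- When you run Docker Compose, the image for your architecture (amd64 or arm64) is pulled automatically.
- Docker Compose mounts **all files** in the current directory. Before using the Docker image, switch to the project root directory and **update the code to the latest version**.
- Optional: to pick up the latest changes, build the image locally with the provided Dockerfile.
#### Environment Variables
- `is_half`: Controls whether half precision (fp16) is used. If your GPU supports it, set it to `true` to reduce memory usage.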
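Environment variables always arrive as strings, so a value such as `false` has to be converted explicitly before it can act as a flag. The sketch below illustrates that pitfall; `env_flag` is a hypothetical helper and the accepted spellings are an assumption, not the parsing code GPT-SoVITS actually uses.

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean flag."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    # bool("false") is True in Python, so compare against known spellings instead.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


if __name__ == "__main__":
    os.environ["is_half"] = "false"
    print(bool(os.environ["is_half"]))  # True: any non-empty string is truthy
    print(env_flag("is_half"))          # False
```

Passing the raw string through `bool()` would silently enable half precision even when the variable is set to `false`.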
#### Shared Memory Configuration
On Windows (Docker Desktop), the default shared memory size is small and can cause unexpected behavior. Increase `shm_size` in the Docker Compose file (for example, to `16g`) according to your available system memory.
#### Choosing a Service
The `docker-compose.yaml` file defines two types of services:
- `GPT-SoVITS-CU126` and `GPT-SoVITS-CU128`: Full version with all features.
- `GPT-SoVITS-CU126-Lite` and `GPT-SoVITS-CU128-Lite`: Lightweight version with reduced dependencies.
To run a specific service with Docker Compose, use:
```bash
docker compose run --service-ports <GPT-SoVITS-CU126-Lite|GPT-SoVITS-CU128-Lite|GPT-SoVITS-CU126|GPT-SoVITS-CU128>
```
#### Building the Docker Image Locally
To build the image yourself, use:
```bash
bash docker_build.sh --cuda <12.6|12.8> [--lite]
```
#### Accessing the Running Container (Bash Shell)
Once the container is running in the background, you can open a shell in it with:
```bash
docker exec -it <GPT-SoVITS-CU126-Lite|GPT-SoVITS-CU128-Lite|GPT-SoVITS-CU126|GPT-SoVITS-CU128> bash
```
## Pretrained Models
**If `install.sh` runs successfully, you may skip steps 1, 2, and 3.**
1. Download the pretrained models from [GPT-SoVITS Models](https://huggingface.co/lj1995/GPT-SoVITS) and place them in `GPT_SoVITS/pretrained_models`.
2. Download the G2PW model from [G2PWModel.zip(HF)](https://huggingface.co/XXXXRT/GPT-SoVITS-Pretrained/resolve/main/G2PWModel.zip)| [G2PWModel.zip(ModelScope)](https://www.modelscope.cn/models/XXXXRT/GPT-SoVITS-Pretrained/resolve/master/G2PWModel.zip), unzip it, rename the folder to `G2PWModel`, and place it in `GPT_SoVITS/text`. (Chinese TTS only)
3. For UVR5 (vocal/accompaniment separation and reverberation removal, optional), download the models from [UVR5 Weights](https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main/uvr5_weights) and place them in `tools/uvr5/uvr5_weights`.
   - To use bs_roformer or mel_band_roformer models with UVR5, you can manually download the model and its configuration file and put them in `tools/uvr5/uvr5_weights`. **The model file and the configuration file must have the same name apart from the extension**. In addition, the model and configuration file names **must contain "roformer"** to be recognized as a model of the roformer class.
   - It is best to **state the model type directly in the model name and configuration file name**, for example mel_band_roformer or bs_roformer. If the type is not stated, it is determined by comparing features in the configuration file. For example, the model `bs_roformer_ep_368_sdr_12.9628.ckpt` and its configuration file `bs_roformer_ep_368_sdr_12.9628.yaml` are a pair, and `kim_mel_band_roformer.ckpt` and `kim_mel_band_roformer.yaml` are also a pair.
4. For Chinese ASR (optional), download the models from [Damo ASR Model](https://modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/files), [Damo VAD Model](https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/files), and [Damo Punc Model](https://modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/files) and place them in `tools/asr/models`.
5. For English or Japanese ASR (optional), download the model from [Faster Whisper Large V3](https://huggingface.co/Systran/faster-whisper-large-v3) and place it in `tools/asr/models`. [Other models](https://huggingface.co/Systran) may give a similar result with a smaller disk footprint.
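The naming rules for roformer models in step 3 can be checked before a model is copied into the weights folder. The following is a minimal sketch of those rules as stated in this README; `check_roformer_pair` is a hypothetical helper, and the case-insensitive comparison is an assumption rather than the behavior of the actual loader.

```python
from pathlib import Path
from typing import List


def check_roformer_pair(model_path: str, config_path: str) -> List[str]:
    """Return the naming problems of a model/config pair (empty list if none)."""
    model, config = Path(model_path), Path(config_path)
    stem = model.stem.lower()
    problems = []
    # Rule 1: same name apart from the extension.
    if model.stem != config.stem:
        problems.append("model and config names differ (ignoring the extension)")
    # Rule 2: the name must contain "roformer".
    if "roformer" not in stem:
        problems.append('file name does not contain "roformer"')
    # Recommendation: state the model type in the name.
    if not any(kind in stem for kind in ("bs_roformer", "mel_band_roformer")):
        problems.append("model type is not stated in the name")
    return problems


if __name__ == "__main__":
    print(check_roformer_pair("bs_roformer_ep_368_sdr_12.9628.ckpt",
                              "bs_roformer_ep_368_sdr_12.9628.yaml"))
```

The two example pairs from step 3 pass all three checks, so the function returns an empty list for them.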
## Dataset Format
The annotation `.list` file format for text-to-speech (TTS):
```
vocal_path|speaker_name|language|text
```
Language dictionary:
- 'zh': Chinese
- 'ja': Japanese
- 'en': English
Example:
```
D:\GPT-SoVITS\xxx/xxx.wav|xxx|en|I like playing Genshin.
```
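A line in this format can be split into its four fields with a few lines of Python. This is an illustrative sketch: `parse_list_line` and `Annotation` are hypothetical names, the language table contains only the three codes listed above, and rejecting any other code is an assumption made for the example.

```python
from typing import NamedTuple

# Only the codes listed in the language dictionary above.
LANGUAGES = {"zh": "Chinese", "ja": "Japanese", "en": "English"}


class Annotation(NamedTuple):
    vocal_path: str
    speaker_name: str
    language: str
    text: str


def parse_list_line(line: str) -> Annotation:
    """Parse one `vocal_path|speaker_name|language|text` line."""
    # Split at most three times so a "|" inside the text stays in the text field.
    parts = line.rstrip("\n").split("|", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got {len(parts)}: {line!r}")
    entry = Annotation(*parts)
    if entry.language not in LANGUAGES:
        raise ValueError(f"unknown language code: {entry.language!r}")
    return entry


if __name__ == "__main__":
    print(parse_list_line(r"D:\GPT-SoVITS\xxx/xxx.wav|xxx|en|I like playing Genshin."))
```

Limiting the split to three separators keeps any later `|` characters as part of the text.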
## Fine-tuning and Inference
### Open the WebUI
#### Integrated Package Users
Double-click `go-webui.bat` or use `go-webui.ps1`.
To switch to V1, double-click `go-webui-v1.bat` or use `go-webui-v1.ps1`.
#### Others
```bash
python webui.py <language(optional)>
```
To switch to V1:
```bash
python webui.py v1 <language(optional)>
```
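Alternatively, switch the version manually in the WebUI.
### Fine-tuning
#### Path auto-filling is supported
1. Enter the audio path.
2. Slice the audio into small chunks.
3. Denoise (optional).
4. Run ASR.
5. Proofread the ASR transcriptions.
6. Go to the next tab and fine-tune the model.
### Open the Inference WebUI
#### Integrated Package Users
Double-click `go-webui-v2.bat` or use `go-webui-v2.ps1`, then open the inference WebUI at `1-GPT-SoVITS-TTS/1C-inference`.
#### Others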
```bash
python GPT_SoVITS/inference_webui.py <언어(옵션)>
```
or
```bash
python webui.py
```
Then open the inference WebUI at `1-GPT-SoVITS-TTS/1C-inference`.
## V2 Release Notes
New features:
1. Support for Korean and Cantonese
2. An optimized text frontend
3. Pretrained model extended from 2k hours to 5k hours
4. Improved synthesis quality for low-quality reference audio
[More details](<https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v2%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)>)
To use V2 from a V1 environment:
1. Run `pip install -r requirements.txt` to update some packages.
2. Clone the latest code from GitHub.
3. Download the V2 pretrained models from [huggingface](https://huggingface.co/lj1995/GPT-SoVITS/tree/main/gsv-v2final-pretrained) and put them into `GPT_SoVITS/pretrained_models/gsv-v2final-pretrained`.
Additional for Chinese V2: [G2PWModel.zip(HF)](https://huggingface.co/XXXXRT/GPT-SoVITS-Pretrained/resolve/main/G2PWModel.zip)| [G2PWModel.zip(ModelScope)](https://www.modelscope.cn/models/XXXXRT/GPT-SoVITS-Pretrained/resolve/master/G2PWModel.zip) (download the G2PW model, unzip it, rename the folder to `G2PWModel`, and place it in `GPT_SoVITS/text`).
## V3 Release Notes
New features:
1. Timbre similarity is higher, so less training data is needed for the target speaker (timbre similarity improves significantly even when the base model is used directly without fine-tuning).
2. The GPT model is more stable, with fewer repetitions and omissions, and it is easier to generate speech with richer emotional expression.
[More details](<https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v3%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)>)
To use V3 from a V2 environment:
1. Run `pip install -r requirements.txt` to update some packages.
2. Clone the latest code from GitHub.
3. Download the V3 pretrained models (s1v3.ckpt, s2Gv3.pth, and the models--nvidia--bigvgan_v2_24khz_100band_256x folder) from [huggingface](https://huggingface.co/lj1995/GPT-SoVITS/tree/main) and put them into `GPT_SoVITS/pretrained_models`.
Additional: for the audio super-resolution model, see [how to download](../../tools/AP_BWE_main/24kto48k/readme.txt).
## V4 Release Notes
New features:
1. **V4 fixes the metallic artifacts in V3 caused by non-integer-multiple upsampling and outputs 48kHz audio by default to prevent muffled sound (V3 supports only 24kHz by default)**. The author considers V4 a direct replacement for V3, although more testing is still needed.
   [More details](<https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v3v4%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)>)
To switch to V4 from a V1/V2/V3 environment:
1. Run `pip install -r requirements.txt` to update some dependency packages.
2. Clone the latest code from GitHub.
3. Download the V4 pretrained models (`gsv-v4-pretrained/s2v4.ckpt` and `gsv-v4-pretrained/vocoder.pth`) from [huggingface](https://huggingface.co/lj1995/GPT-SoVITS/tree/main) and put them into `GPT_SoVITS/pretrained_models`.
## V2Pro Release Notes
New features:
1. **VRAM usage is slightly higher than V2, but performance is better than V4, while hardware cost and speed stay at the V2 level**.
   [More details](<https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90features-(%E5%90%84%E7%89%88%E6%9C%AC%E7%89%B9%E6%80%A7)>)
2. V1/V2 and the V2Pro series share similar characteristics, and V3/V4 are likewise similar to each other. On training sets with low average audio quality, V1/V2/V2Pro give good results but V3/V4 do not. In addition, the timbre synthesized by V3/V4 is closer to the reference audio than to the training set as a whole.
To switch to V2Pro from a V1/V2/V3/V4 environment:
1. Run `pip install -r requirements.txt` to update some dependency packages.
2. Clone the latest code from GitHub.
3. Download the V2Pro pretrained models (`v2Pro/s2Dv2Pro.pth`, `v2Pro/s2Gv2Pro.pth`, `v2Pro/s2Dv2ProPlus.pth`, `v2Pro/s2Gv2ProPlus.pth`, and `sv/pretrained_eres2netv2w24s4ep4.ckpt`) from [huggingface](https://huggingface.co/lj1995/GPT-SoVITS/tree/main) and put them into `GPT_SoVITS/pretrained_models`.
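## Todo List
- [x] **High priority:**
  - [x] Localization in Japanese and English.
  - [x] User guide.
  - [x] Fine-tune training on Japanese and English datasets.
- [ ] **Features:**
  - [x] Zero-shot voice conversion (5 s) / few-shot voice conversion (1 min).
  - [x] TTS speed control.
  - [ ] ~~Enhanced TTS emotion control.~~
  - [ ] Experiment with changing the SoVITS token inputs to a probability distribution over the vocabulary.
  - [x] Improve the English and Japanese text frontend.
  - [ ] Develop small and large TTS models.
  - [x] Colab scripts.
  - [ ] Expand the training dataset (from 2k hours to 10k hours).
  - [x] Better SoVITS base model (enhanced audio quality).
  - [ ] Model blending.
## (Additional) Running from the Command Line
Use the command line to open the WebUI for UVR5: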
```bash
python tools/uvr5/webui.py "<infer_device>" <is_half> <webui_port_uvr5>
```
<!-- 브라우저를 열 수 없는 경우 UVR 처리를 위해 아래 형식을 따르십시오. 이는 오디오 처리를 위해 mdxnet을 사용하는 것입니다.
```
python mdxnet.py --model --input_root --output_vocal --output_ins --agg_level --format --device --is_half_precision
``` -->
This is how to slice the dataset audio from the command line:
```bash
python audio_slicer.py \
--input_path "<path_to_original_audio_file_or_directory>" \
--output_root "<directory_where_subdivided_audio_clips_will_be_saved>" \
--threshold <volume_threshold> \
--min_length <minimum_duration_of_each_subclip> \
--min_interval <shortest_time_gap_between_adjacent_subclips> \
--hop_size <step_size_for_computing_volume_curve>
```
This is how to run dataset ASR processing from the command line (Chinese only):
```bash
python tools/asr/funasr_asr.py -i <input> -o <output>
```
ASR processing for languages other than Chinese is performed through Faster_Whisper.
(There are no progress bars, and GPU performance may cause delays.)
```bash
python ./tools/asr/fasterwhisper_asr.py -i <input> -o <output> -l <language> -p <precision>
```
A custom list save path is enabled.
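The two ASR commands above differ only in the script and in the extra `-l` and `-p` flags, so a wrapper can choose between them by language. This is a sketch under stated assumptions: `asr_command` is a hypothetical helper, it uses only the script paths and flags shown above, and the valid values for `-p` are not listed in this README, so the caller must supply one.

```python
import shlex
import sys
from typing import List, Optional


def asr_command(input_dir: str, output_dir: str, language: str,
                precision: Optional[str] = None) -> List[str]:
    """Build the ASR command line from this README for the given language."""
    if language == "zh":
        # Chinese goes through FunASR, which takes only input and output here.
        return [sys.executable, "tools/asr/funasr_asr.py", "-i", input_dir, "-o", output_dir]
    if precision is None:
        # Design choice of this sketch: always pass -p explicitly for Faster Whisper.
        raise ValueError("a precision value for -p is required for Faster Whisper")
    return [sys.executable, "tools/asr/fasterwhisper_asr.py",
            "-i", input_dir, "-o", output_dir, "-l", language, "-p", precision]


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to subprocess.run to execute.
    print(shlex.join(asr_command("raw_audio", "asr_output", "zh")))
```

Building the command as a list avoids quoting problems when the input or output path contains spaces.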
## Credits
Special thanks to the following projects and contributors:
### Theoretical Research
- [ar-vits](https://github.com/innnky/ar-vits)
- [SoundStorm](https://github.com/yangdongchao/SoundStorm/tree/master/soundstorm/s1/AR)
- [vits](https://github.com/jaywalnut310/vits)
- [TransferTTS](https://github.com/hcy71o/TransferTTS/blob/master/models.py#L556)
- [contentvec](https://github.com/auspicious3000/contentvec/)
- [hifi-gan](https://github.com/jik876/hifi-gan)
- [fish-speech](https://github.com/fishaudio/fish-speech/blob/main/tools/llama/generate.py#L41)
- [f5-TTS](https://github.com/SWivid/F5-TTS/blob/main/src/f5_tts/model/backbones/dit.py)
- [shortcut flow matching](https://github.com/kvfrans/shortcut-models/blob/main/targets_shortcut.py)
### Pretrained Models
- [Chinese Speech Pretrain](https://github.com/TencentGameMate/chinese_speech_pretrain)
- [Chinese-Roberta-WWM-Ext-Large](https://huggingface.co/hfl/chinese-roberta-wwm-ext-large)
- [BigVGAN](https://github.com/NVIDIA/BigVGAN)
- [eresnetv2](https://modelscope.cn/models/iic/speech_eres2netv2w24s4ep4_sv_zh-cn_16k-common)
### Text Frontend for Inference
- [paddlespeech zh_normalization](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/paddlespeech/t2s/frontend/zh_normalization)
- [split-lang](https://github.com/DoodleBears/split-lang)
- [g2pW](https://github.com/GitYCC/g2pW)
- [pypinyin-g2pW](https://github.com/mozillazg/pypinyin-g2pW)
- [paddlespeech g2pw](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/paddlespeech/t2s/frontend/g2pw)
### WebUI Tools
- [ultimatevocalremovergui](https://github.com/Anjok07/ultimatevocalremovergui)
- [audio-slicer](https://github.com/openvpi/audio-slicer)
- [SubFix](https://github.com/cronrpc/SubFix)
- [FFmpeg](https://github.com/FFmpeg/FFmpeg)
- [gradio](https://github.com/gradio-app/gradio)
- [faster-whisper](https://github.com/SYSTRAN/faster-whisper)
- [FunASR](https://github.com/alibaba-damo-academy/FunASR)
- [AP-BWE](https://github.com/yxlu-0102/AP-BWE)
Thanks to @Naozumi520 for providing the Cantonese training set and for guidance on Cantonese-related knowledge.
## Thanks to all contributors for their efforts ;)
<a href="https://github.com/RVC-Boss/GPT-SoVITS/graphs/contributors" target="_blank">
<img src="https://contrib.rocks/image?repo=RVC-Boss/GPT-SoVITS" />
</a>

View File

@@ -0,0 +1,580 @@
# Changelog
## 202401
- 2024.01.21 [PR#108](https://github.com/RVC-Boss/GPT-SoVITS/pull/108)
- Content: Added English translation support to the WebUI for English-language systems.
- Type: Documentation
- Contributor: D3lik
- 2024.01.21 [Commit#7b89c9ed](https://github.com/RVC-Boss/GPT-SoVITS/commit/7b89c9ed5669f63c4ed6ae791408969640bdcf3e)
- Content: Attempted to fix the ZeroDivisionError raised during SoVITS training.
- Type: Fix
- Contributor: RVC-Boss, Tybost
- Related: [Issue#79](https://github.com/RVC-Boss/GPT-SoVITS/issues/79)
- 2024.01.21 [Commit#ea62d6e0](https://github.com/RVC-Boss/GPT-SoVITS/commit/ea62d6e0cf1efd75287766ea2b55d1c3b69b4fd3)
- Content: Greatly reduced the problem of synthesized audio containing the end of the reference audio.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.21 [Commit#a87ad522](https://github.com/RVC-Boss/GPT-SoVITS/commit/a87ad5228ed2d729da42019ae1b93171f6a745ef)
- Content: `cmd-asr.py` now checks whether the FunASR model exists in the default directory and downloads it from ModelScope if it does not.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.21 [Commit#f6147116](https://github.com/RVC-Boss/GPT-SoVITS/commit/f61471166c107ba56ccb7a5137fa9d7c09b2830d)
- Content: Added the `is_share` parameter to `Config.py`; setting it to `True` maps the WebUI to the public network.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.21 [Commit#102d5081](https://github.com/RVC-Boss/GPT-SoVITS/commit/102d50819e5d24580d6e96085b636b25533ecc7f)
- Content: Cleaned up cached audio files and other files in the `TEMP` folder.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.22 [Commit#872134c8](https://github.com/RVC-Boss/GPT-SoVITS/commit/872134c846bcb8f1909a3f5aff68a6aa67643f68)
- Content: Fixed an issue where very short output files returned a repeat of the reference audio.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.22 Tested native support for English and Japanese training (for Japanese training, the root directory must not contain non-English special characters).
- 2024.01.22 [PR#124](https://github.com/RVC-Boss/GPT-SoVITS/pull/124)
- Content: Improved audio path checking. Reading from an incorrect input path now reports that the path does not exist instead of raising an FFmpeg error.
- Type: Optimization
- Contributor: xmimu
- 2024.01.23 [Commit#93c47cd9](https://github.com/RVC-Boss/GPT-SoVITS/commit/93c47cd9f0c53439536eada18879b4ec5a812ae1)
- Content: Fixed an issue where NaN values from HuBERT feature extraction caused a ZeroDivisionError during SoVITS/GPT training.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.23 [Commit#80fffb0a](https://github.com/RVC-Boss/GPT-SoVITS/commit/80fffb0ad46e4e7f27948d5a57c88cf342088d50)
- Content: Replaced `jieba` with `jieba_fast` for Chinese word segmentation.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.23 [Commit#63625758](https://github.com/RVC-Boss/GPT-SoVITS/commit/63625758a99e645f3218dd167924e01a0e3cf0dc)
- Content: Optimized the model file sorting logic.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.23 [Commit#0c691191](https://github.com/RVC-Boss/GPT-SoVITS/commit/0c691191e894c15686e88279745712b3c6dc232f)
- Content: Added support for quick model switching in the inference WebUI.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.25 [Commit#249561e5](https://github.com/RVC-Boss/GPT-SoVITS/commit/249561e5a18576010df6587c274d38cbd9e18b4b)
- Content: Removed redundant logs from the inference WebUI.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.25 [PR#183](https://github.com/RVC-Boss/GPT-SoVITS/pull/183), [PR#200](https://github.com/RVC-Boss/GPT-SoVITS/pull/200)
- Content: Added training and inference support on Mac.
- Type: Feature
- Contributor: Lion-Wu
- 2024.01.26 [Commit#813cf96e](https://github.com/RVC-Boss/GPT-SoVITS/commit/813cf96e508ba1bb2c658f38c7cc77b797fb4082), [Commit#2d1ddeca](https://github.com/RVC-Boss/GPT-SoVITS/commit/2d1ddeca42db90c3fe2d0cd79480fd544d87f02b)
- Content: Fixed an issue where UVR5 exited automatically when it read a directory.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.26 [PR#204](https://github.com/RVC-Boss/GPT-SoVITS/pull/204)
- Content: Added support for mixed Chinese-English and Japanese-English output text.
- Type: Feature
- Contributor: Kakaru Hayate
- 2024.01.26 [Commit#f4148cf7](https://github.com/RVC-Boss/GPT-SoVITS/commit/f4148cf77fb899c22bcdd4e773d2f24ab34a73e7)
- Content: Added an optional segmentation mode for output.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.26 [Commit#9fe955c1](https://github.com/RVC-Boss/GPT-SoVITS/commit/9fe955c1bf5f94546c9f699141281f2661c8a180)
- Content: Fixed an issue where multiple consecutive line breaks caused inference errors.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.26 [Commit#84ee4719](https://github.com/RVC-Boss/GPT-SoVITS/commit/84ee471936b332bc2ccee024d6dfdedab4f0dc7b)
- Content: Single precision is now forced automatically on GPUs that do not support half precision, and always for CPU inference.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.01.28 [PR#238](https://github.com/RVC-Boss/GPT-SoVITS/pull/238)
- Content: Completed the model download process in the Dockerfile.
- Type: Fix
- Contributor: breakstring
- 2024.01.28 [PR#257](https://github.com/RVC-Boss/GPT-SoVITS/pull/257)
- Content: Fixed an issue where numbers were converted to Chinese characters for pronunciation.
- Type: Fix
- Contributor: duliangang
- 2024.01.28 [Commit#f0cfe397](https://github.com/RVC-Boss/GPT-SoVITS/commit/f0cfe397089a6fd507d678c71adeaab5e7ed0683)
- Content: Fixed an issue where checkpoints were not saved during GPT training.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.28 [Commit#b8ae5a27](https://github.com/RVC-Boss/GPT-SoVITS/commit/b8ae5a2761e2654fc0c905498009d3de9de745a8)
- Content: Added restrictions to exclude unreasonable reference audio lengths.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.28 [Commit#698e9655](https://github.com/RVC-Boss/GPT-SoVITS/commit/698e9655132d194b25b86fbbc99d53c8d2cea2a3)
- Content: Fixed an issue where a few characters at the beginning of a sentence were swallowed.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.29 [Commit#ff977a5f](https://github.com/RVC-Boss/GPT-SoVITS/commit/ff977a5f5dc547e0ad82b9e0f1cd95fbc830b2b0)
- Content: Changed training configurations to single precision for GPUs that have problems with half-precision training, such as the 16 series.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.29 [Commit#172e139f](https://github.com/RVC-Boss/GPT-SoVITS/commit/172e139f45ac26723bc2cf7fac0112f69d6b46ec)
- Content: Tested and updated a working Colab version.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.29 [PR#135](https://github.com/RVC-Boss/GPT-SoVITS/pull/135)
- Content: Updated FunASR to version 1.0 and fixed errors caused by interface incompatibility.
- Type: Fix
- Contributor: LauraGPT
- 2024.01.30 [Commit#1c2fa98c](https://github.com/RVC-Boss/GPT-SoVITS/commit/1c2fa98ca8c325dcfb32797d22ff1c2a726d1cb4)
- Content: Fixed splitting issues with Chinese and English punctuation, and added punctuation at the beginning and end of sentences.
- Type: Fix
- Contributor: RVC-Boss
- 2024.01.30 [Commit#74409f35](https://github.com/RVC-Boss/GPT-SoVITS/commit/74409f3570fa1c0ff28d4c65c288a6ce58ca00d2)
- Content: Added support for splitting by punctuation.
- Type: Feature
- Contributor: RVC-Boss
- 2024.01.30 [Commit#c42eeccf](https://github.com/RVC-Boss/GPT-SoVITS/commit/c42eeccfdd2d0a0d714ecc8bfc22a12373aca6b7)
- Content: Double quotes are now removed automatically from all path inputs, so new users who copy paths with surrounding quotes no longer hit errors.
- Type: Fix
- Contributor: RVC-Boss
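
The 2024.01.30 entry about stripping double quotes from path inputs can be illustrated with a small sketch. This is not the project's code: the helper name `clean_path` is hypothetical, and it only strips quotes and whitespace that surround the path.

```python
def clean_path(raw: str) -> str:
    """Remove surrounding whitespace and double quotes from a pasted path."""
    return raw.strip().strip('"').strip()

# A path copied from a file manager often arrives wrapped in quotes.
print(clean_path('"/data/my voice clips"'))
```
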
## 202402
- 2024.02.01 [Commit#45f73519](https://github.com/RVC-Boss/GPT-SoVITS/commit/45f73519cc41cd17cf816d8b997a9dcb0bee04b6)
- Content: Fixed a file name saving error when the ASR path ends with `/`.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.03 [Commit#dba1a74c](https://github.com/RVC-Boss/GPT-SoVITS/commit/dba1a74ccb0cf19a1b4eb93faf11d4ec2b1fc5d7)
- Content: Fixed a separation failure caused by a UVR5 format reading error.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.03 [Commit#3ebff70b](https://github.com/RVC-Boss/GPT-SoVITS/commit/3ebff70b71580ee1f97b3238c9442cbc5aef47c7)
- Content: Added automatic segmentation and language recognition for mixed Chinese-Japanese-English text.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.02.03 [PR#377](https://github.com/RVC-Boss/GPT-SoVITS/pull/377)
- Content: Integrated the PaddleSpeech normalizer, fixing the reading of "xx.xx%" (percent sign), "元/吨" being read as "元吨" instead of "元每吨", and underscore errors.
- Type: Optimization
- Contributor: KamioRinn
- 2024.02.05 [PR#395](https://github.com/RVC-Boss/GPT-SoVITS/pull/395)
- Content: Optimized English text frontend processing.
- Type: Optimization
- Contributor: KamioRinn
- 2024.02.06 [Commit#65b463a7](https://github.com/RVC-Boss/GPT-SoVITS/commit/65b463a787f31637b4768cc9a47cab59541d3927)
- Content: Fixed a drop in Chinese inference quality caused by mixed-up language parameters.
- Type: Fix
- Contributor: RVC-Boss
- Related: [Issue#391](https://github.com/RVC-Boss/GPT-SoVITS/issues/391)
- 2024.02.06 [PR#403](https://github.com/RVC-Boss/GPT-SoVITS/pull/403)
- Content: Made UVR5 compatible with newer versions of librosa.
- Type: Fix
- Contributor: StaryLan
- 2024.02.07 [Commit#14a28510](https://github.com/RVC-Boss/GPT-SoVITS/commit/14a285109a521679f8846589c22da8f656a46ad8)
- Content: Fixed a UVR5 `inf` error caused by the `is_half` parameter not being converted to a boolean (this triggered `inf` on 16-series GPUs).
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.07 [Commit#d74f888e](https://github.com/RVC-Boss/GPT-SoVITS/commit/d74f888e7ac86063bfeacef95d0e6ddafe42b3b2)
- Content: Fixed Gradio dependency issues.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.07 [PR#400](https://github.com/RVC-Boss/GPT-SoVITS/pull/400)
- Content: Integrated Faster Whisper ASR for Japanese and English.
- Type: Feature
- Contributor: Shadow
- 2024.02.07 [Commit#6469048d](https://github.com/RVC-Boss/GPT-SoVITS/commit/6469048de12a8d6f0bd05d07f031309e61575a38)~[Commit#94ee71d9](https://github.com/RVC-Boss/GPT-SoVITS/commit/94ee71d9d562d10c9a1b96e745c6a6575aa66a10)
- Content: Added support for automatically reading `.list` file paths when the root directory is left empty during dataset preparation.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.02.08 [Commit#59f35ada](https://github.com/RVC-Boss/GPT-SoVITS/commit/59f35adad85815df27e9c6b33d420f5ebfd8376b)
- Content: Attempted to fix GPT training hanging on Windows 10 1909 and on systems with Traditional Chinese as the system language.
- Type: Fix
- Contributor: RVC-Boss
- Related: [Issue#232](https://github.com/RVC-Boss/GPT-SoVITS/issues/232)
- 2024.02.12 [PR#457](https://github.com/RVC-Boss/GPT-SoVITS/pull/457)
- Content: Added a DPO loss training option (to reduce GPT repetition and skipped characters) and new parameters in the inference WebUI.
- Type: Feature
- Contributor: liufenghua
- 2024.02.12 [Commit#2fa74ecb](https://github.com/RVC-Boss/GPT-SoVITS/commit/2fa74ecb941db27d9015583a9be6962898d66730), [Commit#d82f6bbb](https://github.com/RVC-Boss/GPT-SoVITS/commit/d82f6bbb98ba725e6725dcee99b80ce71fb0bf28)
- Content: Optimized the Faster Whisper and FunASR logic and switched to mirror downloads to avoid Hugging Face connection problems.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.02.15 [Commit#dd2c4d6d](https://github.com/RVC-Boss/GPT-SoVITS/commit/dd2c4d6d7121bf82d29d0f0e4d788f3b231997c8)
- Content: Added support for Chinese experiment names in training (these previously raised an error).
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.15 [Commit#ccb9b08b](https://github.com/RVC-Boss/GPT-SoVITS/commit/ccb9b08be3c58e102defcc94ff4fd609da9e27ee)~[Commit#895fde46](https://github.com/RVC-Boss/GPT-SoVITS/commit/895fde46e420040ed26aaf0c5b7e99359d9b199b)
- Content: DPO training is now optional rather than mandatory. When it is selected, the batch size is halved automatically. Fixed an issue where new parameters were not passed in the inference WebUI.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.02.15 [Commit#7b0c3c67](https://github.com/RVC-Boss/GPT-SoVITS/commit/7b0c3c676495c64b2064aa472bff14b5c06206a5)
- Content: Fixed Chinese frontend bugs.
- Type: Fix
- Contributor: RVC-Boss
- 2024.02.16 [PR#499](https://github.com/RVC-Boss/GPT-SoVITS/pull/499)
- Content: Added support for input without reference text.
- Type: Feature
- Contributor: Watchtower-Liu
- Related: [Issue#475](https://github.com/RVC-Boss/GPT-SoVITS/issues/475)
- 2024.02.17 [PR#509](https://github.com/RVC-Boss/GPT-SoVITS/pull/509), [PR#507](https://github.com/RVC-Boss/GPT-SoVITS/pull/507), [PR#532](https://github.com/RVC-Boss/GPT-SoVITS/pull/532), [PR#556](https://github.com/RVC-Boss/GPT-SoVITS/pull/556), [PR#559](https://github.com/RVC-Boss/GPT-SoVITS/pull/559)
- Content: Optimized Chinese and Japanese frontend processing.
- Type: Optimization
- Contributor: KamioRinn, v3cun
- 2024.02.17 [PR#510](https://github.com/RVC-Boss/GPT-SoVITS/pull/510), [PR#511](https://github.com/RVC-Boss/GPT-SoVITS/pull/511)
- Content: Fixed the Colab public URL issue.
- Type: Fix
- Contributor: ChanningWang2018, RVC-Boss
- 2024.02.21 [PR#557](https://github.com/RVC-Boss/GPT-SoVITS/pull/557)
- Content: Improved inference performance on Mac by using the CPU instead of MPS.
- Type: Optimization
- Contributor: XXXXRT666
- 2024.02.21 [Commit#6da486c1](https://github.com/RVC-Boss/GPT-SoVITS/commit/6da486c15d09e3d99fa42c5e560aaac56b6b4ce1), [Commit#5a171773](https://github.com/RVC-Boss/GPT-SoVITS/commit/5a17177342d2df1e11369f2f4f58d34a3feb1a35)
- Content: Added a noise reduction option to data processing (denoised audio keeps only a 16 kHz sample rate, so use it only when background noise is high).
- Type: Feature
- Contributor: RVC-Boss
- 2024.02.28 [PR#573](https://github.com/RVC-Boss/GPT-SoVITS/pull/573)
- Content: Fixed the `is_half` check so that CPU inference works correctly on Mac.
- Type: Fix
- Contributor: XXXXRT666
- 2024.02.28 [PR#610](https://github.com/RVC-Boss/GPT-SoVITS/pull/610)
- Content: Fixed reversed settings in the UVR5 de-reverb model.
- Type: Fix
- Contributor: Yuze Wang
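
The 2024.02.07 `is_half` entry describes a common pitfall: a command-line argument arrives as a string, and any non-empty string is truthy in Python. The sketch below shows the problem and one way to parse such a value. The helper name `parse_bool` and the accepted spellings are assumptions for illustration, not the fix used in the commit.

```python
def parse_bool(value) -> bool:
    """Convert a command-line string such as "False" into a real boolean."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in ("1", "true", "yes")

raw = "False"             # what a script receives from the command line
naive = bool(raw)         # True, because the string is not empty
parsed = parse_bool(raw)  # False, which is what the caller meant
print(naive, parsed)
```
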
## 202403
- 2024.03.06 [PR#675](https://github.com/RVC-Boss/GPT-SoVITS/pull/675)
- Content: Enabled automatic CPU inference for Faster Whisper when CUDA is unavailable.
- Type: Optimization
- Contributor: ShiroDoMain
- 2024.03.06 [Commit#616be20d](https://github.com/RVC-Boss/GPT-SoVITS/commit/616be20db3cf94f1cd663782fea61b2370704193)
- Content: When using Faster Whisper for non-Chinese ASR, the Chinese FunASR model no longer needs to be downloaded first.
- Type: Optimization
- Contributor: RVC-Boss
- 2024.03.09 [PR#672](https://github.com/RVC-Boss/GPT-SoVITS/pull/672)
- Content: Improved inference speed by 50% (tested on RTX3090 + PyTorch 2.2.1 + CU11.8 + Win10 + Py39).
- Type: Optimization
- Contributor: GoHomeToMacDonal
- 2024.03.10 [PR#721](https://github.com/RVC-Boss/GPT-SoVITS/pull/721)
- Content: Added the fast inference branch 'fast_inference_'.
- Type: Feature
- Contributor: ChasonJiang
- 2024.03.13 [PR#761](https://github.com/RVC-Boss/GPT-SoVITS/pull/761)
- Content: Added CPU training support; training can now run on the CPU on macOS.
- Type: Feature
- Contributor: Lion-Wu
- 2024.03.19 [PR#804](https://github.com/RVC-Boss/GPT-SoVITS/pull/804), [PR#812](https://github.com/RVC-Boss/GPT-SoVITS/pull/812), [PR#821](https://github.com/RVC-Boss/GPT-SoVITS/pull/821)
- Content: Improved the English text frontend.
- Type: Optimization
- Contributor: KamioRinn
- 2024.03.30 [PR#894](https://github.com/RVC-Boss/GPT-SoVITS/pull/894)
- Content: Improved the API format.
- Type: Optimization
- Contributor: KamioRinn
## 202404
- 2024.04.03 [PR#917](https://github.com/RVC-Boss/GPT-SoVITS/pull/917)
- Content: Fixed FFmpeg command string formatting in the UVR5 WebUI.
- Type: Fix
- Contributor: StaryLan
## 202405
- 2024.05.02 [PR#953](https://github.com/RVC-Boss/GPT-SoVITS/pull/953)
- Content: Fixed quality degradation caused by VQ not being frozen during SoVITS training.
- Type: Fix
- Contributor: hcwu1993
- Related: [Issue#747](https://github.com/RVC-Boss/GPT-SoVITS/issues/747)
- 2024.05.19 [PR#1102](https://github.com/RVC-Boss/GPT-SoVITS/pull/1102)
- Content: Added an error message for unsupported languages during training data processing.
- Type: Optimization
- Contributor: StaryLan
- 2024.05.27 [PR#1132](https://github.com/RVC-Boss/GPT-SoVITS/pull/1132)
- Content: Fixed a HuBERT extraction bug.
- Type: Fix
- Contributor: XXXXRT666
## 202406
- 2024.06.06 [Commit#99f09c8b](https://github.com/RVC-Boss/GPT-SoVITS/commit/99f09c8bdc155c1f4272b511940717705509582a)
- Content: Fixed inference inconsistency and quality degradation caused by the BERT features of Chinese text not being read during GPT fine-tuning in the WebUI.
**Warning: If you previously fine-tuned with a large amount of data, retraining the model is recommended to improve quality.**
- Type: Fix
- Contributor: RVC-Boss
- 2024.06.07 [PR#1159](https://github.com/RVC-Boss/GPT-SoVITS/pull/1159)
- Content: Fixed the SoVITS training progress bar logic in `s2_train.py`.
- Type: Fix
- Contributor: pengzhendong
- 2024.06.10 [Commit#501a74ae](https://github.com/RVC-Boss/GPT-SoVITS/commit/501a74ae96789a26b48932babed5eb4e9483a232)
- Content: Fixed string formatting so that UVR5 MDXNet's FFmpeg calls work with paths that contain spaces.
- Type: Fix
- Contributor: RVC-Boss
- 2024.06.10 [PR#1168](https://github.com/RVC-Boss/GPT-SoVITS/pull/1168), [PR#1169](https://github.com/RVC-Boss/GPT-SoVITS/pull/1169)
- Content: Improved handling of text input that consists only of punctuation or contains multiple punctuation marks.
- Type: Fix
- Contributor: XXXXRT666
- Related: [Issue#1165](https://github.com/RVC-Boss/GPT-SoVITS/issues/1165)
- 2024.06.13 [Commit#db506705](https://github.com/RVC-Boss/GPT-SoVITS/commit/db50670598f0236613eefa6f2d5a23a271d82041)
- Content: Fixed an issue where the default batch size was a decimal (non-integer) value in CPU inference.
- Type: Fix
- Contributor: RVC-Boss
- 2024.06.28 [PR#1258](https://github.com/RVC-Boss/GPT-SoVITS/pull/1258), [PR#1265](https://github.com/RVC-Boss/GPT-SoVITS/pull/1265), [PR#1267](https://github.com/RVC-Boss/GPT-SoVITS/pull/1267)
- Content: Fixed an issue where an exception during noise reduction or ASR caused all pending audio files to be skipped.
- Type: Fix
- Contributor: XXXXRT666
- 2024.06.29 [Commit#a208698e](https://github.com/RVC-Boss/GPT-SoVITS/commit/a208698e775155efc95b187b746d153d0f2847ca)
- Content: Fixed the multi-process save logic in multi-GPU training.
- Type: Fix
- Contributor: RVC-Boss
- 2024.06.29 [PR#1251](https://github.com/RVC-Boss/GPT-SoVITS/pull/1251)
- Content: Removed the redundant `my_utils.py` file.
- Type: Optimization
- Contributor: aoguai
- Related: [Issue#1189](https://github.com/RVC-Boss/GPT-SoVITS/issues/1189)
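
The 2024.06.10 entry about FFmpeg calls and paths with spaces points at a general problem with commands built as one formatted string. One common way to avoid it, which is not necessarily what the commit does, is to pass the arguments as a list so that each path stays a single argument. The function name `build_ffmpeg_cmd` and the file names are hypothetical.

```python
import shlex

def build_ffmpeg_cmd(src: str, dst: str) -> list:
    """Build an FFmpeg argument list; each path stays one argument even with spaces."""
    # -y overwrites the output, -i names the input, -vn drops any video stream.
    return ["ffmpeg", "-y", "-i", src, "-vn", dst]

cmd = build_ffmpeg_cmd("my songs/input file.mp3", "out dir/output.wav")
printable = shlex.join(cmd)  # quoted form, safe to log or paste into a shell
print(printable)
```

The list can be handed to `subprocess.run(cmd, check=True)` without `shell=True`, so no manual quoting is needed.
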
## 202407
- 2024.07.06 [PR#1253](https://github.com/RVC-Boss/GPT-SoVITS/pull/1253)
- Content: Fixed an issue where decimal numbers were split when splitting text by punctuation.
- Type: Fix
- Contributor: aoguai
- 2024.07.06 [Commit#b0786f29](https://github.com/RVC-Boss/GPT-SoVITS/commit/b0786f2998f1b2fce6678434524b4e0e8cc716f5)
- Content: Verified the accelerated inference code and merged it into the main branch. It guarantees the same inference results as the base version and also supports accelerated inference in the mode without reference text.
- Type: Optimization
- Contributor: RVC-Boss, GoHomeToMacDonal
- Related: [PR#672](https://github.com/RVC-Boss/GPT-SoVITS/pull/672)
- 2024.07.13 [PR#1294](https://github.com/RVC-Boss/GPT-SoVITS/pull/1294), [PR#1298](https://github.com/RVC-Boss/GPT-SoVITS/pull/1298)
- Content: Refactored i18n scanning and updated the multilingual configuration files.
- Type: Documentation
- Contributor: StaryLan
- 2024.07.13 [PR#1299](https://github.com/RVC-Boss/GPT-SoVITS/pull/1299)
- Content: Fixed command-line errors caused by trailing slashes in user file paths.
- Type: Fix
- Contributor: XXXXRT666
- 2024.07.19 [PR#756](https://github.com/RVC-Boss/GPT-SoVITS/pull/756)
- Content: Fixed inconsistent training steps when using the custom `bucket_sampler` in GPT training.
- Type: Fix
- Contributor: huangxu1991
- 2024.07.23 [Commit#9588a3c5](https://github.com/RVC-Boss/GPT-SoVITS/commit/9588a3c52d9ebdb20b3c5d74f647d12e7c1171c2), [PR#1340](https://github.com/RVC-Boss/GPT-SoVITS/pull/1340)
- Content: Added speaking speed adjustment during synthesis (including an option to fix randomness and control only the speed). The feature was also added to `api.py`.
- Type: Feature
- Contributor: RVC-Boss, 红血球AE3803
- 2024.07.27 [PR#1306](https://github.com/RVC-Boss/GPT-SoVITS/pull/1306), [PR#1356](https://github.com/RVC-Boss/GPT-SoVITS/pull/1356)
- Content: Added support for the BS-RoFormer vocal and accompaniment separation model.
- Type: New Feature
- Contributor: KamioRinn
- 2024.07.27 [PR#1351](https://github.com/RVC-Boss/GPT-SoVITS/pull/1351)
- Content: Improved Chinese text preprocessing.
- Type: New Feature
- Contributor: KamioRinn
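
The 2024.07.06 entry about decimals being split at punctuation can be illustrated with a short regular-expression sketch. It treats a period as a boundary unless digits sit on both sides of it. The function name `split_by_punctuation` and the punctuation set are assumptions, not the project's implementation.

```python
import re

# A period is a boundary unless it has a digit on both sides (as in "3.14").
_BOUNDARY = re.compile(r"([,;?!]|(?<!\d)\.|\.(?!\d))")

def split_by_punctuation(text: str) -> list:
    """Split text after punctuation marks while keeping decimal numbers intact."""
    marked = _BOUNDARY.sub(r"\1\n", text)
    return [part.strip() for part in marked.split("\n") if part.strip()]

print(split_by_punctuation("Pi is 3.14, roughly. Next!"))
```
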
## 202408 (V2 Release)
- 2024.08.01 [PR#1355](https://github.com/RVC-Boss/GPT-SoVITS/pull/1355)
- Content: Paths are now filled in automatically when processing files in the WebUI.
- Type: Chore
- Contributor: XXXXRT666
- 2024.08.01 [Commit#e62e9653](https://github.com/RVC-Boss/GPT-SoVITS/commit/e62e965323a60a76a025bcaa45268c1ddcbcf05c)
- Content: Enabled FP16 inference support for BS-RoFormer.
- Type: Performance Optimization
- Contributor: RVC-Boss
- 2024.08.01 [Commit#bce451a2](https://github.com/RVC-Boss/GPT-SoVITS/commit/bce451a2d1641e581e200297d01f219aeaaf7299), [Commit#4c8b7612](https://github.com/RVC-Boss/GPT-SoVITS/commit/4c8b7612206536b8b4435997acb69b25d93acb78)
- Content: Optimized the GPU detection logic and added user-friendly handling of arbitrary GPU indices entered by users.
- Type: Chore
- Contributor: RVC-Boss
- 2024.08.02 [Commit#ff6c193f](https://github.com/RVC-Boss/GPT-SoVITS/commit/ff6c193f6fb99d44eea3648d82ebcee895860a22)~[Commit#de7ee7c7](https://github.com/RVC-Boss/GPT-SoVITS/commit/de7ee7c7c15a2ec137feb0693b4ff3db61fad758)
- Content: **Added the GPT-SoVITS V2 model.**
- Type: New Feature
- Contributor: RVC-Boss
- 2024.08.03 [Commit#8a101474](https://github.com/RVC-Boss/GPT-SoVITS/commit/8a101474b5a4f913b4c94fca2e3ca87d0771bae3)
- Content: Added Cantonese ASR support using FunASR.
- Type: New Feature
- Contributor: RVC-Boss
- 2024.08.03 [PR#1387](https://github.com/RVC-Boss/GPT-SoVITS/pull/1387), [PR#1388](https://github.com/RVC-Boss/GPT-SoVITS/pull/1388)
- Content: Optimized the UI and timing logic.
- Type: Chore
- Contributor: XXXXRT666
- 2024.08.06 [PR#1404](https://github.com/RVC-Boss/GPT-SoVITS/pull/1404), [PR#987](https://github.com/RVC-Boss/GPT-SoVITS/pull/987), [PR#488](https://github.com/RVC-Boss/GPT-SoVITS/pull/488)
- Content: Optimized polyphonic character handling (V2 only).
- Type: Fix, New Feature
- Contributor: KamioRinn, RVC-Boss
- 2024.08.13 [PR#1422](https://github.com/RVC-Boss/GPT-SoVITS/pull/1422)
- Content: Fixed a bug that allowed only one reference audio file to be uploaded; added dataset validation with warning pop-ups for missing files.
- Type: Fix, Chore
- Contributor: XXXXRT666
- 2024.08.20 [Issue#1508](https://github.com/RVC-Boss/GPT-SoVITS/issues/1508)
- Content: The upstream LangSegment library now uses SSML tags to optimize numbers, phone numbers, dates, and times.
- Type: New Feature
- Contributor: juntaosun
- 2024.08.20 [PR#1503](https://github.com/RVC-Boss/GPT-SoVITS/pull/1503)
- Content: Fixed and optimized the API.
- Type: Fix
- Contributor: KamioRinn
- 2024.08.20 [PR#1490](https://github.com/RVC-Boss/GPT-SoVITS/pull/1490)
- Content: Merged the `fast_inference` branch into the main branch.
- Type: Refactor
- Contributor: ChasonJiang
- 2024.08.21 **GPT-SoVITS V2 officially released.**
## 202502 (V3 Release)
- 2025.02.11 [Commit#ed207c4b](https://github.com/RVC-Boss/GPT-SoVITS/commit/ed207c4b879d5296e9be3ae5f7b876729a2c43b8)~[Commit#6e2b4918](https://github.com/RVC-Boss/GPT-SoVITS/commit/6e2b49186c5b961f0de41ea485d398dffa9787b4)
- Content: **Added the GPT-SoVITS V3 model, which requires 14 GB of VRAM for fine-tuning.**
- Type: New Feature ([Wiki](https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v3%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)) reference)
- Contributor: RVC-Boss
- 2025.02.12 [PR#2032](https://github.com/RVC-Boss/GPT-SoVITS/pull/2032)
- Content: Updated the multilingual project documentation.
- Type: Documentation
- Contributor: StaryLan
- 2025.02.12 [PR#2033](https://github.com/RVC-Boss/GPT-SoVITS/pull/2033)
- Content: Updated the Japanese documentation.
- Type: Documentation
- Contributor: Fyphen
- 2025.02.12 [PR#2010](https://github.com/RVC-Boss/GPT-SoVITS/pull/2010)
- Content: Optimized the attention computation logic.
- Type: Performance Optimization
- Contributor: wzy3650
- 2025.02.12 [PR#2040](https://github.com/RVC-Boss/GPT-SoVITS/pull/2040)
- Content: Added gradient checkpointing support for fine-tuning (requires 12 GB of VRAM).
- Type: New Feature
- Contributor: Kakaru Hayate
- 2025.02.14 [PR#2047](https://github.com/RVC-Boss/GPT-SoVITS/pull/2047), [PR#2062](https://github.com/RVC-Boss/GPT-SoVITS/pull/2062), [PR#2073](https://github.com/RVC-Boss/GPT-SoVITS/pull/2073)
- Content: Switched to a new language segmentation tool, improved the splitting strategy for mixed multilingual text, and optimized number and English handling.
- Type: New Feature
- Contributor: KamioRinn
- 2025.02.23 [Commit#56509a17](https://github.com/RVC-Boss/GPT-SoVITS/commit/56509a17c918c8d149c48413a672b8ddf437495b)~[Commit#514fb692](https://github.com/RVC-Boss/GPT-SoVITS/commit/514fb692db056a06ed012bc3a5bca2a5b455703e)
- Content: **The GPT-SoVITS V3 model now supports LoRA training (fine-tuning requires 8 GB of GPU memory).**
- Type: New Feature
- Contributor: RVC-Boss
- 2025.02.23 [PR#2078](https://github.com/RVC-Boss/GPT-SoVITS/pull/2078)
- Content: Added support for the Mel Band Roformer model for vocal and instrument separation.
- Type: New Feature
- Contributor: Sucial
- 2025.02.26 [PR#2112](https://github.com/RVC-Boss/GPT-SoVITS/pull/2112), [PR#2114](https://github.com/RVC-Boss/GPT-SoVITS/pull/2114)
- Content: Fixed a MeCab error with Chinese paths (specific to Japanese/Korean or multilingual text splitting).
- Type: Fix
- Contributor: KamioRinn
- 2025.02.27 [Commit#92961c3f](https://github.com/RVC-Boss/GPT-SoVITS/commit/92961c3f68b96009ff2cd00ce614a11b6c4d026f)~[Commit#250b1c73](https://github.com/RVC-Boss/GPT-SoVITS/commit/250b1c73cba60db18148b21ec5fbce01fd9d19bc)
- Content: **Added 24 kHz to 48 kHz audio super-resolution models** to alleviate the "muffled" sound when generating 24K audio with the V3 model.
- Type: New Feature
- Contributor: RVC-Boss
- Related: [Issue#2085](https://github.com/RVC-Boss/GPT-SoVITS/issues/2085), [Issue#2117](https://github.com/RVC-Boss/GPT-SoVITS/issues/2117)
- 2025.02.28 [PR#2123](https://github.com/RVC-Boss/GPT-SoVITS/pull/2123)
- Content: Updated the multilingual project documentation.
- Type: Documentation
- Contributor: StaryLan
- 2025.02.28 [PR#2122](https://github.com/RVC-Boss/GPT-SoVITS/pull/2122)
- Content: Applied rule-based detection for short CJK text when the model cannot identify the language.
- Type: Fix
- Contributor: KamioRinn
- Related: [Issue#2116](https://github.com/RVC-Boss/GPT-SoVITS/issues/2116)
- 2025.02.28 [Commit#c38b1690](https://github.com/RVC-Boss/GPT-SoVITS/commit/c38b16901978c1db79491e16905ea3a37a7cf686), [Commit#a32a2b89](https://github.com/RVC-Boss/GPT-SoVITS/commit/a32a2b893436fad56cc82409121c7fa36a1815d5)
- Content: Added a speaking speed parameter to control synthesis speed.
- Type: Fix
- Contributor: RVC-Boss
- 2025.02.28 **GPT-SoVITS V3 officially released**.
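
The 2025.02.28 entry about rule-based detection for short CJK text can be pictured with a sketch based on Unicode ranges. The rules actually used in PR#2122 are not described in this changelog, so the function name `guess_cjk_language` and the ranges and precedence below are assumptions for illustration only.

```python
def guess_cjk_language(text: str) -> str:
    """Guess "ja", "ko", "zh", or "unknown" from Unicode ranges alone."""
    has_han = False
    for ch in text:
        code = ord(ch)
        if 0x3040 <= code <= 0x30FF:    # Hiragana and Katakana
            return "ja"
        if 0xAC00 <= code <= 0xD7A3:    # Hangul syllables
            return "ko"
        if 0x4E00 <= code <= 0x9FFF:    # CJK Unified Ideographs
            has_han = True
    # Han characters with no kana or Hangul are treated as Chinese here.
    return "zh" if has_han else "unknown"

for sample in ("こんにちは", "안녕", "你好", "hello"):
    print(sample, guess_cjk_language(sample))
```

A rule like this can only serve as a fallback: a short Japanese string written entirely in kanji is indistinguishable from Chinese by character range.
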
## 202503
- 2025.03.31 [PR#2236](https://github.com/RVC-Boss/GPT-SoVITS/pull/2236)
- Content: Fixed issues caused by incorrect dependency versions.
- Type: Fix
- Contributor: XXXXRT666
- Related:
- PyOpenJTalk: [Issue#1131](https://github.com/RVC-Boss/GPT-SoVITS/issues/1131), [Issue#2231](https://github.com/RVC-Boss/GPT-SoVITS/issues/2231), [Issue#2233](https://github.com/RVC-Boss/GPT-SoVITS/issues/2233).
- ONNX: [Issue#492](https://github.com/RVC-Boss/GPT-SoVITS/issues/492), [Issue#671](https://github.com/RVC-Boss/GPT-SoVITS/issues/671), [Issue#1192](https://github.com/RVC-Boss/GPT-SoVITS/issues/1192), [Issue#1819](https://github.com/RVC-Boss/GPT-SoVITS/issues/1819), [Issue#1841](https://github.com/RVC-Boss/GPT-SoVITS/issues/1841).
- Pydantic: [Issue#2230](https://github.com/RVC-Boss/GPT-SoVITS/issues/2230), [Issue#2239](https://github.com/RVC-Boss/GPT-SoVITS/issues/2239).
- PyTorch-Lightning: [Issue#2174](https://github.com/RVC-Boss/GPT-SoVITS/issues/2174).
- 2025.03.31 [PR#2241](https://github.com/RVC-Boss/GPT-SoVITS/pull/2241)
- Content: **Enabled parallel inference for SoVITS v3.**
- Type: New Feature
- Contributor: ChasonJiang
- Fixed other minor bugs.
- Integrated package fixes for ONNX Runtime GPU inference support:
- Type: Fix
- Details:
- The ONNX models in G2PW were switched from CPU to GPU inference, significantly reducing the CPU bottleneck;
- The foxjoy de-reverb model now supports GPU inference.
## 202504 (V4 Release)
- 2025.04.01 [Commit#6a60e5ed](https://github.com/RVC-Boss/GPT-SoVITS/commit/6a60e5edb1817af4a61c7a5b196c0d0f1407668f)
- Content: Unlocked SoVITS v3 parallel inference; fixed the asynchronous model loading logic.
- Type: Fix
- Contributor: RVC-Boss
- 2025.04.07 [PR#2255](https://github.com/RVC-Boss/GPT-SoVITS/pull/2255)
- Content: Formatted the code with Ruff; updated the G2PW link.
- Type: Style
- Contributor: XXXXRT666
- 2025.04.15 [PR#2290](https://github.com/RVC-Boss/GPT-SoVITS/pull/2290)
- Description: Cleaned up documentation; added Python 3.11 support; updated installers.
- Type: Chore
- Contributor: XXXXRT666
- 2025.04.20 [PR#2300](https://github.com/RVC-Boss/GPT-SoVITS/pull/2300)
- Description: Updated Colab, installation files, and model downloads.
- Type: Chore
- Contributor: XXXXRT666
- 2025.04.20 [Commit#e0c452f0](https://github.com/RVC-Boss/GPT-SoVITS/commit/e0c452f0078e8f7eb560b79a54d75573fefa8355)~[Commit#9d481da6](https://github.com/RVC-Boss/GPT-SoVITS/commit/9d481da610aa4b0ef8abf5651fd62800d2b4e8bf)
- Description: **Added the GPT-SoVITS V4 model.**
- Type: Feature
- Contributor: RVC-Boss
- 2025.04.21 [Commit#8b394a15](https://github.com/RVC-Boss/GPT-SoVITS/commit/8b394a15bce8e1d85c0b11172442dbe7a6017ca2)~[Commit#bc2fe5ec](https://github.com/RVC-Boss/GPT-SoVITS/commit/bc2fe5ec86536c77bb3794b4be263ac87e4fdae6), [PR#2307](https://github.com/RVC-Boss/GPT-SoVITS/pull/2307)
- Description: Enabled parallel inference for V4.
- Type: Feature
- Contributor: RVC-Boss, ChasonJiang
- 2025.04.22 [Commit#7405427a](https://github.com/RVC-Boss/GPT-SoVITS/commit/7405427a0ab2a43af63205df401fd6607a408d87)~[Commit#590c83d7](https://github.com/RVC-Boss/GPT-SoVITS/commit/590c83d7667c8d4908f5bdaf2f4c1ba8959d29ff), [PR#2309](https://github.com/RVC-Boss/GPT-SoVITS/pull/2309)
- Description: Fixed model version parameter passing.
- Type: Fix
- Contributor: RVC-Boss, ChasonJiang
- 2025.04.22 [Commit#fbdab94e](https://github.com/RVC-Boss/GPT-SoVITS/commit/fbdab94e17d605d85841af6f94f40a45976dd1d9), [PR#2310](https://github.com/RVC-Boss/GPT-SoVITS/pull/2310)
- Description: Fixed the NumPy and Numba version mismatch; updated the librosa version.
- Type: Fix
- Contributor: RVC-Boss, XXXXRT666
- Related: [Issue#2308](https://github.com/RVC-Boss/GPT-SoVITS/issues/2308)
- **2025.04.22 GPT-SoVITS V4 officially released**.
- 2025.04.22 [PR#2311](https://github.com/RVC-Boss/GPT-SoVITS/pull/2311)
- Description: Updated Gradio parameters.
- Type: Chore
- Contributor: XXXXRT666
- 2025.04.25 [PR#2322](https://github.com/RVC-Boss/GPT-SoVITS/pull/2322)
- Description: Improved the Colab/Kaggle notebook scripts.
- Type: Chore
- Contributor: XXXXRT666
## 202505
- 2025.05.26 [PR#2351](https://github.com/RVC-Boss/GPT-SoVITS/pull/2351)
- Description: Improved the Docker and Windows automated build scripts; added pre-commit formatting.
- Type: Chore
- Contributor: XXXXRT666
- 2025.05.26 [PR#2408](https://github.com/RVC-Boss/GPT-SoVITS/pull/2408)
- Description: Optimized the multilingual text segmentation and recognition logic.
- Type: Fix
- Contributor: KamioRinn
- Related: [Issue#2404](https://github.com/RVC-Boss/GPT-SoVITS/issues/2404)
- 2025.05.26 [PR#2377](https://github.com/RVC-Boss/GPT-SoVITS/pull/2377)
- Description: Applied caching strategies to increase SoVITS V3/V4 inference speed by 10%.
- Type: Performance Optimization
- Contributor: Kakaru Hayate
- 2025.05.26 [Commit#4d9d56b1](https://github.com/RVC-Boss/GPT-SoVITS/commit/4d9d56b19638dc434d6eefd9545e4d8639a3e072), [Commit#8c705784](https://github.com/RVC-Boss/GPT-SoVITS/commit/8c705784c50bf438c7b6d0be33a9e5e3cb90e6b2), [Commit#fafe4e7f](https://github.com/RVC-Boss/GPT-SoVITS/commit/fafe4e7f120fba56c5f053c6db30aa675d5951ba)
- Description: Updated the annotation interface with a notice: click "Submit Text" after finishing each page, otherwise changes will not be saved.
- Type: Fix
- Contributor: RVC-Boss
- 2025.05.29 [Commit#1934fc1e](https://github.com/RVC-Boss/GPT-SoVITS/commit/1934fc1e1b22c4c162bba1bbe7d7ebb132944cdc)
- Description: Fixed errors in the UVR5 and ONNX dereverberation models when FFmpeg encodes MP3/M4A files whose original paths contain spaces.
- Type: Fix
- Contributor: RVC-Boss
## 202506 (V2Pro Series)
- 2025.06.03 [PR#2420](https://github.com/RVC-Boss/GPT-SoVITS/pull/2420)
- Description: Updated the multilingual project documentation.
- Type: Documentation
- Contributor: StaryLan
- 2025.06.04 [PR#2417](https://github.com/RVC-Boss/GPT-SoVITS/pull/2417)
- Description: Added V4 export support with TorchScript.
- Type: Feature
- Contributor: L-jasmine
- 2025.06.04 [Commit#b7c0c5ca](https://github.com/RVC-Boss/GPT-SoVITS/commit/b7c0c5ca878bcdd419fd86bf80dba431a6653356)~[Commit#298ebb03](https://github.com/RVC-Boss/GPT-SoVITS/commit/298ebb03c5a719388527ae6a586c7ea960344e70)
- Description: **Added the GPT-SoVITS V2Pro series models (V2Pro, V2ProPlus).**
- Type: Feature
- Contributor: RVC-Boss
- 2025.06.05 [PR#2426](https://github.com/RVC-Boss/GPT-SoVITS/pull/2426)
- Description: Fixed the `config/inference_webui` initialization error.
- Type: Bug Fix
- Contributor: StaryLan
- 2025.06.05 [PR#2427](https://github.com/RVC-Boss/GPT-SoVITS/pull/2427), [Commit#7d70852a](https://github.com/RVC-Boss/GPT-SoVITS/commit/7d70852a3f67c3b52e3a62857f8663d529efc8cd), [PR#2434](https://github.com/RVC-Boss/GPT-SoVITS/pull/2434)
- Description: Optimized the automatic precision detection logic; added collapsible sections to the WebUI frontend modules.
- Type: Feature
- Contributors: XXXXRT666, RVC-Boss

View File

@@ -0,0 +1,459 @@
<div align="center">
<h1>GPT-SoVITS-WebUI</h1>
A Powerful Few-shot Voice Conversion and Text-to-Speech WebUI.<br><br>
[![madewithlove](https://img.shields.io/badge/made_with-%E2%9D%A4-red?style=for-the-badge&labelColor=orange)](https://github.com/RVC-Boss/GPT-SoVITS)
<a href="https://trendshift.io/repositories/7033" target="_blank"><img src="https://trendshift.io/api/badge/repositories/7033" alt="RVC-Boss%2FGPT-SoVITS | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
[![Python](https://img.shields.io/badge/python-3.10--3.12-blue?style=for-the-badge&logo=python)](https://www.python.org)
[![GitHub release](https://img.shields.io/github/v/release/RVC-Boss/gpt-sovits?style=for-the-badge&logo=github)](https://github.com/RVC-Boss/gpt-sovits/releases)
[![Train In Colab](https://img.shields.io/badge/Colab-Training-F9AB00?style=for-the-badge&logo=googlecolab)](https://colab.research.google.com/github/RVC-Boss/GPT-SoVITS/blob/main/Colab-WebUI.ipynb)
[![Huggingface](https://img.shields.io/badge/免费在线体验-free_online_demo-yellow.svg?style=for-the-badge&logo=huggingface)](https://lj1995-gpt-sovits-proplus.hf.space/)
[![Image Size](https://img.shields.io/docker/image-size/xxxxrt666/gpt-sovits/latest?style=for-the-badge&logo=docker)](https://hub.docker.com/r/xxxxrt666/gpt-sovits)
[![简体中文](https://img.shields.io/badge/简体中文-阅读文档-blue?style=for-the-badge&logo=googledocs&logoColor=white)](https://www.yuque.com/baicaigongchang1145haoyuangong/ib3g1e)
[![English](https://img.shields.io/badge/English-Read%20Docs-blue?style=for-the-badge&logo=googledocs&logoColor=white)](https://rentry.co/GPT-SoVITS-guide#/)
[![Change Log](https://img.shields.io/badge/Change%20Log-View%20Updates-blue?style=for-the-badge&logo=googledocs&logoColor=white)](https://github.com/RVC-Boss/GPT-SoVITS/blob/main/docs/en/Changelog_EN.md)
[![License](https://img.shields.io/badge/LICENSE-MIT-green.svg?style=for-the-badge&logo=opensourceinitiative)](https://github.com/RVC-Boss/GPT-SoVITS/blob/main/LICENSE)
[**English**](../../README.md) | [**中文简体**](../cn/README.md) | [**日本語**](../ja/README.md) | [**한국어**](../ko/README.md) | **Türkçe**
</div>
---
## Features:
1. **Zero-shot TTS:** Input a 5-second vocal sample and experience instant text-to-speech conversion.
2. **Few-shot TTS:** Fine-tune the model with just 1 minute of training data for improved voice similarity and realism.
3. **Cross-lingual Support:** Inference in languages different from the training dataset, currently supporting English, Japanese, Chinese, Cantonese, and Korean.
4. **WebUI Tools:** Integrated tools include vocal/accompaniment separation, automatic training set segmentation, Chinese ASR, and text labeling, helping beginners create training datasets and GPT/SoVITS models.
**Check out our [demo video](https://www.bilibili.com/video/BV12g4y1m7Uw) here!**
Few-shot fine-tuning demo with unseen speakers:
https://github.com/RVC-Boss/GPT-SoVITS/assets/129054828/05bee1fa-bdd8-4d85-9350-80c060ab47fb
**User Guide: [简体中文](https://www.yuque.com/baicaigongchang1145haoyuangong/ib3g1e) | [English](https://rentry.co/GPT-SoVITS-guide#/)**
## Installation
### Tested Environments
| Python Version | PyTorch Version | Device |
| -------------- | ---------------- | ------------- |
| Python 3.10 | PyTorch 2.5.1 | CUDA 12.4 |
| Python 3.11 | PyTorch 2.5.1 | CUDA 12.4 |
| Python 3.11 | PyTorch 2.7.0 | CUDA 12.8 |
| Python 3.9 | PyTorch 2.8.0dev | CUDA 12.8 |
| Python 3.9 | PyTorch 2.5.1 | Apple silicon |
| Python 3.11 | PyTorch 2.7.0 | Apple silicon |
| Python 3.9 | PyTorch 2.2.2 | CPU |
### Windows
If you are a Windows user (tested with win>=10), you can [download the integrated package](https://huggingface.co/lj1995/GPT-SoVITS-windows-package/resolve/main/GPT-SoVITS-v3lora-20250228.7z?download=true) and double-click _go-webui.bat_ to start GPT-SoVITS-WebUI.
To install with the install script instead, run the following commands:
```pwsh
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
pwsh -F install.ps1 --Device <CU126|CU128|CPU> --Source <HF|HF-Mirror|ModelScope> [--DownloadUVR5]
```
### Linux
```bash
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
bash install.sh --device <CU126|CU128|ROCM|CPU> --source <HF|HF-Mirror|ModelScope> [--download-uvr5]
```
### macOS
**Note: Models trained with GPUs on Macs produce significantly lower quality than those trained on other devices, so we are temporarily using CPUs instead.**
Install the program by running the following commands:
```bash
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
bash install.sh --device <MPS|CPU> --source <HF|HF-Mirror|ModelScope> [--download-uvr5]
```
### Install Manually
#### Install Dependencies
```bash
conda create -n GPTSoVits python=3.10
conda activate GPTSoVits
pip install -r extra-req.txt --no-deps
pip install -r requirements.txt
```
#### Install FFmpeg
##### Conda Users
```bash
conda activate GPTSoVits
conda install ffmpeg
```
##### Ubuntu/Debian Users
```bash
sudo apt install ffmpeg
sudo apt install libsox-dev
```
##### Windows Users
Download [ffmpeg.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffmpeg.exe) and [ffprobe.exe](https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffprobe.exe) and place them in the GPT-SoVITS root directory.
Install the [Visual Studio 2017](https://aka.ms/vs/17/release/vc_redist.x86.exe) environment.
##### MacOS Users
```bash
brew install ffmpeg
```
### Running GPT-SoVITS with Docker
#### Docker Image Selection
Because the codebase evolves quickly while Docker images are published more slowly, please:
- Check [Docker Hub](https://hub.docker.com/r/xxxxrt666/gpt-sovits) for the latest available image tags
- Choose an image tag that suits your environment
- `Lite` means the Docker image does **not include** the ASR models and UVR5 models. You can download the UVR5 models manually; the ASR models are downloaded automatically by the program when needed
- During Docker Compose, the image for the matching architecture (amd64 or arm64) is pulled automatically
- Docker Compose will mount **all files** in the current directory. Before using the Docker image, please switch to the project root directory and **pull the latest code**
- Optional: to get the latest changes, you can build the image locally with the provided Dockerfile
#### Environment Variables
- `is_half`: Controls whether half precision (fp16) is used. If your GPU supports it, set it to `true` to reduce memory usage.
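As a rough illustration of how a boolean flag such as `is_half` can be read from the environment, here is a minimal Python sketch. It is not the project's actual parsing code: the helper name `read_is_half`, the accepted spellings, and the `false` default are all assumptions made for this example.

```python
import os


def read_is_half(env=None, default="false"):
    """Interpret the `is_half` environment variable as a boolean.

    Hypothetical helper: the accepted spellings and the default value
    are assumptions, not the behavior of GPT-SoVITS itself.
    """
    env = os.environ if env is None else env
    value = env.get("is_half", default)
    return value.strip().lower() in {"1", "true", "yes"}


if __name__ == "__main__":
    print("half precision enabled:", read_is_half())
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to check without touching the real environment.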
#### Shared Memory Configuration
On Windows (Docker Desktop), the default shared memory size is small and may cause unexpected errors. It is recommended to increase `shm_size` in the Docker Compose file (for example, to `16g`) according to your system memory.
#### Service Selection
The `docker-compose.yaml` file defines two types of services:
- `GPT-SoVITS-CU126` and `GPT-SoVITS-CU128`: full version with all features.
- `GPT-SoVITS-CU126-Lite` and `GPT-SoVITS-CU128-Lite`: lightweight version with fewer dependencies and limited functionality.
To run a specific service with Docker Compose, use:
```bash
docker compose run --service-ports <GPT-SoVITS-CU126-Lite|GPT-SoVITS-CU128-Lite|GPT-SoVITS-CU126|GPT-SoVITS-CU128>
```
#### Building the Docker Image Locally
If you want to build the Docker image yourself, use:
```bash
bash docker_build.sh --cuda <12.6|12.8> [--lite]
```
#### Accessing the Running Container (Bash Shell)
Once the container is running in the background, you can enter it with:
```bash
docker exec -it <GPT-SoVITS-CU126-Lite|GPT-SoVITS-CU128-Lite|GPT-SoVITS-CU126|GPT-SoVITS-CU128> bash
```
## Pretrained Models
**If `install.sh` runs successfully, you may skip steps 1, 2, and 3.**
1. Download pretrained models from [GPT-SoVITS Models](https://huggingface.co/lj1995/GPT-SoVITS) and place them in `GPT_SoVITS/pretrained_models`.
2. Download the G2PW model from [G2PWModel.zip(HF)](https://huggingface.co/XXXXRT/GPT-SoVITS-Pretrained/resolve/main/G2PWModel.zip)| [G2PWModel.zip(ModelScope)](https://www.modelscope.cn/models/XXXXRT/GPT-SoVITS-Pretrained/resolve/master/G2PWModel.zip), unzip it and rename it to `G2PWModel`, then place it in `GPT_SoVITS/text`. (Chinese TTS only)
3. For UVR5 (vocal/instrumental separation and reverberation removal), download models from [UVR5 Weights](https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main/uvr5_weights) and place them in `tools/uvr5/uvr5_weights`.
- If you use the bs_roformer or mel_band_roformer models for UVR5, you can manually download the model and its configuration file and put them in `tools/uvr5/uvr5_weights`. **The model file and the configuration file must have the same name apart from the extension**. In addition, both names must contain **"roformer"** so that the model is recognized as a roformer-class model.
- It is recommended to **state the model type directly** in the model and configuration file names, for example mel_band_roformer or bs_roformer. If it is not stated, the model type is determined by comparing features from the configuration file. For example, the model `bs_roformer_ep_368_sdr_12.9628.ckpt` and its configuration file `bs_roformer_ep_368_sdr_12.9628.yaml` form a pair, as do `kim_mel_band_roformer.ckpt` and `kim_mel_band_roformer.yaml`.
4. For Chinese ASR, download models from [Damo ASR Model](https://modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/files), [Damo VAD Model](https://modelscope.cn/models/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/files), and [Damo Punc Model](https://modelscope.cn/models/damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/files) and place them in `tools/asr/models`.
5. For English or Japanese ASR, download the model from [Faster Whisper Large V3](https://huggingface.co/Systran/faster-whisper-large-v3) and place it in `tools/asr/models`. [Other models](https://huggingface.co/Systran) may give similar results with a smaller disk footprint.
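The roformer naming rule from step 3 can be checked before copying files into the weights folder. The sketch below is only an illustration of that rule: `check_roformer_pair` is a hypothetical helper, not part of GPT-SoVITS, and it does not reproduce the fallback that inspects the configuration file when no type is stated in the name.

```python
from pathlib import Path


def check_roformer_pair(model_name, config_name):
    """Return (is_valid_pair, model_type) for a model/config file pair.

    A pair is valid when both names match apart from the extension and
    contain "roformer". model_type is "mel_band_roformer", "bs_roformer",
    or None when the name does not state the type.
    """
    model_stem = Path(model_name).stem.lower()
    config_stem = Path(config_name).stem.lower()
    valid = model_stem == config_stem and "roformer" in model_stem
    model_type = None
    for candidate in ("mel_band_roformer", "bs_roformer"):
        if candidate in model_stem:
            model_type = candidate
            break
    return valid, model_type


if __name__ == "__main__":
    print(check_roformer_pair(
        "bs_roformer_ep_368_sdr_12.9628.ckpt",
        "bs_roformer_ep_368_sdr_12.9628.yaml",
    ))
```

`Path.stem` strips only the final extension, so the `12.9628` inside the example file name stays part of the compared name.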
## Dataset Format
The TTS annotation .list file format:
```
vocal_path|speaker_name|language|text
```
Language dictionary:
- 'zh': Chinese
- 'ja': Japanese
- 'en': English
- 'ko': Korean
- 'yue': Cantonese
Example:
```
D:\GPT-SoVITS\xxx/xxx.wav|xxx|en|I like playing Genshin.
```
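To make the `.list` format concrete, here is a minimal Python sketch that parses one annotation line into its four fields. `parse_list_line` and `Annotation` are hypothetical names for this example; limiting the split to three separators, so that a `|` inside the text survives, is an assumption rather than documented behavior.

```python
from typing import NamedTuple

# Language codes taken from the language dictionary above.
LANGUAGES = {"zh", "ja", "en", "ko", "yue"}


class Annotation(NamedTuple):
    vocal_path: str
    speaker_name: str
    language: str
    text: str


def parse_list_line(line):
    """Split one `vocal_path|speaker_name|language|text` line into fields."""
    parts = line.rstrip("\r\n").split("|", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}: {line!r}")
    annotation = Annotation(*parts)
    if annotation.language not in LANGUAGES:
        raise ValueError(f"unknown language code: {annotation.language!r}")
    return annotation


if __name__ == "__main__":
    example = r"D:\GPT-SoVITS\xxx/xxx.wav|xxx|en|I like playing Genshin."
    print(parse_list_line(example))
```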
## Fine-tuning and Inference
### Open the WebUI
#### Integrated Package Users
Double-click `go-webui.bat` or use `go-webui.ps1`.
To switch to V1, double-click `go-webui-v1.bat` or use `go-webui-v1.ps1`.
#### Others
```bash
python webui.py <language(optional)>
```
To switch to V1,
```bash
python webui.py v1 <language(optional)>
```
or switch the version manually in the WebUI.
### Fine-tuning
#### Path auto-filling is now supported
1. Fill in the audio path
2. Slice the audio into small chunks
3. Denoise (optional)
4. ASR
5. Proofread the ASR transcriptions
6. Go to the next tab and fine-tune the model
### Open the Inference WebUI
#### Integrated Package Users
Double-click `go-webui-v2.bat` or use `go-webui-v2.ps1`, then open the inference WebUI at `1-GPT-SoVITS-TTS/1C-inference`.
#### Others
```bash
python GPT_SoVITS/inference_webui.py <language(optional)>
```
OR
```bash
python webui.py
```
then open the inference WebUI at `1-GPT-SoVITS-TTS/1C-inference`.
## V2 Release Notes
New Features:
1. Support for Korean and Cantonese
2. An optimized text frontend
3. Pretrained model extended from 2k hours to 5k hours
4. Improved synthesis quality for low-quality reference audio
[more details](<https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v2%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)>)
To use V2 from a V1 environment:
1. Run `pip install -r requirements.txt` to update some packages.
2. Clone the latest code from GitHub.
3. Download the V2 pretrained models from [huggingface](https://huggingface.co/lj1995/GPT-SoVITS/tree/main/gsv-v2final-pretrained) and put them into `GPT_SoVITS/pretrained_models/gsv-v2final-pretrained`.
Additionally, for Chinese V2: [G2PWModel.zip(HF)](https://huggingface.co/XXXXRT/GPT-SoVITS-Pretrained/resolve/main/G2PWModel.zip)| [G2PWModel.zip(ModelScope)](https://www.modelscope.cn/models/XXXXRT/GPT-SoVITS-Pretrained/resolve/master/G2PWModel.zip) (Download the G2PW model, unzip it, rename it to `G2PWModel`, and place it in `GPT_SoVITS/text`.)
## V3 Release Notes
New Features:
1. **Timbre similarity** is higher, so less training data is needed to approximate the target speaker (timbre similarity is significantly improved even when the base model is used directly, without fine-tuning).
2. The GPT model is more **stable**, with fewer repetitions and omissions, and it is easier to generate speech with **richer emotional expression**.
[more details](<https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v3%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)>)
To use V3 from a V2 environment:
1. Run `pip install -r requirements.txt` to update some packages.
2. Clone the latest code from GitHub.
3. Download the V3 pretrained models (s1v3.ckpt, s2Gv3.pth, and the models--nvidia--bigvgan_v2_24khz_100band_256x folder) from [huggingface](https://huggingface.co/lj1995/GPT-SoVITS/tree/main) and put them into `GPT_SoVITS/pretrained_models`.
Additional: for the audio super-resolution model, see [how to download it](../../tools/AP_BWE_main/24kto48k/readme.txt).
## V4 Release Notes
New Features:
1. **V4 fixes the metallic artifacts in V3 caused by non-integer upsampling, and outputs 48kHz audio directly to prevent muffled sound (V3 only supports 24kHz)**. The author has stated that V4 can replace V3, but more testing is still needed.
[more details](<https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v3v4%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)>)
To move to V4 from a V1/V2/V3 environment:
1. Run `pip install -r requirements.txt` to update some dependencies.
2. Clone the latest code from GitHub.
3. Download the V4 pretrained models (`gsv-v4-pretrained/s2v4.ckpt` and `gsv-v4-pretrained/vocoder.pth`) from [huggingface](https://huggingface.co/lj1995/GPT-SoVITS/tree/main) and put them into `GPT_SoVITS/pretrained_models`.
## V2Pro Release Notes
New Features:
1. **Slightly higher VRAM usage than V2 but better performance than V4, while keeping the same hardware cost and speed advantage as V2**.
[more details](<https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90features-(%E5%90%84%E7%89%88%E6%9C%AC%E7%89%B9%E6%80%A7)>)
2. V1/V2 and the V2Pro series share similar characteristics, and V3/V4 behave alike. With training sets of low average quality, V1/V2/V2Pro can still give good results, but V3/V4 cannot. In addition, the timbre produced by V3/V4 leans toward the reference audio rather than toward the overall training set.
To move to V2Pro from a V1/V2/V3/V4 environment:
1. Run `pip install -r requirements.txt` to update some dependencies.
2. Clone the latest code from GitHub.
3. Download the V2Pro pretrained models (`v2Pro/s2Dv2Pro.pth`, `v2Pro/s2Gv2Pro.pth`, `v2Pro/s2Dv2ProPlus.pth`, `v2Pro/s2Gv2ProPlus.pth`, and `sv/pretrained_eres2netv2w24s4ep4.ckpt`) from [huggingface](https://huggingface.co/lj1995/GPT-SoVITS/tree/main) and put them into `GPT_SoVITS/pretrained_models`.
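After downloading, a quick check that the V2Pro files are in place can save a failed start. The following Python sketch lists whichever expected files are missing. The helper `missing_pretrained` is hypothetical, and it assumes the `v2Pro/` and `sv/` sub-folders are kept as-is under `GPT_SoVITS/pretrained_models`; adjust the paths if your layout differs.

```python
from pathlib import Path

# File names taken from step 3 above; keeping the sub-folders is an assumption.
V2PRO_FILES = [
    "v2Pro/s2Dv2Pro.pth",
    "v2Pro/s2Gv2Pro.pth",
    "v2Pro/s2Dv2ProPlus.pth",
    "v2Pro/s2Gv2ProPlus.pth",
    "sv/pretrained_eres2netv2w24s4ep4.ckpt",
]


def missing_pretrained(root="GPT_SoVITS/pretrained_models", expected=V2PRO_FILES):
    """Return the expected files that do not exist under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_pretrained()
    if missing:
        print("Missing pretrained files:")
        for name in missing:
            print("  " + name)
    else:
        print("All V2Pro pretrained files found.")
```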
## Todo List
- [x] **High Priority:**
- [x] Localization in Japanese and English.
- [x] User guide.
- [x] Japanese and English dataset fine-tune training.
- [ ] **Features:**
- [x] Zero-shot voice conversion (5s) / few-shot voice conversion (1min).
- [x] TTS speaking speed control.
- [ ] ~~Enhanced TTS emotion control.~~
- [ ] Experiment with changing SoVITS token inputs to a probability distribution over the vocabulary.
- [x] Improve the English and Japanese text frontend.
- [ ] Develop tiny and larger-sized TTS models.
- [x] Colab scripts.
- [ ] Try expanding the training dataset (2k hours -> 10k hours).
- [x] Better SoVITS base model (enhanced audio quality).
- [ ] Model mix.
## (Additional) Running from the Command Line
Use the command line to open the WebUI for UVR5
```bash
python tools/uvr5/webui.py "<infer_device>" <is_half> <webui_port_uvr5>
```
<!-- If you can't open a browser, follow the format below for UVR processing. This uses mdxnet for audio processing
```
python mdxnet.py --model --input_root --output_vocal --output_ins --agg_level --format --device --is_half_precision
``` -->
This is how the audio segmentation of the dataset is done using the command line
```bash
python audio_slicer.py \
--input_path "<path_to_original_audio_file_or_directory>" \
--output_root "<directory_where_subdivided_audio_clips_will_be_saved>" \
--threshold <volume_threshold> \
--min_length <minimum_duration_of_each_subclip> \
--min_interval <shortest_time_gap_between_adjacent_subclips> \
--hop_size <step_size_for_computing_volume_curve>
```
This is how dataset ASR processing is done using the command line (Chinese only)
```bash
python tools/asr/funasr_asr.py -i <input> -o <output>
```
ASR processing is performed through Faster_Whisper (ASR labeling for languages other than Chinese)
(No progress bars; GPU performance may cause time delays)
```bash
python ./tools/asr/fasterwhisper_asr.py -i <input> -o <output> -l <language>
```
A custom list save path is enabled
## Credits
Special thanks to the following projects and contributors:
### Theoretical Research
- [ar-vits](https://github.com/innnky/ar-vits)
- [SoundStorm](https://github.com/yangdongchao/SoundStorm/tree/master/soundstorm/s1/AR)
- [vits](https://github.com/jaywalnut310/vits)
- [TransferTTS](https://github.com/hcy71o/TransferTTS/blob/master/models.py#L556)
- [contentvec](https://github.com/auspicious3000/contentvec/)
- [hifi-gan](https://github.com/jik876/hifi-gan)
- [fish-speech](https://github.com/fishaudio/fish-speech/blob/main/tools/llama/generate.py#L41)
- [f5-TTS](https://github.com/SWivid/F5-TTS/blob/main/src/f5_tts/model/backbones/dit.py)
- [shortcut flow matching](https://github.com/kvfrans/shortcut-models/blob/main/targets_shortcut.py)
### Pretrained Models
- [Chinese Speech Pretrain](https://github.com/TencentGameMate/chinese_speech_pretrain)
- [Chinese-Roberta-WWM-Ext-Large](https://huggingface.co/hfl/chinese-roberta-wwm-ext-large)
- [BigVGAN](https://github.com/NVIDIA/BigVGAN)
- [eresnetv2](https://modelscope.cn/models/iic/speech_eres2netv2w24s4ep4_sv_zh-cn_16k-common)
### Text Frontend for Inference
- [paddlespeech zh_normalization](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/paddlespeech/t2s/frontend/zh_normalization)
- [split-lang](https://github.com/DoodleBears/split-lang)
- [g2pW](https://github.com/GitYCC/g2pW)
- [pypinyin-g2pW](https://github.com/mozillazg/pypinyin-g2pW)
- [paddlespeech g2pw](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/paddlespeech/t2s/frontend/g2pw)
### WebUI Tools
- [ultimatevocalremovergui](https://github.com/Anjok07/ultimatevocalremovergui)
- [audio-slicer](https://github.com/openvpi/audio-slicer)
- [SubFix](https://github.com/cronrpc/SubFix)
- [FFmpeg](https://github.com/FFmpeg/FFmpeg)
- [gradio](https://github.com/gradio-app/gradio)
- [faster-whisper](https://github.com/SYSTRAN/faster-whisper)
- [FunASR](https://github.com/alibaba-damo-academy/FunASR)
- [AP-BWE](https://github.com/yxlu-0102/AP-BWE)
Thanks to @Naozumi520 for providing the Cantonese training set and for guidance on Cantonese-related knowledge.
## Thanks to all contributors for their efforts
<a href="https://github.com/RVC-Boss/GPT-SoVITS/graphs/contributors" target="_blank">
<img src="https://contrib.rocks/image?repo=RVC-Boss/GPT-SoVITS" />
</a>