Initialize project; model provided by the ModelHub XC community

Model: KandirResearch/DarijaTTS-v0.1-500M
Source: Original Platform
ModelHub XC
2026-05-13 21:35:39 +08:00
commit 0bf626d2f0
13 changed files with 207988 additions and 0 deletions

37
.gitattributes vendored Normal file

@@ -0,0 +1,37 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
unsloth.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text

89
README.md Normal file

@@ -0,0 +1,89 @@
---
library_name: transformers
tags:
- unsloth
- trl
- sft
- tts
license: apache-2.0
language:
- ary
datasets:
- KandirResearch/DarijaTTS-clean
base_model:
- OuteAI/OuteTTS-0.2-500M
pipeline_tag: text-to-speech
---
# Moroccan Darija TTS
This is a text-to-speech (TTS) model for Moroccan Darija, fine-tuned from [OuteAI/OuteTTS-0.2-500M](https://huggingface.co/OuteAI/OuteTTS-0.2-500M) on the [KandirResearch/DarijaTTS-clean](https://huggingface.co/datasets/KandirResearch/DarijaTTS-clean) dataset.
## Model Details
- **Base Model:** [OuteAI/OuteTTS-0.2-500M](https://huggingface.co/OuteAI/OuteTTS-0.2-500M)
- **Dataset:** [KandirResearch/DarijaTTS-clean](https://huggingface.co/datasets/KandirResearch/DarijaTTS-clean)
- **Training Method:** Fine-tuned with Unsloth using TRL's `SFTTrainer`
- **Dataset Preparation:** Preprocessed following [OuteTTS training guide](https://github.com/edwko/OuteTTS/blob/main/examples/training/OuteTTS-0.3/train.md)
- **Demo:** [Try it here](https://huggingface.co/spaces/Lyte/DarijaTTS-test)
## Usage
> [!IMPORTANT]
> **Compatibility Note**
> Recent updates to `outetts` have introduced breaking changes. If you encounter the error:
> `AttributeError: module 'outetts' has no attribute 'GGUFModelConfig_v2'`
>
> **Solution:** Please install a compatible version (0.3.3 or 0.3.2) to resolve this:
> ```bash
> pip install outetts==0.3.3
> ```
You can run the model with `outetts`. First install `outetts`, `llama-cpp-python`, and `huggingface_hub`:
```bash
pip install outetts==0.3.3 llama-cpp-python huggingface_hub
```
```python
import outetts
from outetts.models.config import GenerationConfig
from huggingface_hub import hf_hub_download

# Download the quantized GGUF weights from the Hub.
model_path = hf_hub_download(
    repo_id="KandirResearch/DarijaTTS-v0.1-500M",
    filename="unsloth.Q8_0.gguf",
)

# Point outetts at the GGUF file and the tokenizer stored in the same repo.
model_config = outetts.GGUFModelConfig_v2(
    model_path=model_path,
    tokenizer_path="KandirResearch/DarijaTTS-v0.1-500M",
)
interface = outetts.InterfaceGGUF(model_version="0.3", cfg=model_config)


def tts(text, temperature=0.3, repetition_penalty=1.1):
    """Synthesize `text` and save the audio to output.wav."""
    gen_cfg = GenerationConfig(
        text=text,
        temperature=temperature,
        repetition_penalty=repetition_penalty,
        max_length=4096,
    )
    output = interface.generate(config=gen_cfg)
    output_path = "output.wav"
    output.save(output_path)
    return output_path


# Example usage
audio_path = tts("السلام كيداير لاباس عليك؟")
print(f"Generated audio saved at: {audio_path}")
```
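If you want to confirm that the download is intact, one option is to compare the file against the size and SHA-256 recorded in this repository's Git LFS pointer for `unsloth.Q8_0.gguf`. The sketch below is only an illustration: `sha256_of` and `matches_pointer` are hypothetical helper names, not part of `outetts` or `huggingface_hub`.

```python
import hashlib
from pathlib import Path

# Values copied from the Git LFS pointer for unsloth.Q8_0.gguf in this repository.
EXPECTED_SHA256 = "239712dc9e590fb048779c609549ddfad3015faccc62c9c30eb053224ca898e4"
EXPECTED_SIZE = 536666752  # bytes


def sha256_of(path, chunk_size=1 << 20):
    """Hash the file in chunks so the whole model never has to sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pointer(path, expected_sha256=EXPECTED_SHA256, expected_size=EXPECTED_SIZE):
    """Return True only if both the byte size and the SHA-256 match."""
    path = Path(path)
    if path.stat().st_size != expected_size:
        return False  # cheap check first; skips hashing an obviously wrong file
    return sha256_of(path) == expected_sha256
```

Calling `matches_pointer(model_path)` with the `model_path` from the snippet above should return `True` for an uncorrupted download, assuming the file on the Hub is the same one referenced by this commit.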
## Training
The model was fine-tuned with Unsloth using TRL's `SFTTrainer`. The dataset was preprocessed following the [OuteTTS training guide](https://github.com/edwko/OuteTTS/blob/main/examples/training/OuteTTS-0.3/train.md). LoRA-based fine-tuning was applied so that only a small set of adapter weights had to be trained.
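To give a rough sense of why LoRA is cheaper, the sketch below counts the weights a single adapter adds to one square projection of width `hidden_size` (896 in `config.json`). The rank of 16 is a hypothetical value chosen for illustration; the rank and target modules actually used for this model are not stated here.

```python
def lora_extra_params(d_in, d_out, rank):
    """Weights added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


hidden_size = 896  # from config.json
rank = 16          # hypothetical; not taken from the actual training run

full = hidden_size * hidden_size  # weights in one 896 x 896 projection
extra = lora_extra_params(hidden_size, hidden_size, rank)

print(f"full: {full}, LoRA: {extra}, ratio: {extra / full:.2%}")
```

Under these assumptions the adapter trains a few percent of the weights of the layer it wraps, while the original weights stay frozen.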
## Support Me
[Support me on Ko-fi](https://ko-fi.com/lyte)
-----
For any issues or improvements, feel free to open a discussion or PR!

5132
added_tokens.json Normal file

File diff suppressed because it is too large

29
config.json Normal file

@@ -0,0 +1,29 @@
{
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 896,
  "initializer_range": 0.02,
  "intermediate_size": 4864,
  "max_position_embeddings": 32768,
  "max_window_layers": 24,
  "model_type": "qwen2",
  "num_attention_heads": 14,
  "num_hidden_layers": 24,
  "num_key_value_heads": 2,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 1000000.0,
  "sliding_window": null,
  "tie_word_embeddings": true,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.46.2",
  "use_cache": true,
  "use_mrope": false,
  "use_sliding_window": false,
  "vocab_size": 157696
}
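
A few quantities follow directly from the fields above. The sketch below derives them from a subset of the config; it assumes the per-head width is `hidden_size / num_attention_heads`, which is a common convention for this architecture but is not stated explicitly in the file.

```python
import json

# Subset of the fields from config.json in this commit.
config = json.loads("""
{
  "hidden_size": 896,
  "num_attention_heads": 14,
  "num_hidden_layers": 24,
  "num_key_value_heads": 2,
  "tie_word_embeddings": true,
  "vocab_size": 157696
}
""")

# Assumption: per-head width is hidden_size / num_attention_heads.
head_dim = config["hidden_size"] // config["num_attention_heads"]

# Grouped-query attention: several query heads share one key/value head.
queries_per_kv_head = config["num_attention_heads"] // config["num_key_value_heads"]

# Values cached per token: keys and values, for every layer and KV head.
kv_values_per_token = (
    2 * config["num_hidden_layers"] * config["num_key_value_heads"] * head_dim
)

# Embedding matrix size; with tie_word_embeddings the output head reuses it.
embedding_params = config["vocab_size"] * config["hidden_size"]

print(head_dim, queries_per_kv_head, kv_values_per_token, embedding_params)
```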

7
generation_config.json Normal file

@@ -0,0 +1,7 @@
{
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "max_length": 32768,
  "pad_token_id": 151643,
  "transformers_version": "4.49.0"
}

151388
merges.txt Normal file

File diff suppressed because it is too large

3
model.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7b90a66c14f18a54cb8e82b13960510fc8d4ac935bbddc8decef62539e15d8cb
size 998419456

3
pytorch_model.bin Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:64f5d8239d404122e7574dd2ac878787f98f99af0f67c696b91e8ca7c7af0bf6
size 998484818

5126
special_tokens_map.json Normal file

File diff suppressed because it is too large

3
tokenizer.json Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c78b9d9bf7452c5b619ed0e4d1fc50f59e8ccb46d1ca148fa378c7ed008d9744
size 12367805

46167
tokenizer_config.json Normal file

File diff suppressed because it is too large

3
unsloth.Q8_0.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:239712dc9e590fb048779c609549ddfad3015faccc62c9c30eb053224ca898e4
size 536666752

1
vocab.json Normal file

File diff suppressed because one or more lines are too long