Initialize project; model provided by the ModelHub XC community
Model: mesolitica/Malaysian-orpheus-3b-0.1-ft Source: Original Platform
105
README.md
Normal file
---
library_name: transformers
base_model:
- canopylabs/orpheus-3b-0.1-ft
---

# Malaysian canopylabs/orpheus-3b-0.1-ft

A fine-tune of [canopylabs/orpheus-3b-0.1-ft](https://huggingface.co/canopylabs/orpheus-3b-0.1-ft) on standard Malay with a small amount of Mandarin.

## Training session

Fine-tuned on [Mesolitica/TTS](https://huggingface.co/datasets/mesolitica/TTS) so that the model can generate Malay speech, with minimal Mandarin.

## How we train

1. LoRA on `["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "embed_tokens", "lm_head"]`.
2. Rank 128 with alpha 256 (a scaling factor of alpha / rank = 2.0), but during merging we use a ratio of 1.5.
3. Multipacking with proper SDPA causal masking to prevent cross-document contamination and to keep the position IDs correct.
4. Chunked cross-entropy (CE) loss to reduce memory usage.

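The numbers in steps 1 and 2 can be written down as a small sketch. The key names mirror the `r`, `lora_alpha`, and `target_modules` arguments of `peft.LoraConfig`, but the README does not say which LoRA library was used, and the reading of the "1.5 ratio" as a replacement for the alpha / rank scale at merge time is an assumption. `merge_weight` and the toy matrices are hypothetical and only illustrate the arithmetic.

```python
# Hyperparameters from the list above, using peft.LoraConfig-style key names.
# Assumption: the README does not state which LoRA library was used.
lora_hparams = {
    "r": 128,
    "lora_alpha": 256,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
        "embed_tokens", "lm_head",
    ],
}

# Standard LoRA scales the low-rank update (B @ A) by alpha / r.
train_scale = lora_hparams["lora_alpha"] / lora_hparams["r"]

# Assumption: the "1.5 ratio" replaces that scale when the adapter is merged,
# i.e. W_merged = W + merge_scale * (B @ A).
merge_scale = 1.5


def merge_weight(w, delta, scale):
    """Add a scaled LoRA update `delta` (nested lists) to the weight `w`."""
    return [
        [w_value + scale * d_value for w_value, d_value in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


w = [[1.0, 0.0], [0.0, 1.0]]
delta = [[0.5, 0.0], [0.0, -0.5]]  # toy update, not real model weights
merged = merge_weight(w, delta, merge_scale)
print(train_scale, merged)
```
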
Wandb at https://wandb.ai/huseinzol05/malay-orpheus-3b-0.1-ft-lora-128/workspace?nw=nwuserhuseinzol05
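
To illustrate step 3, the sketch below packs several documents into one sequence and builds the two things multipacking needs: position IDs that restart at 0 for every document, and a block-diagonal causal mask so that tokens never attend across document boundaries. It is a pure-Python illustration, not the training code; `pack_documents` is a hypothetical helper.

```python
def pack_documents(doc_lengths):
    """Return position IDs and a block-diagonal causal mask for packed docs."""
    position_ids = []
    doc_ids = []
    for doc_index, length in enumerate(doc_lengths):
        position_ids.extend(range(length))    # positions restart per document
        doc_ids.extend([doc_index] * length)  # remember each token's document

    total = len(doc_ids)
    # mask[q][k] is True when query token q may attend to key token k:
    # both tokens are in the same document and k is not in the future.
    mask = [
        [doc_ids[q] == doc_ids[k] and k <= q for k in range(total)]
        for q in range(total)
    ]
    return position_ids, mask


# Two documents of 3 and 2 tokens packed into one sequence of 5 tokens.
position_ids, mask = pack_documents([3, 2])
print(position_ids)
for row in mask:
    print(["x" if allowed else "." for allowed in row])
```

A boolean mask of this shape, converted to a tensor, follows the convention of `torch.nn.functional.scaled_dot_product_attention`, where `True` marks positions that take part in attention.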
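The idea behind step 4 can be sketched as follows: instead of projecting every token to the full vocabulary at once, the logits are computed and reduced one chunk of tokens at a time, so only `chunk_size x vocab` logits exist at any moment. This is a pure-Python toy with a made-up 3-token vocabulary; the function names and the toy `project` head are hypothetical, and a real implementation must also avoid storing the per-chunk logits for the backward pass.

```python
import math


def cross_entropy_sum(logits_rows, targets):
    """Sum of -log softmax(logits)[target] over the given rows."""
    total = 0.0
    for logits, target in zip(logits_rows, targets):
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum_exp = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits)
        )
        total += log_sum_exp - logits[target]
    return total


def chunked_cross_entropy(hidden_rows, targets, project, chunk_size):
    """Mean CE loss with logits materialised one chunk of tokens at a time."""
    total = 0.0
    for start in range(0, len(hidden_rows), chunk_size):
        chunk = hidden_rows[start:start + chunk_size]
        logits = [project(hidden) for hidden in chunk]  # chunk-sized logits only
        total += cross_entropy_sum(logits, targets[start:start + chunk_size])
    return total / len(hidden_rows)


# Toy output head: 2-dimensional hidden state -> 3-token vocabulary.
head_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]


def project(hidden):
    return [sum(w * h for w, h in zip(row, hidden)) for row in head_weights]


hidden_states = [[0.5, -1.0], [2.0, 0.3], [0.0, 0.0], [-1.0, 1.5], [0.7, 0.7]]
targets = [0, 2, 1, 1, 2]

full_loss = chunked_cross_entropy(hidden_states, targets, project, chunk_size=5)
chunked_loss = chunked_cross_entropy(hidden_states, targets, project, chunk_size=2)
print(full_loss, chunked_loss)  # the two values agree up to rounding
```
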
## Example

Load the model and generate speech:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from snac import SNAC
import torch
import IPython.display as ipd

def redistribute_codes(row):
    # Keep only complete 7-token frames.
    row_length = row.size(0)
    new_length = (row_length // 7) * 7
    trimmed_row = row[:new_length]
    # Shift token IDs so that audio codes start at 0.
    code_list = [t - 128266 for t in trimmed_row]
    layer_1 = []
    layer_2 = []
    layer_3 = []
    # Each frame holds 1 code for layer 1, 2 for layer 2 and 4 for layer 3.
    for i in range((len(code_list)+1)//7):
        layer_1.append(code_list[7*i][None])
        layer_2.append(code_list[7*i+1][None]-4096)
        layer_3.append(code_list[7*i+2][None]-(2*4096))
        layer_3.append(code_list[7*i+3][None]-(3*4096))
        layer_2.append(code_list[7*i+4][None]-(4*4096))
        layer_3.append(code_list[7*i+5][None]-(5*4096))
        layer_3.append(code_list[7*i+6][None]-(6*4096))

    with torch.no_grad():

        codes = [torch.concat(layer_1),
                 torch.concat(layer_2),
                 torch.concat(layer_3)]

        for i in range(len(codes)):
            codes[i][codes[i] < 0] = 0  # clamp negative (invalid) codes to 0
            codes[i] = codes[i][None]   # add a batch dimension

        audio_hat = snac_model.decode(codes)
        return audio_hat.cpu()[0, 0]

# SNAC codec used to decode the generated audio codes into a 24 kHz waveform.
snac_model = SNAC.from_pretrained("hubertsiuzdak/snac_24khz")
snac_model = snac_model.to("cuda")

tokenizer = AutoTokenizer.from_pretrained('mesolitica/Malaysian-orpheus-3b-0.1-ft')
model = AutoModelForCausalLM.from_pretrained(
    'mesolitica/Malaysian-orpheus-3b-0.1-ft', torch_dtype = torch.bfloat16
).cuda()

speakers = [
    'Husein',
    'Shafiqah Idayu',
    'Anwar Ibrahim',
    'KP'
]

speaker = speakers[0]
text = 'Nama saya Husein, saya tak suka nasi ayam dan tak suka mandi, Xiàn zài wǒ yǒu bing chilling Wǒ hěn xǐ huān bing chilling.'
# The prompt is "{speaker}: {text}" wrapped in the model's special tokens.
prompt = f'<custom_token_3><|begin_of_text|>{speaker}: {text}<|eot_id|><custom_token_4><custom_token_5><custom_token_1>'
input_ids = tokenizer(prompt,add_special_tokens = False, return_tensors = 'pt').to('cuda')

with torch.no_grad():
    generated_ids = model.generate(
        **input_ids,
        max_new_tokens=1200,
        do_sample=True,
        temperature=0.9,
        top_p=0.8,
        repetition_penalty=1.1,
        num_return_sequences=1,
        eos_token_id=128258,
    )

# Drop the prompt tokens and decode only the newly generated audio tokens.
row = generated_ids[0, input_ids['input_ids'].shape[1]:]
y_ = redistribute_codes(row)
ipd.Audio(y_, rate = 24000)
```
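
The index arithmetic in `redistribute_codes` can be checked without a GPU. The pure-Python sketch below reproduces the same de-interleaving: each 7-token frame feeds one code to the first layer, two to the second and four to the third, and the token in slot `j` is shifted by `128266 + j * 4096`. The helper names `split_frames` and `encode_frame` are hypothetical, and `encode_frame` exists only to build test input.

```python
AUDIO_TOKEN_BASE = 128266  # offset subtracted in redistribute_codes
SLOT_OFFSET = 4096         # extra offset per slot within a frame
# Layer (0-based) that each of the 7 slots in a frame feeds.
SLOT_TO_LAYER = [0, 1, 2, 2, 1, 2, 2]


def split_frames(token_ids):
    """De-interleave generated token IDs into three lists of codes."""
    usable = (len(token_ids) // 7) * 7  # drop a trailing partial frame
    layers = [[], [], []]
    for index in range(usable):
        slot = index % 7
        code = token_ids[index] - AUDIO_TOKEN_BASE - slot * SLOT_OFFSET
        # Negative codes are clamped to 0, as in redistribute_codes.
        layers[SLOT_TO_LAYER[slot]].append(max(code, 0))
    return layers


def encode_frame(codes):
    """Inverse mapping for one frame: 7 codes in slot order -> token IDs."""
    return [
        AUDIO_TOKEN_BASE + slot * SLOT_OFFSET + code
        for slot, code in enumerate(codes)
    ]


# One full frame plus one stray token that does not complete a second frame.
tokens = encode_frame([10, 20, 30, 31, 21, 32, 33]) + [AUDIO_TOKEN_BASE + 5]
layers = split_frames(tokens)
print(layers)
```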
<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/5e73316106936008a9ee6523/NIOtl7B6Myw1eBd5Lf76l.wav"></audio>
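
`ipd.Audio` only plays audio inside a notebook. Outside one, the waveform can be written to a WAV file instead. The sketch below uses only the Python standard library and assumes the samples are mono floats roughly in [-1, 1] at 24 kHz; `save_wav` is a hypothetical helper and the sine wave only stands in for real model output.

```python
import math
import os
import struct
import tempfile
import wave


def save_wav(samples, path, sample_rate=24000):
    """Write mono float samples in [-1, 1] to `path` as 16-bit PCM."""
    frames = bytearray()
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))  # guard against overshoot
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))


# Demo: 0.1 s of a 440 Hz tone in place of generated speech.
demo = [0.5 * math.sin(2 * math.pi * 440 * n / 24000) for n in range(2400)]
demo_path = os.path.join(tempfile.gettempdir(), "orpheus_demo.wav")
save_wav(demo, demo_path)

with wave.open(demo_path, "rb") as wav_file:
    n_frames = wav_file.getnframes()
    rate = wav_file.getframerate()
print(n_frames, rate)
```

With the example above, the call would be `save_wav(y_.tolist(), "output.wav")`.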

## Source code

Source code at https://github.com/mesolitica/malaya-speech/tree/master/session/orpheus