---
base_model: manotham/Thai-dialogue-translate_emotion
tags:
- text-generation
- transformers
- unsloth
- qwen3
- translation
- emotion-control
- dpo
license: apache-2.0
language:
- en
- th
---

# 🎭 Thai Dialogue Emotion-Aware Translator

A Qwen3-based fine-tuned model specialized for translating English dialogue into natural Thai while preserving emotional tone and cinematic style.

This model was trained using Unsloth and further optimized with DPO (Direct Preference Optimization) to improve translation naturalness and emotion alignment.

---

# 🚀 Highlights

* **Developer:** manotham
* **Base Model:** Qwen3
* **Training Framework:** Unsloth + HuggingFace TRL
* **Optimization:** DPO preference alignment after LoRA fine-tuning
* **Task:** English → Thai dialogue translation
* **Special Feature:** Emotion-controlled translation

Supported emotions include:

* sadness
* anger
* contempt
* frustration
* joy
* neutral
* gratitude
* love

Further labels are supported, following the label set of `tabularisai/multilingual-emotion-classification`.
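
Because the emotion is passed to the model as free text inside the prompt, it can help to validate a label before building a request. The following is a minimal sketch, assuming only the eight labels listed above; `KNOWN_EMOTIONS` and `normalize_emotion` are hypothetical helper names (not part of the model or any library), and falling back to `neutral` is an illustrative choice. Extend the set with the classifier's remaining labels if you use them.

```python
# Hypothetical helper: this set covers only the emotions listed in this card.
# The referenced classifier defines further labels that are not included here.
KNOWN_EMOTIONS = frozenset({
    "sadness", "anger", "contempt", "frustration",
    "joy", "neutral", "gratitude", "love",
})


def normalize_emotion(label: str, fallback: str = "neutral") -> str:
    """Trim and lowercase a label; return `fallback` if it is not listed."""
    cleaned = label.strip().lower()
    return cleaned if cleaned in KNOWN_EMOTIONS else fallback


print(normalize_emotion("  Joy "))  # prints "joy"
print(normalize_emotion("bored"))   # prints "neutral" (not in the set)
```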

The model is especially suitable for:

* Games
* Visual novels
* Cinematic dialogue
* Storytelling
* Fantasy / Anime-style conversations

---

# 🛠 Usage

```python
from unsloth import FastLanguageModel
import torch

# Load the checkpoint in 4-bit to reduce VRAM usage.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="manotham/Thai-dialogue-translate_emotion_mdpov2_ckp269",
    max_seq_length=1024,
    load_in_4bit=True,
)

# Switch Unsloth to its inference mode before generating.
FastLanguageModel.for_inference(model)


def translate_with_emotion(text, emotion):
    # The emotion tag goes on its own line between the instruction and the text.
    messages = [
        {
            "role": "user",
            "content": f"Translate this English sentence into natural Thai.\n[Emotion: {emotion}]\n{text}",
        }
    ]

    # Apply the chat template and move the token IDs to the GPU.
    inputs = tokenizer.apply_chat_template(
        messages,
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt",
    ).to("cuda")

    with torch.inference_mode():
        outputs = model.generate(
            inputs,
            max_new_tokens=128,
        )

    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        outputs[0][inputs.shape[-1]:],
        skip_special_tokens=True,
    )


print(
    translate_with_emotion(
        "I found you in this broken world.",
        "joy",
    )
)
```
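
When translating a whole script, it can be convenient to separate prompt construction from generation so the prompts can be checked without a GPU. The sketch below reuses the prompt wording from the usage example; `PROMPT_TEMPLATE`, `build_messages`, and `build_batch` are hypothetical names introduced here for illustration, and the second script line is made-up sample input.

```python
from typing import Dict, List, Tuple

# Same wording as the usage example; the constant name is an assumption.
PROMPT_TEMPLATE = (
    "Translate this English sentence into natural Thai.\n"
    "[Emotion: {emotion}]\n"
    "{text}"
)


def build_messages(text: str, emotion: str) -> List[Dict[str, str]]:
    """Return a single-turn chat in the format used by the usage example."""
    content = PROMPT_TEMPLATE.format(emotion=emotion, text=text)
    return [{"role": "user", "content": content}]


def build_batch(lines: List[Tuple[str, str]]) -> List[List[Dict[str, str]]]:
    """Build one chat per (text, emotion) pair, preserving input order."""
    return [build_messages(text, emotion) for text, emotion in lines]


script = [
    ("I found you in this broken world.", "joy"),
    ("Leave me alone.", "anger"),  # illustrative input, not from the card
]

for chat in build_batch(script):
    print(chat[0]["content"])
    print("---")
```

Each chat in the returned list can then be passed to `tokenizer.apply_chat_template` in the same way as `messages` in the usage example.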

---

# 📌 Example Outputs

## Sadness

**English**

> I don’t want to save this world anymore… I just want to stay by your side until the end.

**Thai**

> ฉันไม่อยากช่วยโลกใบนี้อีกแล้ว ฉันแค่อยากอยู่ข้างๆ เธอจนถึงท้ายที่สุดเท่านั้น

---

## Anger

**English**

> If justice only chooses the strong, then I’ll become the monster this world fears most.

**Thai**

> ถ้าความยุติธรรมเลือกคนที่แข็งแกร่งเท่านั้น ฉันก็จะกลายเป็นสัตว์ประหลาดที่โลกนี้กลัวที่สุด

---

## Contempt

**English**

> This is your so-called hero? I’ve seen weaker NPCs in a beginner dungeon.

**Thai**

> นี่คือฮีโร่ที่พวกแกอ้างถึงงั้นเหรอ? ข้าเคยเห็น NPC ที่อ่อนแอกว่านี้ในดันเจี้ยนระดับมือใหม่เลยนะ

---

# ⚠ Limitations

* The model is optimized for dialogue-style translation, not formal documents.
* Some outputs may become overly dramatic, depending on the emotion tag.
* Performance may vary on highly technical or domain-specific content.
* Outputs may occasionally drift toward a fantasy/anime dialogue style, a sign of overfitting to that register.

---

# 📚 Training Notes

* Fine-tuned using LoRA with Unsloth
* Preference alignment using DPO
* Quantization-friendly (loads in 4-bit) for low-VRAM GPUs
* Optimized and tested on consumer GPUs
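
For readers unfamiliar with DPO, the per-pair objective can be sketched in a few lines. This illustrates the standard DPO loss only, not the training code used for this model; `dpo_loss` is a hypothetical function name, the log-probabilities are made-up numbers, and `beta=0.1` is an assumed value that this card does not state.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; not stated in this card
) -> float:
    """Per-pair DPO loss: -log(sigmoid(beta * margin)).

    The margin is how much more the policy prefers the chosen translation
    over the rejected one, relative to the frozen reference model.
    """
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    z = beta * margin
    # -log(sigmoid(z)) == log(1 + exp(-z)); branch to avoid overflow in exp.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# Policy identical to the reference: zero margin, so the loss equals log(2).
print(dpo_loss(-10.0, -12.0, -10.0, -12.0))

# Policy favours the chosen translation more than the reference does:
# the loss drops below log(2).
print(dpo_loss(-8.0, -14.0, -10.0, -12.0))
```

The loss falls as the policy assigns relatively more probability to the preferred translation, which is how preference pairs can steer the model toward more natural, emotion-aligned output.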