---
license: mit

language:
  - vi

pipeline_tag: text-generation

library_name: transformers

base_model: "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"

datasets:
  - "thanhhoangnvbg/empathAI-dpo-vi"

tags:
  - transformers
  - unsloth
  - llama
  - llama-3
  - gguf
  - "llama.cpp"
  - ollama
  - "lm-studio"
  - koboldcpp
  - rag
  - rag-ready
  - trl
  - sft
  - dpo
  - empathy
  - customer-service
  - vietnamese
  - vi
  - chatbot
  - emotional-support
  - customer-support
  - ecommerce
  - toxic-customer-support
---

# 🧠 EmpathAI - Llama 3.1 8B

### Vietnamese Toxic E-commerce Customer Support Model

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

**EmpathAI** is a Vietnamese-language LLM fine-tuned specifically for **e-commerce customer support**, with a focus on difficult situations such as:

* toxic / angry customers
* late deliveries
* missing / wrong items
* defective or damaged products
* refunds / returns and exchanges
* payment / COD issues
* escalation and de-escalation
* handling cases according to real-world policies and workflows

EmpathAI aims to:

* reduce hallucination in customer support
* handle toxic customers more naturally
* work well with RAG/tool systems
* make Vietnamese e-commerce support workflows more realistic

---

# 📌 Current status

EmpathAI v2 is currently under development and evaluation.

Version 2 focuses on improving:

* multi-turn workflow
* order-code handling
* payment/COD edge cases
* policy/context grounding
* privacy & PII safety
* tool-aware customer support
* fewer hallucinated refund/order-status responses

The current v1 release is unchanged and remains available.

---

# 🌟 Highlights

## 💬 Emotional Intelligence

EmpathAI is trained to:

* calm angry customers down
* avoid unnecessary arguments
* keep a natural tone that does not sound robotic
* give clear next steps

---

## 🧩 RAG & Tool-Friendly

The model is designed to work well with:

* RAG pipelines
* order lookup systems
* internal customer-support tools

EmpathAI is trained to:

* ask for the order code when it is needed
* request more information when it is needed
* recognize when there is not enough data to reach a conclusion
* avoid making up an order status
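
As an illustration of the behaviour described above, the sketch below shows one way an application might pass order-lookup results to the model. The helper name `build_messages`, the `order` dictionary, and its keys are hypothetical and not part of EmpathAI. When no order record is available, the system prompt tells the model to ask for the order code rather than guess.

```python
from typing import Optional

# Base rules in Vietnamese: you are EmpathAI, never invent an order status,
# ask for more information when data is missing.
BASE_RULES = (
    "Bạn là EmpathAI, chuyên viên CSKH e-commerce tiếng Việt.\n"
    "- Không tự bịa trạng thái đơn hàng.\n"
    "- Nếu thiếu dữ liệu, hãy yêu cầu thêm thông tin."
)


def build_messages(user_text: str, order: Optional[dict] = None) -> list:
    """Build chat messages, grounding the model in lookup results when present."""
    if order is None:
        # No lookup result: "No order data yet. Ask the customer for the order code."
        context = "Chưa có dữ liệu đơn hàng. Hãy hỏi khách mã đơn."
    else:
        # Hypothetical record fields returned by an order-lookup tool.
        lines = [f"{key}: {value}" for key, value in sorted(order.items())]
        context = "Dữ liệu đơn hàng (từ hệ thống):\n" + "\n".join(lines)
    return [
        {"role": "system", "content": BASE_RULES + "\n\n" + context},
        {"role": "user", "content": user_text},
    ]


messages = build_messages(
    "Đơn tôi giao trễ 5 ngày rồi đấy.",
    {"order_code": "DH12345", "status": "in_transit"},
)
```

The resulting `messages` list can be passed to `tokenizer.apply_chat_template` in the same way as a hand-written conversation.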

---

## 🛡️ Safety & Grounding Focus

EmpathAI v2 focuses on reducing:

* hallucinated order status
* unsupported refund promises
* unprompted offers of vouchers/compensation
* unnecessary requests for PII
* leaks of other customers' information
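
Applications may still want an output-side check in addition to the model's own training. The sketch below is a deliberately simple keyword filter, not something shipped with EmpathAI: `PROMISE_TERMS`, `unsupported_promises`, and the `allowed` set are hypothetical names. It flags any mention of a category, so a reply that refuses a refund would also be flagged and should go to a more careful check or a human reviewer.

```python
# Hypothetical keyword lists per promise category (Vietnamese and English terms).
PROMISE_TERMS = {
    "refund": ("hoàn tiền", "refund"),
    "voucher": ("voucher", "mã giảm giá"),
    "compensation": ("bồi thường", "đền bù"),
}


def unsupported_promises(reply: str, allowed: set) -> list:
    """Return categories mentioned in `reply` that the retrieved policy does not allow."""
    text = reply.lower()
    flagged = []
    for category, terms in PROMISE_TERMS.items():
        if category not in allowed and any(term in text for term in terms):
            flagged.append(category)
    return flagged


# Example: the policy context supports a refund but says nothing about vouchers.
flags = unsupported_promises("Shop sẽ hoàn tiền và tặng voucher cho anh.", {"refund"})
```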

---

# 📊 Dataset overview

## Dataset v1

The original dataset focuses mainly on:

* toxic customer complaints
* delayed delivery
* refund/compensation
* damaged/missing products
* Vietnamese empathy/de-escalation

### Limitations of v1

* mostly single-turn
* few tool-aware workflows
* few payment/COD scenarios
* few privacy/security cases
* some older DPO pairs still tend to overpromise

---

## Dataset v2 pipeline

The v2 dataset pipeline currently includes:

* cleaned old SFT pool
* re-judged DPO preference pairs
* synthetic toxic e-commerce conversations
* multi-turn workflow generation
* benchmark-oriented evaluation data

### Dataset targets

* ~10k SFT samples
* ~6k DPO pairs
* a separate benchmark evaluation set

---

# 🏋️ Training pipeline

EmpathAI v2 uses a two-stage training pipeline:

## Stage 1 — SFT

Supervised fine-tuning on:

* cleaned chosen responses from the old dataset
* new synthetic customer-support conversations
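
A minimal sketch of how a cleaned chosen response might be turned into an SFT sample. It assumes the preference data uses `prompt` and `chosen` fields and targets the conversational `messages` layout that TRL's `SFTTrainer` accepts. The helper name `pair_to_sft` is hypothetical, and the project's actual preprocessing may differ.

```python
def pair_to_sft(pair: dict) -> dict:
    """Keep only the prompt and the chosen answer as a two-turn conversation."""
    return {
        "messages": [
            {"role": "user", "content": pair["prompt"]},
            {"role": "assistant", "content": pair["chosen"]},
        ]
    }


# Illustrative record: the customer asks where the order is, the chosen reply
# asks for the order code, the rejected reply invents a delivery promise.
sample = pair_to_sft(
    {
        "prompt": "Đơn của tôi đâu rồi?",
        "chosen": "Dạ, anh/chị cho em xin mã đơn để em kiểm tra ạ.",
        "rejected": "Đơn của anh/chị sẽ tới trong hôm nay ạ.",
    }
)
```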

---

## Stage 2 — DPO

Direct Preference Optimization on:

* clear-cut chosen/rejected pairs
* DPO samples that have been safety-filtered and re-judged

Ambiguous preference pairs are dropped to make DPO training more stable.
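
One way to drop ambiguous pairs is to keep only those where a judge scored the chosen answer clearly higher than the rejected one. The sketch below assumes hypothetical `chosen_score` and `rejected_score` fields and an arbitrary margin of 2 points. The README does not state the actual judging scheme or threshold.

```python
def filter_clear_pairs(pairs: list, min_margin: float = 2.0) -> list:
    """Keep pairs whose judge-score gap is at least `min_margin`."""
    kept = []
    for pair in pairs:
        margin = pair["chosen_score"] - pair["rejected_score"]
        if margin >= min_margin:
            # Emit the prompt/chosen/rejected layout used for preference training.
            kept.append({key: pair[key] for key in ("prompt", "chosen", "rejected")})
    return kept


pairs = [
    {"prompt": "p1", "chosen": "a", "rejected": "b", "chosen_score": 9, "rejected_score": 3},
    {"prompt": "p2", "chosen": "c", "rejected": "d", "chosen_score": 6, "rejected_score": 5},
]
clear = filter_clear_pairs(pairs)  # only the first pair has a margin >= 2
```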

---

# 📈 Benchmark (in development)

A dedicated benchmark for Vietnamese toxic e-commerce customer support is currently being built.

## Evaluation categories

* hallucinated order-status rate
* hallucinated refund/compensation rate
* multi-turn state tracking
* policy/context grounding
* payment/COD realism
* privacy & PII safety
* toxic customer handling quality
* escalation/de-escalation quality
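
The rate-style categories above can be computed from labeled evaluation records. The sketch below assumes a hypothetical record layout with `category` and `failed` fields, since the benchmark's actual schema is not described in this README.

```python
from collections import defaultdict


def failure_rates(records: list) -> dict:
    """Return the share of failed records per evaluation category."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for record in records:
        totals[record["category"]] += 1
        failures[record["category"]] += int(record["failed"])
    return {category: failures[category] / totals[category] for category in totals}


records = [
    {"category": "order_status", "failed": True},   # reply invented a status
    {"category": "order_status", "failed": False},
    {"category": "refund", "failed": False},
]
rates = failure_rates(records)
```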

## Models planned for benchmarking

* EmpathAI v1
* EmpathAI v2
* Llama 3.1 8B Instruct
* Qwen Instruct
* Gemini Flash-class models

---

# 📊 Technical specifications

| Component                   | Details                  |
| :-------------------------- | :----------------------- |
| **Base model**              | `Llama-3.1-8B-Instruct`  |
| **Fine-tuning method**      | QLoRA / DPO              |
| **Training infrastructure** | Google Cloud Vertex AI   |
| **GPUs used**               | NVIDIA L4 / RTX PRO 6000 |
| **Training pipeline**       | SFT + DPO                |
| **Optimization**            | Unsloth                  |

---

# 🌿 Branches

| Branch    | Description                                                    |
| --------- | -------------------------------------------------------------- |
| `main`    | latest inference-ready 4-bit build (default stable release)    |
| `v1-bf16` | full-quality BF16 weights for EmpathAI v1                      |
| `v1-4bit` | 4-bit version of EmpathAI v1                                   |
| `v1-gguf` | GGUF export of EmpathAI v1 for llama.cpp / LM Studio / Ollama  |
| `v2-bf16` | full-quality BF16 weights for EmpathAI v2                      |
| `v2-4bit` | 4-bit version of EmpathAI v2                                   |
| `v2-gguf` | GGUF export of EmpathAI v2 for llama.cpp / LM Studio / Ollama  |

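
A branch from the table can be selected with the `revision` argument of `from_pretrained`. The helpers `branch_kwargs` and `load` below are hypothetical wrappers written for this sketch. Loading downloads the weights, and `device_map="auto"` requires the `accelerate` package.

```python
MODEL_ID = "thanhhoangnvbg/empathAI-llama3.1-8b"


def branch_kwargs(branch: str = "main") -> dict:
    """Keyword arguments for from_pretrained that select a repository branch."""
    return {"revision": branch, "device_map": "auto"}


def load(branch: str = "main"):
    """Load the tokenizer and model from the given branch (downloads weights)."""
    # Imported lazily so branch_kwargs can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, revision=branch)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, **branch_kwargs(branch))
    return tokenizer, model
```

For example, `tokenizer, model = load("v1-bf16")` would fetch the BF16 weights of v1 instead of the default 4-bit build on `main`.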
---

# 🚀 Usage

```python
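import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "thanhhoangnvbg/empathAI-llama3.1-8b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# The prompts stay in Vietnamese because the model is tuned for Vietnamese support chats.
messages = [
    {
        "role": "system",
        # Rules: never invent an order status, never promise refunds/vouchers without
        # grounds, ask for missing information, stay calm and professional.
        "content": """Bạn là EmpathAI, chuyên viên CSKH e-commerce tiếng Việt.

Nguyên tắc:
- Không tự bịa trạng thái đơn hàng.
- Không tự hứa hoàn tiền/voucher khi chưa có căn cứ.
- Nếu thiếu dữ liệu, hãy yêu cầu thêm thông tin.
- Giữ giọng điệu bình tĩnh và chuyên nghiệp."""
    },
    {
        "role": "user",
        # "My order is already 5 days late."
        "content": "Đơn tôi giao trễ 5 ngày rồi đấy."
    }
]

# return_dict=True also returns the attention mask alongside the input IDs.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt"
).to(model.device)  # follow device_map instead of hard-coding "cuda"

outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,  # temperature only takes effect when sampling is enabled
    temperature=0.5
)

# Decode only the newly generated tokens, not the echoed prompt.
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))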
```

---

# 🦙 GGUF / Local Inference

EmpathAI provides GGUF exports for local inference with:

* Ollama
* llama.cpp
* LM Studio
* KoboldCpp
* OpenWebUI

## Available Quantizations

| File          | Recommended Use                             |
| :------------ | :------------------------------------------ |
| `Q4_K_M.gguf` | Good balance between quality and speed      |
| `Q5_K_M.gguf` | Higher quality, uses more VRAM/RAM          |

---

# 🚀 Running with Ollama

Create a `Modelfile`:

```text
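FROM ./empathAI-llama3.1-8b.Q4_K_M.gguf

TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""
PARAMETER stop "<|eot_id|>"
PARAMETER temperature 0.5
PARAMETER num_ctx 4096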
```

Build model:

```bash
ollama create empathai -f Modelfile
```

Run:

```bash
ollama run empathai
```
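
Once the model is created, it can also be queried from code through Ollama's local HTTP API. The sketch below assumes the default address `http://localhost:11434` and the model name `empathai` from the build step. The helper names `build_chat_request` and `chat` are hypothetical. A system message only reaches the model if the `TEMPLATE` in the Modelfile renders `.System`.

```python
import json
import urllib.request
from typing import Optional

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint


def build_chat_request(user_text: str, system_text: Optional[str] = None,
                       model: str = "empathai") -> dict:
    """Build the JSON payload for a single non-streaming chat request."""
    messages = []
    if system_text:
        messages.append({"role": "system", "content": system_text})
    messages.append({"role": "user", "content": user_text})
    return {
        "model": model,
        "messages": messages,
        "stream": False,  # return one JSON object instead of a token stream
        "options": {"temperature": 0.5},
    }


def chat(user_text: str, system_text: Optional[str] = None) -> str:
    """Send the request to a running Ollama server and return the reply text."""
    body = json.dumps(build_chat_request(user_text, system_text)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]
```

With the Ollama server running, `chat("Đơn tôi giao trễ 5 ngày rồi đấy.")` returns the assistant's reply as a string.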

---

# 🚀 Running with llama.cpp

```bash
./llama-cli \
--model empathAI-llama3.1-8b.Q4_K_M.gguf \
-p "Xin chào"
```

---

# 💻 Recommended Hardware

| Quant  | Recommended RAM / VRAM |
| :----- | :--------------------- |
| Q4_K_M | ~8GB+                  |
| Q5_K_M | ~10GB+                 |
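
For a rough sense of where these figures come from, the sketch below estimates weight memory plus an FP16 KV cache. The bits-per-weight values (about 4.9 for Q4_K_M and 5.7 for Q5_K_M) and the parameter count of 8.03 billion are approximations assumed for this example. The layer, KV-head, and head-dimension defaults follow the Llama 3.1 8B architecture. Runtime buffers and operating-system overhead are not included, which is why the recommendations in the table leave headroom.

```python
def estimate_gib(n_params: float, bits_per_weight: float, n_ctx: int,
                 n_layers: int = 32, n_kv_heads: int = 8, head_dim: int = 128,
                 kv_bytes: int = 2) -> float:
    """Approximate memory in GiB for quantized weights plus the KV cache."""
    weight_bytes = n_params * bits_per_weight / 8
    # Keys and values (factor 2) for every layer, KV head, and context position.
    kv_cache_bytes = 2 * n_layers * n_kv_heads * head_dim * kv_bytes * n_ctx
    return (weight_bytes + kv_cache_bytes) / 2**30


q4_gib = estimate_gib(8.03e9, 4.9, n_ctx=4096)  # roughly 5 GiB
q5_gib = estimate_gib(8.03e9, 5.7, n_ctx=4096)  # a little under 6 GiB
```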

---

# 🎯 Project goals

EmpathAI is not aimed at:

* general reasoning
* coding assistant
* general-purpose chatbot

The main goals are:

* realistic Vietnamese customer-support workflow
* toxic customer handling
* de-escalation
* policy-aware support
* safer e-commerce interactions

---

# 🔥 Notes

The project is actively maintained and continuously improved through:

* dataset cleaning
* synthetic data generation
* DPO refinement
* benchmark evaluation
* safety-focused iteration

Upcoming releases will focus on:

* reducing hallucination
* improving real-world workflows
* stronger multi-turn capability
* more stable behavior with RAG/tool systems
* more realistic toxic e-commerce support