---
license: mit

language:
  - vi

pipeline_tag: text-generation

library_name: transformers

base_model: "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"

datasets:
  - "thanhhoangnvbg/empathAI-dpo-vi"

tags:
  - transformers
  - unsloth
  - llama
  - llama-3
  - gguf
  - "llama.cpp"
  - ollama
  - "lm-studio"
  - koboldcpp
  - rag
  - rag-ready
  - trl
  - sft
  - dpo
  - empathy
  - customer-service
  - vietnamese
  - vi
  - chatbot
  - emotional-support
  - customer-support
  - ecommerce
  - toxic-customer-support
---

# 🧠 EmpathAI - Llama 3.1 8B

### Vietnamese Toxic E-commerce Customer Support Model

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

**EmpathAI** is a Vietnamese LLM fine-tuned specifically for **e-commerce customer support**, with a focus on difficult situations such as:

* toxic / angry customers
* late deliveries
* missing / wrong items
* defective or damaged products
* refunds / returns and exchanges
* payment / COD issues
* escalation and de-escalation
* handling requests according to real-world policies and workflows

EmpathAI aims to:

* reduce hallucination in customer support
* handle toxic customers more naturally
* work well with RAG/tool systems
* make Vietnamese e-commerce workflows more realistic

---

# 📌 Current Status

EmpathAI v2 is currently under development and evaluation.

Version 2 focuses on improving:

* multi-turn workflow
* order-code handling
* payment/COD edge cases
* policy/context grounding
* privacy & PII safety
* tool-aware customer support
* fewer hallucinated refund/order-status responses

The current v1 release remains unchanged and available.

---

# 🌟 Highlights

## 💬 Emotional Intelligence

EmpathAI is trained to:

* calm customers who are angry
* avoid unnecessary arguments
* keep a natural tone that does not sound robotic
* give a clear next step

---

## 🧩 RAG & Tool-Friendly

The model is designed to work well with:

* RAG pipelines
* order lookup systems
* internal customer-support tools

EmpathAI is trained to:

* ask for the order code when it is needed
* request more information when it is needed
* recognize when there is not enough data to draw a conclusion
* avoid inventing an order status
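
A minimal sketch of how looked-up order data could be passed to the model as grounding context. The `[ORDER]` block, the `SYSTEM_PROMPT` wording, and the `build_messages` helper are illustrative assumptions, not part of the released model or its training format:

```python
from typing import Optional

# Hypothetical system prompt (Vietnamese): only use data from the [ORDER] block
# to talk about order status; if there is none, ask for the order code.
SYSTEM_PROMPT = (
    "Bạn là EmpathAI, chuyên viên CSKH e-commerce tiếng Việt.\n"
    "Chỉ dùng dữ liệu trong phần [ORDER] để nói về trạng thái đơn hàng.\n"
    "Nếu không có dữ liệu, hãy hỏi mã đơn và không tự bịa trạng thái."
)


def build_messages(user_text: str, order: Optional[dict] = None) -> list[dict]:
    """Build chat messages, attaching looked-up order data when available."""
    if order is None:
        # "No order data available."
        context = "[ORDER]\nKhông có dữ liệu đơn hàng."
    else:
        fields = "\n".join(f"{key}: {value}" for key, value in order.items())
        context = f"[ORDER]\n{fields}"
    return [
        {"role": "system", "content": f"{SYSTEM_PROMPT}\n\n{context}"},
        {"role": "user", "content": user_text},
    ]
```

The returned list can be passed to `tokenizer.apply_chat_template`. Keeping the lookup result in the system message makes it explicit when the model has no order data to rely on.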

---

## 🛡️ Safety & Grounding Focus

EmpathAI v2 focuses on reducing:

* hallucinated order status
* unsupported refund promises
* unprompted voucher/compensation offers
* unnecessary PII requests
* leakage of other customers' information

---

# 📊 Dataset Overview

## Dataset v1

The original dataset focuses mainly on:

* toxic customer complaints
* delayed delivery
* refund/compensation
* damaged/missing products
* Vietnamese empathy/de-escalation

### Limitations of v1

* mostly single-turn
* few tool-aware workflows
* few payment/COD scenarios
* few privacy/security cases
* a tendency to overpromise in some older DPO pairs

---

## Dataset v2 Pipeline

The v2 dataset pipeline currently includes:

* cleaned old SFT pool
* re-judged DPO preference pairs
* synthetic toxic e-commerce conversations
* multi-turn workflow generation
* benchmark-oriented evaluation data
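
For illustration, a preference pair in the `prompt` / `chosen` / `rejected` layout commonly used for DPO training might look like the sketch below. The record content and the `is_valid_pair` check are hypothetical, and the actual dataset schema may differ:

```python
REQUIRED_KEYS = ("prompt", "chosen", "rejected")

# Hypothetical record; the real dataset schema may differ.
example_pair = {
    # "My order is already 5 days late."
    "prompt": "Đơn tôi giao trễ 5 ngày rồi đấy.",
    # Preferred: apologizes and asks for the order code before concluding anything.
    "chosen": "Dạ em rất xin lỗi anh/chị. Anh/chị cho em xin mã đơn để em kiểm tra ngay ạ.",
    # Dispreferred: invents a delivery date and promises a refund without any basis.
    "rejected": "Đơn của anh/chị sẽ tới ngày mai, shop hoàn tiền 100% luôn ạ.",
}


def is_valid_pair(record: dict) -> bool:
    """Require non-empty string fields and two distinct answers."""
    for key in REQUIRED_KEYS:
        value = record.get(key)
        if not isinstance(value, str) or not value.strip():
            return False
    return record["chosen"].strip() != record["rejected"].strip()
```

A structural check like this can run before any re-judging step so that malformed records never reach training.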

### Dataset Targets

* ~10k SFT samples
* ~6k DPO pairs
* a separate benchmark eval set

---

# 🏋️ Training Pipeline

EmpathAI v2 uses a two-stage training pipeline:

## Stage 1 — SFT

Supervised fine-tuning on:

* cleaned chosen responses from the old dataset
* new synthetic customer-support conversations
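
As a rough sketch of the first source, a cleaned preference pair can be turned into a chat-style SFT sample by keeping only its chosen reply. The `chosen_to_sft` helper and the `messages` layout are assumptions for illustration, not the project's actual preprocessing code:

```python
def chosen_to_sft(pair: dict, system_prompt: str = "") -> dict:
    """Turn one preference pair into a chat-style SFT sample using only the chosen reply."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": pair["prompt"]})
    messages.append({"role": "assistant", "content": pair["chosen"]})
    # The rejected reply is intentionally dropped; it is only useful for DPO.
    return {"messages": messages}
```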

---

## Stage 2 — DPO

Direct Preference Optimization on:

* clear-cut chosen/rejected pairs
* DPO samples that have been safety-filtered and re-judged

Ambiguous preference pairs are dropped to make DPO training more stable.
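
One possible way to drop ambiguous pairs is to keep only those where a judge scored the chosen reply clearly above the rejected one. The `chosen_score` / `rejected_score` fields and the `min_margin` threshold below are hypothetical, not the project's actual filter:

```python
def filter_clear_pairs(pairs: list[dict], min_margin: float = 2.0) -> list[dict]:
    """Keep pairs whose judge scores separate chosen from rejected by at least min_margin."""
    kept = []
    for pair in pairs:
        margin = pair["chosen_score"] - pair["rejected_score"]
        if margin >= min_margin:
            kept.append(pair)
    return kept
```

A larger `min_margin` trades dataset size for a cleaner preference signal.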

---

# 📈 Benchmark (In Development)

A dedicated benchmark for Vietnamese toxic e-commerce customer support is currently being built.

## Evaluation Categories

* hallucinated order-status rate
* hallucinated refund/compensation rate
* multi-turn state tracking
* policy/context grounding
* payment/COD realism
* privacy & PII safety
* toxic customer handling quality
* escalation/de-escalation quality
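
Since the benchmark is still being built, the following is only a sketch of how a rate metric such as the hallucinated order-status rate could be computed from labeled eval records. The `has_order_data` and `asserts_status` labels are hypothetical field names:

```python
def hallucinated_status_rate(records: list[dict]) -> float:
    """Fraction of no-order-data cases in which the reply still asserted an order status."""
    # Only cases without order data count: any status stated there is invented.
    eligible = [r for r in records if not r["has_order_data"]]
    if not eligible:
        return 0.0
    hallucinated = sum(1 for r in eligible if r["asserts_status"])
    return hallucinated / len(eligible)
```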

## Models Planned for Benchmarking

* EmpathAI v1
* EmpathAI v2
* Llama 3.1 8B Instruct
* Qwen Instruct
* Gemini Flash-class models

---

# 📊 Technical Specifications

| Component                   | Details                  |
| :-------------------------- | :----------------------- |
| **Base model**              | `Llama-3.1-8B-Instruct`  |
| **Fine-tuning method**      | QLoRA / DPO              |
| **Training infrastructure** | Google Cloud Vertex AI   |
| **GPUs used**               | NVIDIA L4 / RTX PRO 6000 |
| **Training pipeline**       | SFT + DPO                |
| **Optimization**            | Unsloth                  |

---

# 🌿 Branches

| Branch    | Description                                                     |
| --------- | --------------------------------------------------------------- |
| `main`    | latest inference-ready 4-bit build (default stable release)     |
| `v1-bf16` | full-quality BF16 weights of EmpathAI v1                        |
| `v1-4bit` | 4-bit version of EmpathAI v1                                    |
| `v1-gguf` | GGUF export of EmpathAI v1 for llama.cpp / LM Studio / Ollama   |
| `v2-bf16` | full-quality BF16 weights of EmpathAI v2                        |
| `v2-4bit` | 4-bit version of EmpathAI v2                                    |
| `v2-gguf` | GGUF export of EmpathAI v2 for llama.cpp / LM Studio / Ollama   |
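
A branch can be selected with the `revision` argument of `from_pretrained`. The `load_branch` helper below is a sketch that assumes the branch names in the table exist on the Hub as listed; it is not an official loader:

```python
KNOWN_BRANCHES = {
    "main", "v1-bf16", "v1-4bit", "v1-gguf", "v2-bf16", "v2-4bit", "v2-gguf",
}


def load_branch(branch: str = "main",
                model_id: str = "thanhhoangnvbg/empathAI-llama3.1-8b"):
    """Load tokenizer and model from one branch of the repo via `revision`."""
    if branch not in KNOWN_BRANCHES:
        raise ValueError(f"Unknown branch: {branch}")
    if branch.endswith("-gguf"):
        raise ValueError("GGUF branches target llama.cpp-style runtimes, not this loader")
    # Imported lazily so the checks above work without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, revision=branch)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, revision=branch, device_map="auto"
    )
    return tokenizer, model
```

For example, `load_branch("v1-bf16")` would fetch the BF16 weights of v1 instead of the default 4-bit build.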

---

# 🚀 Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "thanhhoangnvbg/empathAI-llama3.1-8b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

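# Prompts stay in Vietnamese because the model targets Vietnamese customers.
# System prompt, in short: never invent an order status, never promise
# refunds/vouchers without grounds, ask for missing data, stay calm.
# User message: "My order is already 5 days late."
messages = [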
    {
        "role": "system",
        "content": """Bạn là EmpathAI, chuyên viên CSKH e-commerce tiếng Việt.

Nguyên tắc:
- Không tự bịa trạng thái đơn hàng.
- Không tự hứa hoàn tiền/voucher khi chưa có căn cứ.
- Nếu thiếu dữ liệu, hãy yêu cầu thêm thông tin.
- Giữ giọng điệu bình tĩnh và chuyên nghiệp."""
    },
    {
        "role": "user",
        "content": "Đơn tôi giao trễ 5 ngày rồi đấy."
    }
]

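# return_dict=True also returns the attention mask that generate() expects.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt"
).to(model.device)  # follow device_map="auto" instead of hard-coding "cuda"

outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,  # temperature is ignored unless sampling is enabled
    temperature=0.5
)

# Decode only the newly generated tokens, not the echoed prompt.
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))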
```

---

# 🦙 GGUF / Local Inference

EmpathAI provides GGUF exports for local inference with:

* Ollama
* llama.cpp
* LM Studio
* KoboldCpp
* OpenWebUI

## Available Quantizations

| File          | Recommended Use                             |
| :------------ | :------------------------------------------ |
| `Q4_K_M.gguf` | Good balance between quality and speed      |
| `Q5_K_M.gguf` | Higher quality, uses more VRAM/RAM          |

---

# 🚀 Running with Ollama

Create a `Modelfile`:

```text
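FROM ./empathAI-llama3.1-8b.Q4_K_M.gguf
# Llama 3 chat format, assumed to match the chat template used in training.
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""
PARAMETER stop "<|eot_id|>"
PARAMETER temperature 0.5
PARAMETER num_ctx 4096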
```

Build model:

```bash
ollama create empathai -f Modelfile
```

Run:

```bash
ollama run empathai
```
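
Once the model is created, it can also be queried from Python through Ollama's local REST API. This is a minimal sketch that assumes Ollama is listening on its default port and that the model was created under the name `empathai`. The `build_payload` and `chat` helpers are illustrative, and the system message only takes effect if the Modelfile `TEMPLATE` renders `.System`:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # default local Ollama endpoint


def build_payload(user_text: str, system_prompt: str, model: str = "empathai") -> dict:
    """Build a non-streaming chat request body for Ollama."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ],
        "stream": False,
        "options": {"temperature": 0.5},
    }


def chat(user_text: str, system_prompt: str) -> str:
    """Send one chat turn to the local Ollama server and return the reply text."""
    body = json.dumps(build_payload(user_text, system_prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]
```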

---

# 🚀 Running with llama.cpp

```bash
./llama-cli \
--model empathAI-llama3.1-8b.Q4_K_M.gguf \
-p "Xin chào"
```

---

# 💻 Recommended Hardware

| Quant  | Recommended RAM / VRAM |
| :----- | :--------------------- |
| Q4_K_M | ~8GB+                  |
| Q5_K_M | ~10GB+                 |
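
These figures can be sanity-checked with a back-of-the-envelope estimate: quantized weight size plus the KV cache for the chosen context length. The bits-per-weight values below are rough assumed averages for these quant types, and the layer, KV-head, and head-dimension defaults are the Llama 3.1 8B values:

```python
N_PARAMS = 8.03e9  # approximate parameter count of Llama 3.1 8B
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q5_K_M": 5.7}  # rough averages (assumed)


def weight_gb(quant: str) -> float:
    """Approximate size of the quantized weights in GB."""
    return N_PARAMS * BITS_PER_WEIGHT[quant] / 8 / 1e9


def kv_cache_bytes(n_ctx: int, n_layers: int = 32, n_kv_heads: int = 8,
                   head_dim: int = 128, bytes_per_value: int = 2) -> int:
    """KV cache size in bytes for a 16-bit cache."""
    # The factor 2 covers the separate key and value tensors in every layer.
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_value
```

Under these assumptions, Q4_K_M needs roughly 4.9 GB for weights plus 0.5 GiB of KV cache at a 4096-token context. The remaining headroom in the ~8GB+ recommendation leaves room for runtime buffers and the operating system.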

---

# 🎯 Project Goals

EmpathAI does not aim to be:

* a general reasoning model
* a coding assistant
* a general-purpose chatbot

The main goals are:

* realistic Vietnamese customer-support workflow
* toxic customer handling
* de-escalation
* policy-aware support
* safer e-commerce interactions

---

# 🔥 Notes

The project is actively maintained and continuously improved through:

* dataset cleaning
* synthetic data generation
* DPO refinement
* benchmark evaluation
* safety-focused iteration

Upcoming releases will focus on:

* reducing hallucination
* improving real-world workflows
* strengthening multi-turn capability
* improving stability with RAG/tool systems
* making toxic e-commerce support more realistic