Initialize project; model provided by the ModelHub XC community

Model: thanhhoangnvbg/empathAI-llama3.1-8b Source: Original Platform
2026-05-31 20:45:21 +08:00
commit 332f1c539e
9 changed files with 545 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,394 @@
+---
+license: mit
+
+language:
+  - vi
+
+pipeline_tag: text-generation
+
+library_name: transformers
+
+base_model: "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"
+
+datasets:
+  - "thanhhoangnvbg/empathAI-dpo-vi"
+
+tags:
+  - transformers
+  - unsloth
+  - llama
+  - llama-3
+  - gguf
+  - "llama.cpp"
+  - ollama
+  - "lm-studio"
+  - koboldcpp
+  - rag
+  - rag-ready
+  - trl
+  - sft
+  - dpo
+  - empathy
+  - customer-service
+  - vietnamese
+  - vi
+  - chatbot
+  - emotional-support
+  - customer-support
+  - ecommerce
+  - toxic-customer-support
+---
+
+# 🧠 EmpathAI - Llama 3.1 8B
+
+### Vietnamese Toxic E-commerce Customer Support Model
+
+[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+
+**EmpathAI** is a Vietnamese-language LLM fine-tuned specifically for **e-commerce customer support**, with a focus on difficult situations such as:
+
+* toxic / angry customers
+* late deliveries
+* missing / wrong items
+* defective or damaged products
+* refunds / returns and exchanges
+* payment / COD issues
+* escalation and de-escalation
+* handling cases according to real-world policies and workflows
+
+EmpathAI aims to:
+
+* reduce hallucination in customer support
+* handle toxic customers more naturally
+* work well with RAG/tool systems
+* make Vietnamese e-commerce workflows more realistic
+
+---
+
+# 📌 Current status
+
+EmpathAI v2 is currently under development and evaluation.
+
+Version v2 focuses on improving:
+
+* multi-turn workflow
+* order-code handling
+* payment/COD edge cases
+* policy/context grounding
+* privacy & PII safety
+* tool-aware customer support
+* fewer hallucinated refund/order-status responses
+
+The current v1 release is unchanged and remains available.
+
+---
+
+# 🌟 Highlights
+
+## 💬 Emotional Intelligence
+
+EmpathAI is trained to:
+
+* calm down angry customers
+* avoid unnecessary arguments
+* keep a natural tone that does not sound overly robotic
+* give a clear next step for resolving the issue
+
+---
+
+## 🧩 RAG & Tool-Friendly
+
+The model is designed to work well with:
+
+* RAG pipelines
+* order lookup systems
+* internal customer-support tools
+
+EmpathAI is trained to:
+
+* ask for the order code when it is needed
+* request additional information when it is needed
+* recognize when there is not enough data to draw a conclusion
+* avoid making up order statuses
+
+---
+
+## 🛡️ Safety & Grounding Focus
+
+EmpathAI v2 focuses on reducing:
+
+* hallucinated order status
+* unsupported refund promises
+* unsolicited voucher/compensation offers
+* unnecessary requests for PII
+* leaks of other customers' information
+
+---
+
+# 📊 Dataset overview
+
+## Dataset v1
+
+The original dataset focuses mainly on:
+
+* toxic customer complaints
+* delayed delivery
+* refund/compensation
+* damaged/missing products
+* Vietnamese empathy/de-escalation
+
+### Limitations of v1
+
+* mostly single-turn
+* few tool-aware workflows
+* few payment/COD scenarios
+* few privacy/security cases
+* a tendency to overpromise in some older DPO pairs
+
+---
+
+## Dataset v2 pipeline
+
+The v2 dataset pipeline currently includes:
+
+* cleaned old SFT pool
+* re-judged DPO preference pairs
+* synthetic toxic e-commerce conversations
+* multi-turn workflow generation
+* benchmark-oriented evaluation data
+
+### Dataset targets
+
+* ~10k SFT samples
+* ~6k DPO pairs
+* a separate benchmark eval set
+
+---
+
+# 🏋️ Training pipeline
+
+EmpathAI v2 uses a two-stage training pipeline:
+
+## Stage 1 — SFT
+
+Supervised fine-tuning on:
+
+* cleaned chosen responses from the old dataset
+* new synthetic customer-support conversations
+
+---
+
+## Stage 2 — DPO
+
+Direct Preference Optimization on:
+
+* clear-cut chosen/rejected pairs
+* DPO samples that have been safety-filtered and re-judged
+
+Ambiguous preference pairs are dropped to make DPO training more stable.
+
+---
+
+# 📈 Benchmark (in development)
+
+A dedicated benchmark for Vietnamese toxic e-commerce customer support is currently being built.
+
+## Evaluation categories
+
+* hallucinated order-status rate
+* hallucinated refund/compensation rate
+* multi-turn state tracking
+* policy/context grounding
+* payment/COD realism
+* privacy & PII safety
+* toxic customer handling quality
+* escalation/de-escalation quality
+
+## Models planned for benchmarking
+
+* EmpathAI v1
+* EmpathAI v2
+* Llama 3.1 8B Instruct
+* Qwen Instruct
+* Gemini Flash-class models
+
+---
+
+# 📊 Technical specifications
+
+| Component                   | Details                  |
+| :-------------------------- | :----------------------- |
+| **Base model**              | `Llama-3.1-8B-Instruct`  |
+| **Fine-tuning method**      | QLoRA / DPO              |
+| **Training infrastructure** | Google Cloud Vertex AI   |
+| **GPUs used**               | NVIDIA L4 / RTX PRO 6000 |
+| **Training pipeline**       | SFT + DPO                |
+| **Optimization**            | Unsloth                  |
+
+---
+
+# 🌿 Branches
+
+| Branch    | Description                                                    |
+| --------- | -------------------------------------------------------------- |
+| `main`    | latest inference-ready 4-bit build (default stable release)    |
+| `v1-bf16` | full-quality BF16 weights of EmpathAI v1                       |
+| `v1-4bit` | 4-bit version of EmpathAI v1                                   |
+| `v1-gguf` | GGUF export of EmpathAI v1 for llama.cpp / LM Studio / Ollama  |
+| `v2-bf16` | full-quality BF16 weights of EmpathAI v2                       |
+| `v2-4bit` | 4-bit version of EmpathAI v2                                   |
+| `v2-gguf` | GGUF export of EmpathAI v2 for llama.cpp / LM Studio / Ollama  |
+
+---
+
+# 🚀 Usage
+
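+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_id = "thanhhoangnvbg/empathAI-llama3.1-8b"
+
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype=torch.bfloat16,
+    device_map="auto"
+)
+
+# The prompts stay in Vietnamese because the model is tuned for Vietnamese.
+# System prompt, in English: "You are EmpathAI, a Vietnamese e-commerce
+# customer-support agent. Rules: do not make up order statuses; do not promise
+# refunds/vouchers without grounds; if data is missing, ask for more
+# information; keep a calm, professional tone."
+messages = [
+    {
+        "role": "system",
+        "content": """Bạn là EmpathAI, chuyên viên CSKH e-commerce tiếng Việt.
+
+Nguyên tắc:
+- Không tự bịa trạng thái đơn hàng.
+- Không tự hứa hoàn tiền/voucher khi chưa có căn cứ.
+- Nếu thiếu dữ liệu, hãy yêu cầu thêm thông tin.
+- Giữ giọng điệu bình tĩnh và chuyên nghiệp."""
+    },
+    {
+        "role": "user",
+        # In English: "My order is already 5 days late."
+        "content": "Đơn tôi giao trễ 5 ngày rồi đấy."
+    }
+]
+
+# return_dict=True yields input_ids plus attention_mask; move them to the
+# device picked by device_map instead of hard-coding "cuda".
+inputs = tokenizer.apply_chat_template(
+    messages,
+    add_generation_prompt=True,
+    return_dict=True,
+    return_tensors="pt"
+).to(model.device)
+
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=128,
+    do_sample=True,  # temperature only takes effect when sampling is enabled
+    temperature=0.5
+)
+
+# Decode only the newly generated tokens, not the echoed prompt.
+new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
+print(tokenizer.decode(new_tokens, skip_special_tokens=True))
+```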
+
+---
+
+# 🦙 GGUF / Local Inference
+
+EmpathAI provides GGUF exports for running local inference with:
+
+* Ollama
+* llama.cpp
+* LM Studio
+* KoboldCpp
+* OpenWebUI
+
+## Available Quantizations
+
+| File          | Recommended Use                             |
+| :------------ | :------------------------------------------ |
+| `Q4_K_M.gguf` | Good balance between quality and speed      |
+| `Q5_K_M.gguf` | Higher quality, uses more VRAM/RAM          |
+
+---
+
+# 🚀 Running with Ollama
+
+Create a `Modelfile`:
+
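+```text
+FROM ./empathAI-llama3.1-8b.Q4_K_M.gguf
+
+# This template forwards the prompt text verbatim: no Llama 3.1 chat headers
+# and no system prompt are added. Adjust it if your client does not format
+# the conversation itself.
+TEMPLATE """{{ .Prompt }}"""
+
+PARAMETER temperature 0.5
+PARAMETER num_ctx 4096
+```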
+
+Build the model:
+
+```bash
+ollama create empathai -f Modelfile
+```
+
+Run:
+
+```bash
+ollama run empathai
+```
+
+---
+
+# 🚀 Running with llama.cpp
+
+```bash
+./llama-cli \
+--model empathAI-llama3.1-8b.Q4_K_M.gguf \
+-p "Xin chào"
+```
+
+---
+
+# 💻 Recommended Hardware
+
+| Quant  | Recommended RAM / VRAM |
+| :----- | :--------------------- |
+| Q4_K_M | ~8GB+                  |
+| Q5_K_M | ~10GB+                 |
+
+---
+
+# 🎯 Project goals
+
+EmpathAI does not aim to be:
+
+* a general reasoning model
+* a coding assistant
+* a general-purpose chatbot
+
+Its main goals are:
+
+* realistic Vietnamese customer-support workflow
+* toxic customer handling
+* de-escalation
+* policy-aware support
+* safer e-commerce interactions
+
+---
+
+# 🔥 Notes
+
+The project is actively maintained and continuously improved through:
+
+* dataset cleaning
+* synthetic data generation
+* DPO refinement
+* benchmark evaluation
+* safety-focused iteration
+
+New releases will focus on:
+
+* reducing hallucination
+* improving real-world workflows
+* strengthening multi-turn capability
+* improving stability when used with RAG/tool systems
+* making toxic e-commerce support more realistic
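
The README describes the model as RAG- and tool-friendly and asks it not to invent order statuses, but it does not show how tool output reaches the prompt. Below is a minimal sketch of one possible approach, not the author's pipeline. `build_messages`, `SYSTEM_PROMPT`, and the fields of the `order` dict are hypothetical names, and the Vietnamese strings are illustrative wording modeled on the system prompt in the usage example.

```python
from typing import Optional

# Hypothetical system prompt; first line copied from the README's usage example.
SYSTEM_PROMPT = "Bạn là EmpathAI, chuyên viên CSKH e-commerce tiếng Việt."


def build_messages(user_text: str, order: Optional[dict] = None) -> list:
    """Build a chat message list, injecting order data only when a lookup returned it."""
    if order is None:
        # No tool result: tell the model to ask for the order code, not to guess.
        context = "Chưa có dữ liệu đơn hàng. Hãy hỏi mã đơn, không tự suy đoán trạng thái."
    else:
        # Tool result available: list the facts and restrict the model to them.
        facts = "\n".join(f"- {key}: {value}" for key, value in order.items())
        context = "Dữ liệu đơn hàng từ hệ thống (chỉ dùng các thông tin này):\n" + facts
    return [
        {"role": "system", "content": SYSTEM_PROMPT + "\n\n" + context},
        {"role": "user", "content": user_text},
    ]


if __name__ == "__main__":
    # Assumed shape of an order-lookup result; real systems will differ.
    demo_order = {"order_id": "DH123", "status": "in_transit"}
    for message in build_messages("Đơn tôi giao trễ 5 ngày rồi đấy.", demo_order):
        print(message["role"], "->", message["content"])
```

The returned list could be passed to `tokenizer.apply_chat_template` in place of the hand-written `messages` from the usage example. Keeping the "no data" branch explicit is a design choice in this sketch: it gives the model a concrete instruction to fall back on instead of leaving the context empty.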