Initialize project; model provided by the ModelHub XC community

Model: thanhhoangnvbg/empathAI-llama3.1-8b Source: Original Platform
2026-05-31 20:45:21 +08:00
commit 332f1c539e
9 changed files with 545 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,36 @@
 *.7z filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.bz2 filter=lfs diff=lfs merge=lfs -text
 *.ckpt filter=lfs diff=lfs merge=lfs -text
 *.ftz filter=lfs diff=lfs merge=lfs -text
 *.gz filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.joblib filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.model filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 *.npy filter=lfs diff=lfs merge=lfs -text
 *.npz filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.parquet filter=lfs diff=lfs merge=lfs -text
 *.pb filter=lfs diff=lfs merge=lfs -text
 *.pickle filter=lfs diff=lfs merge=lfs -text
 *.pkl filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
 *.rar filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.xz filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,394 @@
 ---
 license: mit
 language:
  - vi
 pipeline_tag: text-generation
 library_name: transformers
 base_model: "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit"
 datasets:
  - "thanhhoangnvbg/empathAI-dpo-vi"
 tags:
  - transformers
  - unsloth
  - llama
  - llama-3
  - gguf
  - "llama.cpp"
  - ollama
  - "lm-studio"
  - koboldcpp
  - rag
  - rag-ready
  - trl
  - sft
  - dpo
  - empathy
  - customer-service
  - vietnamese
  - vi
  - chatbot
  - emotional-support
  - customer-support
  - ecommerce
  - toxic-customer-support
 ---
 # 🧠 EmpathAI - Llama 3.1 8B
 ### Vietnamese Toxic E-commerce Customer Support Model
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 **EmpathAI** is a Vietnamese LLM fine-tuned specifically for **e-commerce customer support**, with a focus on difficult situations such as:
 * toxic / angry customers
 * late deliveries
 * missing / wrong items
 * defective or damaged products
 * refunds / returns and exchanges
 * payment / COD issues
 * escalation and de-escalation
 * handling cases according to real-world policies and workflows
 EmpathAI aims to:
 * reduce hallucination in customer support
 * handle toxic customers more naturally
 * work well with RAG/tool systems
 * make Vietnamese e-commerce workflows more realistic
 ---
 # 📌 Current status
 EmpathAI v2 is currently under development and evaluation.
 Version v2 focuses on improving:
 * multi-turn workflow
 * order-code handling
 * payment/COD edge cases
 * policy/context grounding
 * privacy & PII safety
 * tool-aware customer support
 * fewer hallucinated refund/order-status responses
 The current v1 release remains unchanged and available.
 ---
 # 🌟 Highlights
 ## 💬 Emotional Intelligence
 EmpathAI is trained to:
 * calm down angry customers
 * avoid unnecessary arguments
 * keep a natural tone that does not sound robotic
 * give a clear next step for resolving the issue
 ---
 ## 🧩 RAG & Tool-Friendly
 The model is designed to work well with:
 * RAG pipelines
 * order lookup systems
 * internal customer-support tools
 EmpathAI is trained to:
 * ask for the order code when it is needed
 * request additional information when it is needed
 * recognize when there is not enough data to reach a conclusion
 * avoid making up an order status
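One way an application could support this behavior is to pass the result of an order lookup into the system prompt, and to state explicitly when no data is available. The sketch below is only an illustration: `build_messages`, the `order_code` and `status` fields, and the context wording are hypothetical and not part of this repository. The rule lines are taken from the usage example's system prompt.

```python
from typing import Optional

# Rules borrowed from the usage example's system prompt (kept in Vietnamese):
# "Do not invent order status. If data is missing, ask for more information."
BASE_RULES = (
    "Bạn là EmpathAI, chuyên viên CSKH e-commerce tiếng Việt.\n"
    "- Không tự bịa trạng thái đơn hàng.\n"
    "- Nếu thiếu dữ liệu, hãy yêu cầu thêm thông tin."
)


def build_messages(user_text: str, order: Optional[dict] = None) -> list[dict]:
    """Build chat messages, grounding the model in looked-up order data if any."""
    if order is None:
        # No lookup result: say so, and steer the model to ask for the order code.
        context = "Chưa có dữ liệu đơn hàng. Hãy hỏi khách mã đơn."
    else:
        # Hypothetical record fields; adapt them to the real order system.
        context = (
            f"Dữ liệu đơn hàng (từ hệ thống): mã {order['order_code']}, "
            f"trạng thái {order['status']}."
        )
    return [
        {"role": "system", "content": f"{BASE_RULES}\n\n{context}"},
        {"role": "user", "content": user_text},
    ]


messages = build_messages(
    "Đơn tôi giao trễ 5 ngày rồi đấy.",
    {"order_code": "DH123", "status": "đang giao"},
)
```

The returned list has the same shape as the `messages` list used with `tokenizer.apply_chat_template` in the usage section.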
 ---
 ## 🛡️ Safety & Grounding Focus
 EmpathAI v2 focuses on reducing:
 * hallucinated order status
 * unsupported refund promises
 * unprompted offers of vouchers/compensation
 * unnecessary requests for PII
 * leaks of other customers' information
 ---
 # 📊 Dataset overview
 ## Dataset v1
 The original dataset focuses mainly on:
 * toxic customer complaints
 * delayed delivery
 * refund/compensation
 * damaged/missing products
 * Vietnamese empathy/de-escalation
 ### Limitations of v1
 * mostly single-turn
 * few tool-aware workflows
 * few payment/COD scenarios
 * few privacy/security cases
 * some older DPO pairs still tend to overpromise
 ---
 ## Dataset v2 pipeline
 The v2 dataset pipeline currently includes:
 * cleaned old SFT pool
 * re-judged DPO preference pairs
 * synthetic toxic e-commerce conversations
 * multi-turn workflow generation
 * benchmark-oriented evaluation data
 ### Dataset targets
 * ~10k SFT samples
 * ~6k DPO pairs
 * a separate benchmark eval set
 ---
 # 🏋️ Training pipeline
 EmpathAI v2 uses a two-stage training pipeline:
 ## Stage 1 — SFT
 Supervised fine-tuning on:
 * cleaned chosen responses from the old dataset
 * new synthetic customer-support conversations
 ---
 ## Stage 2 — DPO
 Direct Preference Optimization on:
 * clear-cut chosen/rejected pairs
 * DPO samples that have been safety-filtered and re-judged
 Ambiguous preference pairs are dropped to make DPO training more stable.
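Dropping ambiguous pairs can be pictured as a margin filter over judge scores. The following is a minimal sketch under assumptions: the `chosen_score` and `rejected_score` fields, the score scale, and the `min_margin` threshold are hypothetical, and the project's actual re-judging criteria are not described here.

```python
def filter_pairs(pairs: list[dict], min_margin: float = 1.0) -> list[dict]:
    """Keep pairs whose chosen response beats the rejected one by a clear margin."""
    kept = []
    for pair in pairs:
        # Hypothetical judge scores; a small gap means the preference is ambiguous.
        margin = pair["chosen_score"] - pair["rejected_score"]
        if margin >= min_margin:
            kept.append(pair)
    return kept


pairs = [
    {"prompt": "p1", "chosen_score": 9.0, "rejected_score": 3.0},  # clear winner
    {"prompt": "p2", "chosen_score": 6.0, "rejected_score": 5.5},  # ambiguous
]
clear_pairs = filter_pairs(pairs)
```

With these sample scores only `p1` survives, because the gap for `p2` is below the threshold.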
 ---
 # 📈 Benchmark (in development)
 A dedicated benchmark for Vietnamese toxic e-commerce customer support is currently being built.
 ## Evaluation categories
 * hallucinated order-status rate
 * hallucinated refund/compensation rate
 * multi-turn state tracking
 * policy/context grounding
 * payment/COD realism
 * privacy & PII safety
 * toxic customer handling quality
 * escalation/de-escalation quality
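As an illustration of how a rate such as the hallucinated order-status rate might be computed, the sketch below counts responses that assert a delivery status even though no order data was supplied. It is a keyword heuristic under stated assumptions, not the benchmark's actual scoring method: the `has_order_context` and `response` fields and the `STATUS_CLAIMS` list are hypothetical.

```python
import unicodedata

# Illustrative Vietnamese phrases that assert a status:
# "delivered", "out for delivery", "cancelled".
STATUS_CLAIMS = ("đã giao", "đang giao", "đã hủy")


def _norm(text: str) -> str:
    # Normalize Unicode so composed and decomposed diacritics compare equal.
    return unicodedata.normalize("NFC", text).lower()


def hallucinated_status_rate(records: list[dict]) -> float:
    """Share of responses claiming a status when no order data was provided."""
    ungrounded = [r for r in records if not r["has_order_context"]]
    if not ungrounded:
        return 0.0
    flagged = sum(
        any(_norm(claim) in _norm(r["response"]) for claim in STATUS_CLAIMS)
        for r in ungrounded
    )
    return flagged / len(ungrounded)


records = [
    {"has_order_context": False, "response": "Đơn của anh đang giao, mai sẽ tới."},
    {"has_order_context": False, "response": "Anh cho em xin mã đơn để kiểm tra nhé."},
    {"has_order_context": True, "response": "Đơn DH123 đã giao hôm qua."},
]
rate = hallucinated_status_rate(records)
```

Only the two records without order context enter the denominator; the grounded third record is ignored even though it mentions a status.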
 ## Models planned for benchmarking
 * EmpathAI v1
 * EmpathAI v2
 * Llama 3.1 8B Instruct
 * Qwen Instruct
 * Gemini Flash-class models
 ---
 # 📊 Technical specifications
 | Component                   | Details                  |
 | :-------------------------- | :----------------------- |
 | **Base model**              | `Llama-3.1-8B-Instruct`  |
 | **Fine-tuning method**      | QLoRA / DPO              |
 | **Training infrastructure** | Google Cloud Vertex AI   |
 | **GPUs used**               | NVIDIA L4 / RTX PRO 6000 |
 | **Training pipeline**       | SFT + DPO                |
 | **Optimization**            | Unsloth                  |
 ---
 # 🌿 Branches
 | Branch    | Description                                                    |
 | --------- | -------------------------------------------------------------- |
 | `main`    | latest inference-ready 4-bit build (default stable release)    |
 | `v1-bf16` | full-quality BF16 weights of EmpathAI v1                       |
 | `v1-4bit` | 4-bit version of EmpathAI v1                                   |
 | `v1-gguf` | GGUF export of EmpathAI v1 for llama.cpp / LM Studio / Ollama  |
 | `v2-bf16` | full-quality BF16 weights of EmpathAI v2                       |
 | `v2-4bit` | 4-bit version of EmpathAI v2                                   |
 | `v2-gguf` | GGUF export of EmpathAI v2 for llama.cpp / LM Studio / Ollama  |
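A specific branch can be selected through the `revision` argument of `from_pretrained`. The sketch below assumes the branch names in the table exist on the Hub (the v2 branches may not be published yet) and that 4-bit branches need `bitsandbytes` installed; `load_branch` is a hypothetical helper, and GGUF branches are left out because they target llama.cpp-style runtimes.

```python
MODEL_ID = "thanhhoangnvbg/empathAI-llama3.1-8b"
BRANCHES = {
    "main", "v1-bf16", "v1-4bit", "v1-gguf", "v2-bf16", "v2-4bit", "v2-gguf",
}


def load_branch(branch: str = "main"):
    """Load tokenizer and model from one branch of the repository."""
    if branch not in BRANCHES:
        raise ValueError(f"Unknown branch: {branch!r}")
    if branch.endswith("-gguf"):
        # GGUF files are not handled by this sketch.
        raise ValueError("GGUF branches are not covered by this helper")
    # Imported lazily so the checks above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, revision=branch)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, revision=branch, device_map="auto"
    )
    return tokenizer, model
```

For example, `load_branch("v1-bf16")` would download and load the BF16 weights of v1.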
 ---
 # 🚀 Usage
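 ```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
 model_id = "thanhhoangnvbg/empathAI-llama3.1-8b"
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = AutoModelForCausalLM.from_pretrained(
     model_id,
     torch_dtype=torch.bfloat16,
     device_map="auto"
 )
 # Prompts stay in Vietnamese because the model is tuned for Vietnamese.
 # System prompt: "You are EmpathAI, a Vietnamese e-commerce support agent.
 # Never invent order status, never promise refunds/vouchers without grounds,
 # ask for more information when data is missing, stay calm and professional."
 messages = [
     {
         "role": "system",
         "content": """Bạn là EmpathAI, chuyên viên CSKH e-commerce tiếng Việt.
 Nguyên tắc:
 - Không tự bịa trạng thái đơn hàng.
 - Không tự hứa hoàn tiền/voucher khi chưa có căn cứ.
 - Nếu thiếu dữ liệu, hãy yêu cầu thêm thông tin.
 - Giữ giọng điệu bình tĩnh và chuyên nghiệp."""
     },
     {
         "role": "user",
         # "My order is already 5 days late."
         "content": "Đơn tôi giao trễ 5 ngày rồi đấy."
     }
 ]
 # return_dict=True returns input_ids together with the attention mask.
 inputs = tokenizer.apply_chat_template(
     messages,
     add_generation_prompt=True,
     return_dict=True,
     return_tensors="pt"
 ).to(model.device)  # follow device_map instead of hard-coding "cuda"
 outputs = model.generate(
     **inputs,
     max_new_tokens=128,
     do_sample=True,  # temperature only takes effect when sampling
     temperature=0.5
 )
 # Decode only the newly generated tokens, not the prompt.
 prompt_len = inputs["input_ids"].shape[-1]
 print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))
 ```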
 ---
 # 🦙 GGUF / Local Inference
 EmpathAI provides GGUF exports for local inference with:
 * Ollama
 * llama.cpp
 * LM Studio
 * KoboldCpp
 * OpenWebUI
 ## Available Quantizations
 | File          | Recommended Use                        |
 | :------------ | :------------------------------------- |
 | `Q4_K_M.gguf` | Good balance between quality and speed |
 | `Q5_K_M.gguf` | Higher quality, uses more VRAM/RAM     |
 ---
 # 🚀 Running with Ollama
 Create a `Modelfile`:
 ```text
 FROM ./empathAI-llama3.1-8b.Q4_K_M.gguf
 TEMPLATE """{{ .Prompt }}"""
 PARAMETER temperature 0.5
 PARAMETER num_ctx 4096
 ```
 Build model:
 ```bash
 ollama create empathai -f Modelfile
 ```
 Run:
 ```bash
 ollama run empathai
 ```
 ---
 # 🚀 Running with llama.cpp
 ```bash
 ./llama-cli \
 --model empathAI-llama3.1-8b.Q4_K_M.gguf \
 -p "Xin chào"
 ```
 ---
 # 💻 Recommended Hardware
 | Quant  | Recommended RAM / VRAM |
 | :----- | :--------------------- |
 | Q4_K_M | ~8GB+                  |
 | Q5_K_M | ~10GB+                 |
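Figures like these can be sanity-checked with a back-of-envelope calculation: count the parameters implied by the repository's `config.json`, then multiply by the bits stored per weight. The sketch below covers weights only; KV cache and runtime overhead come on top, which is one reason the recommendations above are higher than the raw weight size. The bits-per-weight values mentioned in the comments for the K-quants are rough assumptions, not measured numbers.

```python
# Values copied from this repository's config.json.
HIDDEN = 4096
INTERMEDIATE = 14336
LAYERS = 32
HEADS = 32
KV_HEADS = 8
HEAD_DIM = 128
VOCAB = 128256


def count_parameters() -> int:
    """Parameter count of the Llama architecture described by config.json."""
    embed = VOCAB * HIDDEN                      # input embeddings
    lm_head = VOCAB * HIDDEN                    # untied output projection
    attn = 2 * HIDDEN * HEADS * HEAD_DIM        # q_proj and o_proj
    attn += 2 * HIDDEN * KV_HEADS * HEAD_DIM    # k_proj and v_proj (grouped-query)
    mlp = 3 * HIDDEN * INTERMEDIATE             # gate, up and down projections
    norms = 2 * HIDDEN                          # two RMSNorm weights per layer
    final_norm = HIDDEN
    return embed + lm_head + LAYERS * (attn + mlp + norms) + final_norm


def weight_gib(bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB."""
    return count_parameters() * bits_per_weight / 8 / 2**30


params = count_parameters()
bf16_bytes = params * 2  # 16 bits per weight
# Assumption: Q4_K_M and Q5_K_M average very roughly 5 and 6 bits per weight.
q4_estimate = weight_gib(5.0)
q5_estimate = weight_gib(6.0)
```

The BF16 byte count from this calculation should land very close to the size of the BF16 `model.safetensors` checkpoint, with the small remainder being the file header.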
 ---
 # 🎯 Project goals
 EmpathAI is not aimed at:
 * general-purpose reasoning
 * coding assistance
 * general-purpose chat
 Its main goals are:
 * realistic Vietnamese customer-support workflow
 * toxic customer handling
 * de-escalation
 * policy-aware support
 * safer e-commerce interactions
 ---
 # 🔥 Notes
 The project is actively maintained and continuously improved through:
 * dataset cleaning
 * synthetic data generation
 * DPO refinement
 * benchmark evaluation
 * safety-focused iteration
 Upcoming releases will focus on:
 * reducing hallucination
 * improving real-world workflows
 * strengthening multi-turn capability
 * improving stability when used with RAG/tool systems
 * making toxic e-commerce support more realistic
--- a/chat_template.jinja
+++ b/chat_template.jinja
@@ -0,0 +1,31 @@
 {% if messages[0]['role'] == 'system' %}
    {% set system_message = messages[0]['content'] | trim %}
    {% set messages = messages[1:] %}
 {% else %}
    {% set system_message = "Bạn là EmpathAI, trợ lý CSKH tiếng Việt giàu sự thấu cảm, chuyên xử lý khiếu nại và hỗ trợ khách hàng." %}
 {% endif %}
 <|start_header_id|>system<|end_header_id|>
 {{ system_message }}<|eot_id|>
 {% for message in messages %}
    {% if message['role'] == 'user' %}
 <|start_header_id|>user<|end_header_id|>
 {{ message['content'] | trim }}<|eot_id|>
    {% elif message['role'] == 'assistant' %}
 <|start_header_id|>assistant<|end_header_id|>
 {{ message['content'] | trim }}<|eot_id|>
    {% endif %}
 {% endfor %}
 {% if add_generation_prompt %}
 <|start_header_id|>assistant<|end_header_id|>
 {% endif %}
--- a/config.json
+++ b/config.json
@@ -0,0 +1,34 @@
 {
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 128000,
  "eos_token_id": 128009,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 131072,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "pad_token_id": 128004,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_theta": 500000.0,
  "rope_scaling": {
    "factor": 8.0,
    "high_freq_factor": 4.0,
    "low_freq_factor": 1.0,
    "original_max_position_embeddings": 8192,
    "rope_type": "llama3"
  },
  "tie_word_embeddings": false,
  "use_cache": true,
  "vocab_size": 128256
 }
--- a/generation_config.json
+++ b/generation_config.json
@@ -0,0 +1,14 @@
 {
  "bos_token_id": 128000,
  "do_sample": true,
  "eos_token_id": [
    128001,
    128008,
    128009
  ],
  "max_length": 131072,
  "pad_token_id": 128004,
  "temperature": 0.6,
  "top_p": 0.9,
  "transformers_version": "5.7.0"
 }
--- a/model.safetensors
+++ b/model.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:76b0f4b8e797d63e4284320b10166f8dd65db8ae215dee61583540e76c0c4773
 size 16060556616
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1,6 @@
 {
  "bos_token": "<|begin_of_text|>",
  "eos_token": "<|eot_id|>",
  "pad_token": "<|eot_id|>",
  "unk_token": "<|unk|>"
 }
--- a/tokenizer.json
+++ b/tokenizer.json
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1,24 @@
 {
  "bos_token": "<|begin_of_text|>",
  "eos_token": "<|eot_id|>",
  "pad_token": "<|eot_id|>",
  "unk_token": "<|unk|>",
  "padding_side": "right",
  "model_max_length": 8192,
  "model_input_names": [
    "input_ids",
    "attention_mask"
  ],
  "clean_up_tokenization_spaces": true,
  "tokenizer_class": "PreTrainedTokenizerFast",
  "add_bos_token": true,
  "add_eos_token": false,
  "chat_template": "{% if messages[0]['role'] == 'system' %}\n    {% set system_message = messages[0]['content'] | trim %}\n    {% set messages = messages[1:] %}\n{% else %}\n    {% set system_message = 'Bạn là EmpathAI, trợ lý CSKH tiếng Việt giàu sự thấu cảm, chuyên xử lý khiếu nại và hỗ trợ khách hàng.' %}\n{% endif %}\n\n<|start_header_id|>system<|end_header_id|>\n\n{{ system_message }}<|eot_id|>\n\n{% for message in messages %}\n\n    {% if message['role'] == 'user' %}\n<|start_header_id|>user<|end_header_id|>\n\n{{ message['content'] | trim }}<|eot_id|>\n\n    {% elif message['role'] == 'assistant' %}\n<|start_header_id|>assistant<|end_header_id|>\n\n{{ message['content'] | trim }}<|eot_id|>\n\n    {% endif %}\n\n{% endfor %}\n\n{% if add_generation_prompt %}\n<|start_header_id|>assistant<|end_header_id|>\n\n{% endif %}"
 }