---
language:
- vi
- en
pipeline_tag: text-generation
tags:
- qwen
- qwen2.5
- slm
- RAG
- travel
- vietnamese
- unsloth
- anti-hallucination
base_model: Qwen/Qwen2.5-3B-Instruct
---
# qwen2.5-3b-vivu-travel-vn
## Overview
`qwen2.5-3b-vivu-travel-vn` is a 3B-parameter Small Language Model (SLM) fine-tuned for the **Vietnamese Tourism Domain**. Built on `Qwen2.5-3B-Instruct` using Unsloth (PEFT/LoRA), it acts as **ViVu**, an intelligent travel assistant optimized for **Advanced RAG** pipelines.
### Key Features
* **Strict Anti-Hallucination:** Zero tolerance for fabrication; answers are strictly grounded in the retrieved context, and out-of-scope queries are politely declined.
* **RAG-Optimized:** Synthesizes retrieved Vector DB chunks into clean, structured Vietnamese (Markdown supported).
* **Resource Efficient:** Deployable on consumer-grade GPUs (e.g., RTX 3060, T4) with a low VRAM footprint; see the loading sketch below.
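
To exercise the low-VRAM claim, the model can be loaded in 4-bit. A minimal sketch, assuming the `bitsandbytes` backend is installed; the quantization settings below are illustrative, not an official recipe:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "thanhdo881/qwen2.5-3b-vivu-travel-vn"

# Illustrative 4-bit NF4 config; adjust to your hardware (assumption, not part of the release).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```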
## Model Details
* **Base Model:** Qwen/Qwen2.5-3B-Instruct
* **Architecture:** Causal LM, 32k context length.
* **Training Method:** LoRA instruction tuning via Unsloth (see the sketch after this list).
* **Language:** Vietnamese, English.
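
For reference, a minimal sketch of the kind of Unsloth LoRA setup described above. The rank, alpha, target modules, and sequence length are assumptions for illustration, not the actual training configuration:

```python
from unsloth import FastLanguageModel

# Load the base model through Unsloth (4-bit loading here is an assumption for illustration).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-3B-Instruct",
    max_seq_length=4096,   # training cutoff; the base model supports up to 32k context
    load_in_4bit=True,
)

# Attach LoRA adapters; rank/alpha/target modules are illustrative, not the released config.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
# The wrapped model can then be instruction-tuned with a standard SFT training loop.
```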
## Quickstart
```bash
pip install transformers vllm accelerate
```

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "thanhdo881/qwen2.5-3b-vivu-travel-vn"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# 1. Prepare RAG context & query
# context: "Da Lat lies on the Lam Vien plateau, famous for its temperate climate and Xuan Huong Lake."
context = "Đà Lạt nằm trên cao nguyên Lâm Viên, nổi tiếng với khí hậu ôn đới và Hồ Xuân Hương."
# question: "What are Da Lat's notable features?"
question = "Đà Lạt có những đặc điểm gì nổi bật?"
# prompt: "Based on the following information: ... Answer the question: ..."
prompt = f"Dựa vào thông tin sau:\n{context}\n\nHãy trả lời câu hỏi: {question}"

# 2. Build chat messages (the system prompt pins the model to the provided context:
#    "You are ViVu, a Vietnam travel assistant. Only answer based on the provided context.")
messages = [
    {"role": "system", "content": "Bạn là ViVu, trợ lý du lịch Việt Nam. Chỉ trả lời dựa trên ngữ cảnh được cung cấp."},
    {"role": "user", "content": prompt},
]

# 3. Generate
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,          # required for temperature to take effect
    temperature=0.3,
    repetition_penalty=1.1,
)
response = tokenizer.batch_decode(outputs[:, inputs.input_ids.shape[1]:], skip_special_tokens=True)[0]
print(response)
```
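
The Quickstart installs `vllm` but only uses `transformers`. For throughput-oriented inference, a hedged sketch of the offline vLLM API follows; the sampling settings mirror the Quickstart and the engine arguments are assumptions, not an official serving recipe:

```python
from vllm import LLM, SamplingParams

# Offline batched generation with vLLM; engine arguments are left at defaults here (assumption).
llm = LLM(model="thanhdo881/qwen2.5-3b-vivu-travel-vn")
params = SamplingParams(temperature=0.3, repetition_penalty=1.1, max_tokens=512)

# Same RAG-grounded conversation as the Quickstart above.
messages = [
    {"role": "system", "content": "Bạn là ViVu, trợ lý du lịch Việt Nam. Chỉ trả lời dựa trên ngữ cảnh được cung cấp."},
    {"role": "user", "content": "Dựa vào thông tin sau:\nĐà Lạt nằm trên cao nguyên Lâm Viên, nổi tiếng với khí hậu ôn đới và Hồ Xuân Hương.\n\nHãy trả lời câu hỏi: Đà Lạt có những đặc điểm gì nổi bật?"},
]

# LLM.chat applies the model's chat template before generating (requires a recent vLLM release).
outputs = llm.chat(messages, params)
print(outputs[0].outputs[0].text)
```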