---
language:
- zh
- en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- tourism
- cultural-heritage
- review-analysis
- qwen2
- lora
- chinese
base_model: Qwen/Qwen2.5-7B-Instruct
model_name: TourismReview-Qwen2.5-7B
---

# TourismReview-Qwen2.5-7B

## Introduction

TourismReview-Qwen2.5-7B is a domain-adapted large language model for tourism research. It is fine-tuned from **Qwen2.5-7B-Instruct** for content analysis and multi-dimensional scoring of tourism reviews.

It is designed for research scenarios such as cultural heritage tourism, visitor perception analysis, tourism experience evaluation, and user-generated content mining. The model can transform tourism reviews into structured multi-dimensional rating outputs in a consistent format.

The released repository name is:

**`dadaguai6677/TourismReview-Qwen2.5-7B`**

---

# Model Details

- **Model Name**: TourismReview-Qwen2.5-7B
- **Base Model**: Qwen/Qwen2.5-7B-Instruct
- **Architecture**: Qwen2ForCausalLM
- **Task Type**: Text generation / Review analysis / Multi-dimensional scoring
- **Primary Language**: Chinese, with English documentation support
- **Domain**: Tourism review analysis, cultural heritage tourism research, visitor perception evaluation

---

# Intended Use

This model is intended for:

- tourism review content analysis
- visitor perceived value assessment
- cultural heritage tourism experience evaluation
- structured multi-dimensional scoring
- assisted coding and large-scale text processing for tourism research
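
For large-scale processing, per-review scores usually need to be aggregated. The following is a minimal sketch, assuming each review has already been scored into a dict that maps a dimension name to an integer score or `None` (dimension not mentioned). The helper name `mean_scores` and the sample rows are hypothetical and hand-written, not model output.

```python
from statistics import mean

def mean_scores(rows):
    """Average each dimension over reviews, skipping None (not mentioned)."""
    collected = {}
    for row in rows:
        for name, score in row.items():
            if score is not None:
                collected.setdefault(name, []).append(score)
    # Dimensions that no review mentions are left out of the result.
    return {name: mean(values) for name, values in collected.items()}

# Hypothetical, hand-written scores for two reviews.
rows = [
    {"环境整洁": 5, "餐饮良好": 3, "交通便捷": None},
    {"环境整洁": 4, "餐饮良好": None, "交通便捷": None},
]
print(mean_scores(rows))
```

Skipping `None` instead of treating it as zero keeps unmentioned dimensions from pulling the averages down.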

---

# How to Use

## Important Note (Please Read)

To reproduce the behavior used in this research as closely as possible, please keep the original inference setup unchanged, including:

1. the same system prompt
2. the same user prompt structure
3. the same order of the 11 evaluation dimensions
4. the exact same output format
5. the same inference parameters:
   - `max_new_tokens=128`
   - `do_sample=False`
   - `num_beams=1`

This model was not primarily designed for open-ended chatting. It was used for **structured tourism review analysis**. Changing the prompt wording or the dimension order may lead to outputs that differ from the results reported in the research.
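
Because the output format is fixed, a reply can be turned into a Python dict for downstream analysis. The following is a minimal sketch, assuming the reply consists of comma-separated `dimension:score` pairs as requested by the prompt template in the example code. The helper `parse_scores` is hypothetical, and the sample string is hand-written rather than actual model output.

```python
import re

# Dimension names, in the order fixed by the prompt template.
DIMENSIONS = [
    "放松惬意", "乐趣满足", "餐饮良好", "购物丰富", "娱乐活动多", "交通便捷",
    "服务友好", "环境整洁", "学习文化", "体验工艺", "家庭友好",
]

def parse_scores(response):
    """Parse 'name:score,...' into {name: int or None}.

    Missing, 'null', or out-of-range values are returned as None.
    """
    scores = {name: None for name in DIMENSIONS}
    # Accept ASCII and full-width separators, since generated text may vary.
    for part in re.split(r"[,，]", response):
        name, sep, value = part.replace("：", ":").partition(":")
        name, value = name.strip(), value.strip()
        if sep and name in scores and value in {"1", "2", "3", "4", "5"}:
            scores[name] = int(value)
    return scores

# Hand-written example string in the expected format.
example = (
    "放松惬意:4,乐趣满足:null,餐饮良好:3,购物丰富:null,娱乐活动多:null,"
    "交通便捷:null,服务友好:4,环境整洁:5,学习文化:5,体验工艺:null,家庭友好:4"
)
print(parse_scores(example))
```

Mapping anything outside 1 to 5 to `None` is a design choice of this sketch; stricter pipelines may prefer to raise an error on malformed replies.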

---

## Example Code

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "dadaguai6677/TourismReview-Qwen2.5-7B"

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True,
    padding_side="left"
)

if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
    low_cpu_mem_usage=True
)

model.eval()

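def create_prompt(text):
    """Build the ChatML prompt for one review.

    The Chinese instructions are kept verbatim because the wording and the
    order of the 11 dimensions must stay unchanged. They ask the model to rate
    the review on each dimension from 1 (strongly disagree) to 5 (strongly
    agree), or null if not mentioned, and to return name:score pairs.
    """
    system_msg = "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."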
    prompt = f"""<|im_start|>system
{system_msg}<|im_end|>
<|im_start|>user
请对以下旅游评论进行内容分析，并基于以下11个维度进行打分。每个维度评分等级为1-5分，如未提及则返回null。

评分标准：
1分：完全不同意
2分：不同意
3分：一般
4分：同意
5分：完全同意
null：未提及

评价维度：
放松惬意,乐趣满足,餐饮良好,购物丰富,娱乐活动多,交通便捷,服务友好,环境整洁,学习文化,体验工艺,家庭友好

请严格按以下格式返回结果：
放松惬意:分数,乐趣满足:分数,餐饮良好:分数,购物丰富:分数,娱乐活动多:分数,交通便捷:分数,服务友好:分数,环境整洁:分数,学习文化:分数,体验工艺:分数,家庭友好:分数

待分析文本：
{text}<|im_end|>
<|im_start|>assistant
"""
    return prompt

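# Sample review: "The scenic area's environment is very good and the guided
# explanations are fairly detailed; children can learn a lot about history and
# culture, but the nearby food is only average."
text = "景区环境很好，讲解也比较细致，孩子能学到很多历史文化知识，就是周边餐饮一般。"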

prompt = create_prompt(text)

inputs = tokenizer(
    prompt,
    return_tensors="pt",
    truncation=True,
    max_length=1024,
    padding=True
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=False,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
        num_beams=1
    )

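# Decode only the newly generated tokens. With skip_special_tokens=True the
# "<|im_start|>" marker is stripped, so splitting the full text on it would
# never match and the prompt would be left in the response.
prompt_length = inputs["input_ids"].shape[1]
response = tokenizer.decode(
    outputs[0][prompt_length:],
    skip_special_tokens=True
).strip()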

print(response)