Initialize project; model provided by the ModelHub XC community
Model: dadaguai6677/TourismReview-Qwen2.5-7B Source: Original Platform
---
language:
- zh
- en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- tourism
- cultural-heritage
- review-analysis
- qwen2
- lora
- chinese
base_model: Qwen/Qwen2.5-7B-Instruct
model_name: TourismReview-Qwen2.5-7B
---

# TourismReview-Qwen2.5-7B

## Introduction

TourismReview-Qwen2.5-7B is a domain-adapted large language model for tourism research. It is fine-tuned from **Qwen2.5-7B-Instruct**, mainly for content analysis and multi-dimensional scoring of tourism review text.

It targets research scenarios such as cultural heritage tourism, tourism experience evaluation, visitor perception analysis, and user-generated content (UGC) mining. Given a visitor review, the model parses it and returns multi-dimensional ratings in a consistent, structured format.

The released repository name is:

**`dadaguai6677/TourismReview-Qwen2.5-7B`**

---

# Model Details

- **Model Name**: TourismReview-Qwen2.5-7B
- **Base Model**: Qwen/Qwen2.5-7B-Instruct
- **Architecture**: Qwen2ForCausalLM
- **Task Type**: Text generation / Review analysis / Multi-dimensional scoring
- **Primary Language**: Chinese, with English documentation support
- **Domain**: Tourism review analysis, cultural heritage tourism research, visitor perception evaluation

---

# Intended Use

This model is intended for:

- tourism review content analysis
- visitor perceived value identification
- cultural heritage tourism experience evaluation
- structured multi-dimensional scoring of text
- assisted coding and large-scale text processing for tourism research

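For large-scale processing, per-review scores are typically aggregated per dimension. The following is a minimal sketch under stated assumptions, not the method used in the research: it assumes each review has already been converted into a dict mapping a dimension name to an integer from 1 to 5, or to `None` when the dimension is not mentioned. The helper `mean_by_dimension` and the sample records are hypothetical, and averaging only over the reviews that mention a dimension is one possible design choice among several.

```python
from statistics import mean


def mean_by_dimension(records):
    """Average each dimension over the reviews that mention it (None = not mentioned)."""
    collected = {}
    for record in records:
        for dimension, score in record.items():
            if score is not None:
                collected.setdefault(dimension, []).append(score)
    return {dimension: mean(values) for dimension, values in collected.items()}


# Hand-written sample records for illustration only; not real model output.
records = [
    {"环境整洁": 5, "餐饮良好": 3, "学习文化": None},
    {"环境整洁": 4, "餐饮良好": None, "学习文化": 5},
]

summary = mean_by_dimension(records)
print(summary)
```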
---

# How to Use

## Important Note (Please Read)

To reproduce the behavior reported in the research as closely as possible, keep the original inference setup unchanged:

1. the same system prompt
2. the same user prompt structure
3. the same order of the 11 evaluation dimensions
4. exactly the same output format
5. the same inference parameters:
   - `max_new_tokens=128`
   - `do_sample=False`
   - `num_beams=1`

This model was not designed for open-ended chat; in the research it was used for **structured tourism review analysis**. Changing the prompt wording or the dimension order may produce outputs that differ from the reported results.

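Because the prompt requests one fixed line of `dimension:score` pairs, a response can be parsed mechanically. The following is a minimal sketch, not part of the original research code. The helper name `parse_scores` and the example response are hypothetical. The dimension names are copied from the prompt template in this README, and the sketch assumes the model returns exactly the requested comma-separated format, with `null` for dimensions that are not mentioned. Accepting both ASCII and full-width separators is an added assumption about possible model output. Pairs that do not match the expected pattern are ignored and left as `None`.

```python
import re

# Dimension names in the order fixed by the prompt template.
DIMENSIONS = [
    "放松惬意", "乐趣满足", "餐饮良好", "购物丰富", "娱乐活动多", "交通便捷",
    "服务友好", "环境整洁", "学习文化", "体验工艺", "家庭友好",
]


def parse_scores(response):
    """Parse 'dimension:score' pairs into a dict; null or missing scores map to None."""
    scores = {name: None for name in DIMENSIONS}
    # Assumption: separators may be ASCII or full-width commas and colons.
    for part in re.split(r"[,,]", response.strip()):
        match = re.fullmatch(r"\s*(.+?)\s*[::]\s*([1-5]|null)\s*", part)
        if match and match.group(1) in scores and match.group(2) != "null":
            scores[match.group(1)] = int(match.group(2))
    return scores


# Hypothetical response text written by hand for illustration (not real model output).
example = (
    "放松惬意:4,乐趣满足:null,餐饮良好:3,购物丰富:null,娱乐活动多:null,"
    "交通便捷:null,服务友好:4,环境整洁:5,学习文化:5,体验工艺:null,家庭友好:4"
)
parsed = parse_scores(example)
print(parsed["环境整洁"], parsed["乐趣满足"])  # prints: 5 None
```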
---

## Code Example

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "dadaguai6677/TourismReview-Qwen2.5-7B"

# Left padding keeps each prompt adjacent to its generated tokens in decoder-only models.
tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True,
    padding_side="left"
)

if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
    low_cpu_mem_usage=True
)

model.eval()

def create_prompt(text):
    # Build the ChatML prompt by hand; keep the wording and dimension order unchanged.
    system_msg = "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."
    prompt = f"""<|im_start|>system
{system_msg}<|im_end|>
<|im_start|>user
请对以下旅游评论进行内容分析,并基于以下11个维度进行打分。每个维度评分等级为1-5分,如未提及则返回null。

评分标准:
1分:完全不同意
2分:不同意
3分:一般
4分:同意
5分:完全同意
null:未提及

评价维度:
放松惬意,乐趣满足,餐饮良好,购物丰富,娱乐活动多,交通便捷,服务友好,环境整洁,学习文化,体验工艺,家庭友好

请严格按以下格式返回结果:
放松惬意:分数,乐趣满足:分数,餐饮良好:分数,购物丰富:分数,娱乐活动多:分数,交通便捷:分数,服务友好:分数,环境整洁:分数,学习文化:分数,体验工艺:分数,家庭友好:分数

待分析文本:
{text}<|im_end|>
<|im_start|>assistant
"""
    return prompt

text = "景区环境很好,讲解也比较细致,孩子能学到很多历史文化知识,就是周边餐饮一般。"

prompt = create_prompt(text)

inputs = tokenizer(
    prompt,
    return_tensors="pt",
    truncation=True,
    max_length=1024,
    padding=True
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=False,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
        num_beams=1
    )

# Decode only the newly generated tokens. Decoding the whole sequence with
# skip_special_tokens=True strips "<|im_start|>", so splitting on that marker
# would never match and the prompt would be returned along with the answer.
prompt_length = inputs["input_ids"].shape[1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip()

print(response)
```
