Model: minlik/chinese-alpaca-plus-7b-merged
Source: Original Platform
2026-05-12 00:59:21 +08:00


---
title: chinese-alpaca-plus-7b-merged
emoji: 📚
colorFrom: gray
colorTo: red
sdk: gradio
sdk_version: 3.23.0
app_file: app.py
pinned: false
---
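A Chinese Alpaca-plus model obtained by adding a Chinese vocabulary, continuing to pre-train the Chinese embeddings, and then fine-tuning on instruction datasets.
For details, see https://github.com/ymcui/Chinese-LLaMA-Alpaca/releases/tag/v3.0
### Usage
1. Install the required packages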
```bash
pip install sentencepiece
# Quote the requirement so the shell does not treat ">" as an output redirect
pip install "transformers>=4.28.0"
```
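To confirm that the installed packages satisfy the requirement above, a small check along these lines may help. This is a minimal sketch, not part of the original instructions: `parse_version`, `meets_minimum`, and `check_package` are hypothetical helper names, and the comparison is deliberately simplified (it looks only at the leading numeric components and ignores pre-release suffixes).

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '4.28.0' or '4.30.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric component (e.g. 'dev0')
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True when `installed` is at least `minimum` (simplified check)."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '4.28' and '4.28.0' compare as equal
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_package(name, minimum=None):
    """Describe whether a package is installed and, optionally, new enough."""
    try:
        installed = version(name)
    except PackageNotFoundError:
        return f"{name}: not installed"
    if minimum is not None and not meets_minimum(installed, minimum):
        return f"{name}: {installed} is older than {minimum}"
    return f"{name}: {installed} OK"


if __name__ == "__main__":
    print(check_package("sentencepiece"))
    print(check_package("transformers", "4.28.0"))
```

For stricter comparisons, including pre-release ordering, the `packaging` library's `Version` class is the usual choice; the hand-rolled version here only avoids an extra dependency.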
2. Generate text
```python
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM


def generate_prompt(text):
    # Wrap the user instruction in the Alpaca-style prompt template
    return f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{text}

### Response:"""


tokenizer = LlamaTokenizer.from_pretrained('minlik/chinese-alpaca-plus-7b-merged')
# Load the weights, cast them to half precision, and move the model to the GPU
model = LlamaForCausalLM.from_pretrained('minlik/chinese-alpaca-plus-7b-merged').half().to('cuda')
model.eval()

text = '第一个登上月球的人是谁?'  # "Who was the first person to walk on the Moon?"
prompt = generate_prompt(text)
input_ids = tokenizer.encode(prompt, return_tensors='pt').to('cuda')

with torch.no_grad():
    output_ids = model.generate(
        input_ids=input_ids,
        max_new_tokens=128,
        do_sample=True,  # temperature, top_k, and top_p only take effect when sampling
        temperature=1,
        top_k=40,
        top_p=0.9,
        repetition_penalty=1.15
    )

# The generated sequence begins with the prompt, so strip it before printing
output = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(output.replace(prompt, '').strip())
```
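Removing the prompt with `output.replace(prompt, '')` relies on the decoded text reproducing the prompt character for character, which a tokenizer round trip may not guarantee. One alternative is to split on the `### Response:` marker from the prompt template. The following is a minimal sketch: `extract_response` and `RESPONSE_MARKER` are hypothetical names introduced for illustration, and the sample string stands in for real model output.

```python
RESPONSE_MARKER = "### Response:"


def extract_response(decoded, marker=RESPONSE_MARKER):
    """Return the text after the first marker, or the whole text if it is absent."""
    _, found, tail = decoded.partition(marker)
    if not found:
        # Marker missing: fall back to the full decoded string
        return decoded.strip()
    return tail.strip()


if __name__ == "__main__":
    # Illustrative stand-in for tokenizer.decode(...) output, not real model output
    decoded = (
        "Below is an instruction that describes a task.\n"
        "### Instruction:\nWho was the first person on the Moon?\n"
        "### Response: Neil Armstrong.\n"
    )
    print(extract_response(decoded))
```

Because it splits on the first occurrence, this sketch assumes the instruction text itself does not contain the marker. Another option is to decode only the newly generated tokens, for example `output_ids[0][input_ids.shape[1]:]`, since `generate` returns the prompt tokens followed by the continuation for decoder-only models.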