---
frameworks:
- Pytorch
license: Apache License 2.0
tasks:
- text-classification
#model-type:
## e.g. gpt, phi, llama, chatglm, baichuan
- llama
#domain:
## e.g. nlp, cv, audio, multi-modal
- nlp
#language:
## list of language codes: https://help.aliyun.com/document_detail/215387.html?spm=a2c4g.11186623.0.0.9f8d7467kni6Aa
- cn
- zh
#metrics:
## e.g. CIDEr, BLEU, ROUGE
#- CIDEr
#tags:
## custom tags, including training methods such as pretrained, fine-tuned, instruction-tuned, RL-tuned, and others
- fine-tuned
#tools:
## e.g. vllm, fastchat, llamacpp, AdaSeq
- vllm
---
# Unichat-llama3.1-Chinese-8B
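## Introduction
* The China Unicom AI Innovation Center has released a Chinese instruction-tuned Llama 3.1 model, trained with full-parameter fine-tuning.
* Base model: [**Meta-Llama-3.1-8B-Instruct**](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)
### 📊 Data
- High-quality instruction data covering multiple domains and industries, providing ample data for model training.
- The fine-tuning instruction data was rigorously screened by hand, so that only high-quality instructions are used for fine-tuning.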
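```python
import torch
import transformers

model_id = "UnicomAI/Unichat-llama3.1-Chinese-8B"

# Build a text-generation pipeline; the weights are loaded in bfloat16 and
# device_map="auto" places them on the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Chat-format input: a system prompt followed by the user's question ("Who are you?").
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "你是谁?"},
]

# Greedy decoding (do_sample=False) with a mild repetition penalty.
outputs = pipeline(
    messages,
    max_new_tokens=1024,
    do_sample=False,
    repetition_penalty=1.1,
)

# The last entry of the returned conversation is the assistant's reply.
print(outputs[0]["generated_text"][-1])
```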
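The example above prints the whole final message, a dict with `role` and `content` keys. For a multi-turn chat it can be handy to pull out just the reply text and feed it back into the history. The following is a minimal sketch of that, not part of the original model card: the helper names `extract_reply` and `append_turn` are hypothetical, and `sample_outputs` is a hand-written stand-in with a placeholder reply rather than real model output. It assumes the pipeline returns the conversation as a list of message dicts under `generated_text`, as indexed in the example above.

```python
from typing import Any, Dict, List

Message = Dict[str, str]


def extract_reply(outputs: List[Dict[str, Any]]) -> str:
    """Return the text of the final assistant message in a chat pipeline result."""
    last_message = outputs[0]["generated_text"][-1]
    if last_message.get("role") != "assistant":
        raise ValueError("the last message is not an assistant reply")
    return last_message["content"]


def append_turn(messages: List[Message], reply: str, next_user_input: str) -> List[Message]:
    """Build the message list for the next turn without mutating the input list."""
    return messages + [
        {"role": "assistant", "content": reply},
        {"role": "user", "content": next_user_input},
    ]


# Hypothetical stand-in for the value returned by `pipeline(messages, ...)`.
# The reply text is a placeholder, not actual model output.
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "你是谁?"},
]
sample_outputs = [
    {"generated_text": messages + [{"role": "assistant", "content": "<placeholder reply>"}]}
]

reply = extract_reply(sample_outputs)
next_messages = append_turn(messages, reply, "请用一句话介绍你自己。")
print(len(next_messages))
```

With the real pipeline, `next_messages` would be passed to `pipeline(...)` again to continue the conversation.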
## Resources
For more models, datasets, and training details, see:
* GitHub: [**Unichat-llama3-Chinese**](https://github.com/UnicomAI/Unichat-llama3-Chinese)