first commit

2024-07-25 08:08:29 +00:00
parent dbda0c5796
commit 2e044bdfa8
13 changed files with 413071 additions and 21 deletions
--- a/.ipynb_checkpoints/README-checkpoint.md
+++ b/.ipynb_checkpoints/README-checkpoint.md
@@ -0,0 +1,77 @@
+---
+frameworks:
+- Pytorch
+license: Apache License 2.0
+tasks:
+- text-classification
+
+#model-type:
+##e.g. gpt, phi, llama, chatglm, baichuan
+- llama
+
+#domain:
+##e.g. nlp, cv, audio, multi-modal
+- nlp
+
+#language:
+##Language code list: https://help.aliyun.com/document_detail/215387.html?spm=a2c4g.11186623.0.0.9f8d7467kni6Aa
+- cn 
+- zh
+
+#metrics:
+##e.g. CIDEr, BLEU, ROUGE
+#- CIDEr
+
+#tags:
+##Custom tags, including training methods such as pretrained, fine-tuned, instruction-tuned, RL-tuned, and others
+- fine-tuned
+
+#tools:
+##e.g. vllm, fastchat, llamacpp, AdaSeq
+- vllm
+
+---
+# Unichat-llama3-Chinese-8B
+
+
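+## Introduction
+* The China Unicom AI Innovation Center has released a Chinese instruction-tuned Llama 3.1 model, trained with full-parameter fine-tuning
+* Base model: [**Meta-Llama-3.1-8B-Instruct**](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)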
+
+
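+### 📊 Data
+- High-quality instruction data covering multiple domains and industries, providing ample data for model training
+- The fine-tuning instruction data was rigorously screened by hand, so that only high-quality instructions are used for fine-tuning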
+
+
+```python
+import transformers
+import torch
+
+model_id = "UnicomAI/Unichat-llama3.1-Chinese-8B"
+
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model_id,
+    model_kwargs={"torch_dtype": torch.bfloat16},
+    device_map="auto",
+)
+
+messages = [
+    {"role": "system", "content": "You are a helpful assistant"},
+    {"role": "user", "content": "你是谁?"},  # "Who are you?"
+]
+
+outputs = pipeline(
+    messages,
+    max_new_tokens=1024,
+    do_sample=False,
+    repetition_penalty=1.1,
+)
+print(outputs[0]["generated_text"][-1])
+```
+
+## Resources
+For more details on the models, datasets, and training, see:
+* GitHub: [**Unichat-llama3-Chinese**](https://github.com/UnicomAI/Unichat-llama3-Chinese)
+