Initialize project; model provided by the ModelHub XC community
Model: TongjiFinLab/CFGPT1-pt-7B Source: Original Platform
40
README.md
Normal file
@@ -0,0 +1,40 @@
---
license: apache-2.0
language:
- zh
pipeline_tag: text-generation
---
---
<div style="text-align:center">
<!-- <img src="https://big-cheng.com/k2/k2.png" alt="k2-logo" width="200"/> -->
<h2>📈 CFGPT: Chinese Financial Assistant with Large Language Model (CFGPT1-pt-7b)</h2>
</div>

## Introduction

We introduce **CFGPT**, an open-source language model trained in two stages. First, general LLMs are further pretrained on collected and cleaned Chinese finance text data (CFData-pt), which comprises financial domain-specific data (announcements, finance articles, finance exams, finance news, and finance research papers) and general data (Wikipedia). Second, the model is fine-tuned on knowledge-intensive instruction tuning data (CFData-sft).
For preliminary evaluation, we use CFBenchmark-Basic.
On both objective and subjective tasks, CFGPT outperforms several baseline models of similar parameter count.

This repository shares the further pretrained model.

- [Pretrained Model](https://huggingface.co/TongjiFinLab/CFGPT1-pt-7B): Full model weights after further pretraining on the Chinese finance text corpus, released in compliance with the InternLM model license.

## How to Use

CFGPT1-pt-7b is a pretrained model that has not undergone supervised fine-tuning on instruction data. It is therefore not advisable to use this model for financial tasks.
Please refer to the [CFGPT]() GitHub repo for further usage.
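
Because this checkpoint is a base model without instruction tuning, the most it can reasonably be asked to do is continue a piece of text. The following is a minimal sketch of loading it for plain text continuation with the Hugging Face `transformers` library. It is not taken from the CFGPT repository: the helper names `build_load_kwargs` and `complete` are hypothetical, and the use of `trust_remote_code=True`, `torch_dtype="auto"`, and `device_map="auto"` are assumptions about how an InternLM-based checkpoint is typically loaded.

```python
"""Minimal sketch (assumptions noted inline): plain text continuation with CFGPT1-pt-7B."""

MODEL_ID = "TongjiFinLab/CFGPT1-pt-7B"  # repository linked in this README


def build_load_kwargs(use_gpu: bool = True) -> dict:
    """Build keyword arguments for `from_pretrained` (hypothetical helper)."""
    kwargs = {
        # Assumption: the checkpoint ships custom modeling code, as InternLM-based models often do.
        "trust_remote_code": True,
    }
    if use_gpu:
        # Assumption: let transformers pick the stored dtype and place weights automatically.
        # device_map="auto" needs the `accelerate` package installed.
        kwargs["torch_dtype"] = "auto"
        kwargs["device_map"] = "auto"
    return kwargs


def complete(prompt: str, max_new_tokens: int = 64, use_gpu: bool = True) -> str:
    """Return the prompt followed by a greedy continuation (hypothetical helper)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, **build_load_kwargs(use_gpu))
    model.eval()

    # Tokenize the prompt and move the tensors to wherever the model was placed.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Greedy decoding keeps the sketch deterministic.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete("近年来,中国债券市场")` would download the weights on first use and return the prompt with a continuation appended. Since the model has not been fine-tuned on instructions, question-style prompts are unlikely to produce useful answers.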
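
## Introduction (translated from Chinese)

**CFGPT** is an open-source language model. It is first further pretrained on collected and cleaned Chinese financial text data (CFData-pt), which includes finance-specific data (announcements, finance articles, finance exams, finance news, and finance research papers) and general data (Wikipedia), and then fine-tuned on knowledge-intensive instruction tuning data (CFData-sft).
We use CFBenchmark-Basic for preliminary evaluation. Compared with several baseline models of similar parameter count, CFGPT performs better on recognition, classification, and generation tasks.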
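
This repository shares the following further pretrained model.

- [Pretrained Model](https://huggingface.co/TongjiFinLab/CFGPT1-pt-7B): Full model weights further pretrained on the Chinese financial text corpus, released in compliance with the InternLM model license.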
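
## How to Use (translated from Chinese)

This is a pretrained model that has not undergone supervised fine-tuning on instruction data, so using it for finance-related tasks is not recommended.
For detailed usage, please refer to the [CFGPT]() GitHub repository.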