Model: TongjiFinLab/CFGPT1-pt-7B
Source: Original Platform
2026-05-24 23:40:20 +08:00

---
license: apache-2.0
language:
- zh
pipeline_tag: text-generation
---
---
<div style="text-align:center">
<!-- <img src="https://big-cheng.com/k2/k2.png" alt="k2-logo" width="200"/> -->
<h2>📈 CFGPT: Chinese Financial Assistant with Large Language Model (CFGPT1-pt-7b)</h2>
</div>
## Introduction
We introduce **CFGPT**, an open-source language model trained in two stages. First, general LLMs are further pretrained on collected and cleaned Chinese finance text data (CFData-pt), which comprises financial domain-specific data (announcements, finance articles, finance exams, finance news, finance research papers) and general data (Wikipedia). Second, the resulting model is fine-tuned on knowledge-intensive instruction tuning data (CFData-sft).
For preliminary evaluation, we use CFBenchmark-Basic.
On both objective and subjective tasks, CFGPT outperforms several baseline models with similar parameter counts.
In this repository, we share the further pretrained model.
- [Pretrained Model](https://huggingface.co/TongjiFinLab/CFGPT1-pt-7B): Full model weights after further pretraining on the Chinese finance text corpus, released in compliance with the InternLM model license.
## How to Use
CFGPT1-pt-7b is a pretrained model that has not undergone supervised fine-tuning on instruction data. Therefore, it is not advisable to use this model for financial tasks.
Please refer to the [CFGPT]() GitHub repo for further usage.
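For readers who only want to inspect the pretrained weights, the following is a minimal sketch of plain text continuation, not the official usage from the CFGPT repo. It assumes the checkpoint loads through the standard `transformers` `Auto*` classes and that `trust_remote_code=True` is required because the base model is InternLM; both are assumptions, as are the helper names `load_model` and `complete`. No instruction or chat template is applied, since the model has not been instruction-tuned.

```python
MODEL_ID = "TongjiFinLab/CFGPT1-pt-7B"  # repository id from the link in this README


def load_model(model_id: str = MODEL_ID):
    """Hypothetical helper: load the tokenizer and weights with transformers."""
    # Imported lazily so this module can be imported without the heavy dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repository ships custom InternLM modeling code,
    # so trust_remote_code=True is needed. Review that code before enabling it.
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        trust_remote_code=True,
        torch_dtype="auto",  # use the dtype stored in the checkpoint config
    )
    model.eval()
    return tokenizer, model


def complete(tokenizer, model, text: str, max_new_tokens: int = 64) -> str:
    """Hypothetical helper: continue `text` as raw text, without any prompt template."""
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)


# Example usage (downloads the full 7B weights on first call):
# tokenizer, model = load_model()
# print(complete(tokenizer, model, "近年来,中国债券市场"))
```

Because the model is a base language model, the output is a continuation of the input text rather than an answer to a question.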
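## Overview
**CFGPT** is an open-source language model. It is first further pretrained on collected and cleaned Chinese finance text data (CFData-pt), which includes financial domain-specific data (announcements, finance articles, finance exams, finance news, finance research papers) and general data (Wikipedia), and is then fine-tuned on knowledge-intensive instruction tuning data (CFData-sft).
We use CFBenchmark-Basic for preliminary evaluation. Compared with several baseline models with similar parameter counts, CFGPT performs better on recognition, classification, and generation tasks.
In this repository, we share the following further pretrained model.
- [Pretrained Model](https://huggingface.co/TongjiFinLab/CFGPT1-pt-7B): Full model weights after further pretraining on the Chinese finance text corpus, in compliance with the InternLM model license.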
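## Usage
This model is a pretrained model that has not yet undergone supervised fine-tuning on instruction data, so using it to perform financial tasks is not recommended.
For detailed usage, please refer to the [CFGPT]() GitHub repo.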