ModelHub XC 1f52e680d4 Initialize project; model provided by the ModelHub XC community
Model: TongjiFinLab/CFGPT1-pt-7B
Source: Original Platform
2026-05-24 23:40:20 +08:00

| license | language | pipeline_tag |
| --- | --- | --- |
| apache-2.0 | zh | text-generation |

📈 CFGPT: Chinese Financial Assistant with Large Language Model (CFGPT1-pt-7b)

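## Introduction

We introduce CFGPT, an open-source language model trained in two stages. First, a general LLM is further pretrained on collected and cleaned Chinese finance text data (CFData-pt), which includes financial domain-specific data (announcements, finance articles, finance exams, finance news, finance research papers) and general data (Wikipedia). Second, the model is fine-tuned on knowledge-intensive instruction tuning data (CFData-sft).

For preliminary evaluation, we use CFBenchmark-Basic. On both objective and subjective tasks, CFGPT outperforms several baseline models with a similar number of parameters.

In this repository, we share the further pretrained model.

- Pretrained Model: Full model weights after further pretraining on the Chinese finance text corpus, in compliance with the InternLM model license.

## How to Use

CFGPT1-pt-7b is a pretrained model that has not undergone supervised fine-tuning on instruction data. It is therefore not advisable to use this model for financial tasks. Please refer to the CFGPT GitHub repo for further usage.

## Introduction (translated from Chinese)

CFGPT is an open-source language model. It is first further pretrained on collected and cleaned Chinese financial text data (CFData-pt), including financial domain-specific data (announcements, financial articles, financial exams, financial news, financial research papers) and general data (Wikipedia), and then fine-tuned on knowledge-intensive instruction tuning data (CFData-sft).

We use CFBenchmark-Basic for preliminary evaluation. Compared with several baseline models with a similar number of parameters, CFGPT performs better on recognition, classification, and generation tasks.

In this repository, we share the following further pretrained model.

- Pretrained Model: Full model weights further pretrained on the Chinese financial text corpus, in compliance with the InternLM model license.

## How to Use (translated from Chinese)

This model is a pretrained model that has not undergone supervised fine-tuning on instruction data, so using it for finance-related tasks is not recommended. For specific usage, please refer to the CFGPT GitHub repository.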
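
Because this checkpoint is pretrained only, the most it can reasonably be asked to do is continue a piece of text. The following is a minimal sketch of that, not the official usage, which lives in the CFGPT GitHub repo. It assumes the weights load through the Hugging Face `transformers` Auto classes with `trust_remote_code=True`, which this page does not confirm. The helper names `load_model` and `continue_text` are hypothetical and introduced only for illustration.

```python
# Sketch only: plain text continuation with a pretrained (non-instruction-tuned) checkpoint.
# Assumption: the repository is loadable via the transformers Auto classes.

MODEL_ID = "TongjiFinLab/CFGPT1-pt-7B"  # repository name as given on this page


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and weights (hypothetical helper; downloads the full 7B checkpoint)."""
    # Imported lazily so that importing this module does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code=True is an assumption: InternLM-based checkpoints
    # often ship custom modeling code, but this page does not say so.
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    return tokenizer, model


def continue_text(tokenizer, model, prefix: str, max_new_tokens: int = 64) -> str:
    """Greedily continue `prefix`; no instruction template is applied."""
    inputs = tokenizer(prefix, return_tensors="pt")
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding keeps the output deterministic
    )
    # generate() returns one sequence per input; decode the first (and only) one.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (commented out because it downloads the model):
# tokenizer, model = load_model()
# print(continue_text(tokenizer, model, "2023年第三季度,公司营业收入"))
```

The prompt is passed as a raw prefix rather than as a question, since a model without supervised fine-tuning has not been trained to follow instructions.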