Go to file

ModelHub XC 618ccf35f5 初始化项目，由ModelHub XC社区提供模型

Model: qikp/treeswift-90m
Source: Original Platform

2026-05-24 11:19:35 +08:00

.gitattributes

初始化项目，由ModelHub XC社区提供模型

2026-05-24 11:19:35 +08:00

chat_template.jinja

初始化项目，由ModelHub XC社区提供模型

2026-05-24 11:19:35 +08:00

config.json

初始化项目，由ModelHub XC社区提供模型

2026-05-24 11:19:35 +08:00

generation_config.json

初始化项目，由ModelHub XC社区提供模型

2026-05-24 11:19:35 +08:00

model.safetensors

初始化项目，由ModelHub XC社区提供模型

2026-05-24 11:19:35 +08:00

README.md

初始化项目，由ModelHub XC社区提供模型

2026-05-24 11:19:35 +08:00

tokenizer_config.json

初始化项目，由ModelHub XC社区提供模型

2026-05-24 11:19:35 +08:00

tokenizer.json

初始化项目，由ModelHub XC社区提供模型

2026-05-24 11:19:35 +08:00

training_args.bin

初始化项目，由ModelHub XC社区提供模型

2026-05-24 11:19:35 +08:00

README.md

license, datasets, language, base_model, pipeline_tag, library_name

license

datasets

language

base_model

pipeline_tag

library_name

apache-2.0

HuggingFaceH4/no_robots

HuggingFaceTB/everyday-conversations-llama3.1-2k

distilbert/distilgpt2

text-generation

transformers

Treeswift

Treeswift is a derivative of DistilGPT2 trained to be conversational. It is also designed to be similar to GPT-3.5.

Training

The model was trained using 2,750 steps, and 4 batch size.

Datasets

The training corpus is made up of:

The train / train_sft splits were used.

Chat template

The Zephyr chat template was used, but most notably, chat template tokens were added to enhance performance.

Limitations

The model frequently outputs incorrect information, confirmation with a larger, mature model is advised. In addition, it may subtly repeat.