Files
treeswift-90m/README.md
ModelHub XC 618ccf35f5 初始化项目,由ModelHub XC社区提供模型
Model: qikp/treeswift-90m
Source: Original Platform
2026-05-24 11:19:35 +08:00

1.0 KiB

license, datasets, language, base_model, pipeline_tag, library_name
license datasets language base_model pipeline_tag library_name
apache-2.0
HuggingFaceH4/no_robots
HuggingFaceTB/everyday-conversations-llama3.1-2k
en
distilbert/distilgpt2
text-generation transformers

Treeswift

Treeswift is a derivative of DistilGPT2 trained to be conversational. It is also designed to be similar to GPT-3.5.

Training

The model was trained using 2,750 steps, and 4 batch size.

Datasets

The training corpus is made up of:

The train / train_sft splits were used.

Chat template

The Zephyr chat template was used, but most notably, chat template tokens were added to enhance performance.

Limitations

The model frequently outputs incorrect information, confirmation with a larger, mature model is advised. In addition, it may subtly repeat.