license, datasets, language, base_model, pipeline_tag, library_name
license datasets language base_model pipeline_tag library_name
apache-2.0
HuggingFaceH4/no_robots
HuggingFaceTB/everyday-conversations-llama3.1-2k
en
distilbert/distilgpt2
text-generation transformers

Treeswift

Treeswift is a derivative of DistilGPT2 trained to be conversational. It is also designed to be similar to GPT-3.5.

Training

The model was trained using 2,750 steps, and 4 batch size.

Datasets

The training corpus is made up of:

The train / train_sft splits were used.

Chat template

The Zephyr chat template was used, but most notably, chat template tokens were added to enhance performance.

Limitations

The model frequently outputs incorrect information, confirmation with a larger, mature model is advised. In addition, it may subtly repeat.

Description
Model synced from source: qikp/treeswift-90m
Readme 767 KiB
Languages
Jinja 100%