初始化项目，由ModelHub XC社区提供模型

Model: qikp/treeswift-90m Source: Original Platform
2026-05-24 11:19:35 +08:00
commit 618ccf35f5
9 changed files with 250506 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,37 @@
+---
+license: apache-2.0
+datasets:
+- HuggingFaceH4/no_robots
+- HuggingFaceTB/everyday-conversations-llama3.1-2k
+language:
+- en
+base_model:
+- distilbert/distilgpt2
+pipeline_tag: text-generation
+library_name: transformers
+---
+
+# Treeswift
+
+Treeswift is a derivative of DistilGPT2 trained to be conversational. It is also designed to be similar to GPT-3.5.
+
+## Training
+
+The model was trained using 2,750 steps, and 4 batch size.
+
+### Datasets
+
+The training corpus is made up of:
+
+- [HuggingFaceH4/no_robots](https://huggingface.co/datasets/HuggingFaceH4/no_robots)
+- [HuggingFaceTB/everyday-conversations-llama3.1-2k](https://huggingface.co/datasets/HuggingFaceTB/everyday-conversations-llama3.1-2k)
+
+The `train` / `train_sft` splits were used.
+
+### Chat template
+
+The Zephyr chat template was used, but most notably, chat template tokens were added to enhance performance.
+
+## Limitations
+
+The model frequently outputs incorrect information, confirmation with a larger, mature model is advised. In addition, it may subtly repeat.