初始化项目,由ModelHub XC社区提供模型

Model: qikp/treeswift-90m
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-24 11:19:35 +08:00
commit 618ccf35f5
9 changed files with 250506 additions and 0 deletions

37
README.md Normal file
View File

@@ -0,0 +1,37 @@
---
license: apache-2.0
datasets:
- HuggingFaceH4/no_robots
- HuggingFaceTB/everyday-conversations-llama3.1-2k
language:
- en
base_model:
- distilbert/distilgpt2
pipeline_tag: text-generation
library_name: transformers
---
# Treeswift
Treeswift is a derivative of DistilGPT2 trained to be conversational. It is also designed to be similar to GPT-3.5.
## Training
The model was trained using 2,750 steps, and 4 batch size.
### Datasets
The training corpus is made up of:
- [HuggingFaceH4/no_robots](https://huggingface.co/datasets/HuggingFaceH4/no_robots)
- [HuggingFaceTB/everyday-conversations-llama3.1-2k](https://huggingface.co/datasets/HuggingFaceTB/everyday-conversations-llama3.1-2k)
The `train` / `train_sft` splits were used.
### Chat template
The Zephyr chat template was used, but most notably, chat template tokens were added to enhance performance.
## Limitations
The model frequently outputs incorrect information, confirmation with a larger, mature model is advised. In addition, it may subtly repeat.