初始化项目,由ModelHub XC社区提供模型
Model: qikp/hummingbird-2-125m Source: Original Platform
This commit is contained in:
46
README.md
Normal file
46
README.md
Normal file
@@ -0,0 +1,46 @@
|
||||
---
|
||||
license: mit
|
||||
datasets:
|
||||
- qikp/reborn-5k-no-thoughts
|
||||
- HuggingFaceTB/smol-smoltalk
|
||||
- HuggingFaceTB/everyday-conversations-llama3.1-2k
|
||||
language:
|
||||
- en
|
||||
base_model:
|
||||
- openai-community/gpt2
|
||||
pipeline_tag: text-generation
|
||||
library_name: transformers
|
||||
new_version: qikp/hummingbird-2.1-110m
|
||||
---
|
||||
|
||||
# Hummingbird
|
||||
|
||||
🎉 You are looking at Hummingbird 2, trained on a much more efficient corpus, achieving similar performance with 3x less parameters!
|
||||
|
||||
Hummingbird is a GPT-2 derivative trained to be conversational.
|
||||
|
||||
## Training
|
||||
|
||||
The model was trained using the `paged_adamw_8bit` optimizer, gradient checkpointing, 500 steps, 1 batch size, and 4 gradient accumulation steps.
|
||||
|
||||
### Datasets
|
||||
|
||||
The training corpus is made up of:
|
||||
|
||||
- First 1400 rows of [qikp/reborn-5k-no-thoughts](https://huggingface.co/datasets/qikp/reborn-5k-no-thoughts)
|
||||
- First 500 rows of [HuggingFaceTB/smol-smoltalk](https://huggingface.co/datasets/HuggingFaceTB/smol-smoltalk)
|
||||
- First 100 rows of [HuggingFaceTB/everyday-conversations-llama3.1-2k](https://huggingface.co/datasets/HuggingFaceTB/everyday-conversations-llama3.1-2k)
|
||||
|
||||
The `train` / `train_sft` splits were used.
|
||||
|
||||
### Chat template
|
||||
|
||||
The Zephyr chat template was used.
|
||||
|
||||
## Limitations
|
||||
|
||||
The model frequently outputs incorrect information, confirmation with a larger, mature model is advised.
|
||||
|
||||
## Benchmark
|
||||
|
||||
This model was tested against GAIA and compared using embeddings. See the results [here](https://codeberg.org/qikp/benchmarks).
|
||||
Reference in New Issue
Block a user