初始化项目,由ModelHub XC社区提供模型
Model: CalderaAI/13B-Ouroboros Source: Original Platform
This commit is contained in:
109
README.md
Normal file
109
README.md
Normal file
@@ -0,0 +1,109 @@
|
||||
---
|
||||
tags:
|
||||
- llama
|
||||
- alpaca
|
||||
- vicuna
|
||||
- uncensored
|
||||
- merge
|
||||
- mix
|
||||
- airoboros
|
||||
- openorca
|
||||
- orcamini
|
||||
- orca
|
||||
- instruct
|
||||
- mixtune
|
||||
datasets:
|
||||
- Open-Orca/OpenOrca
|
||||
- anon8231489123/ShareGPT_Vicuna_unfiltered
|
||||
- jondurbin/airoboros-uncensored
|
||||
language:
|
||||
- en
|
||||
metrics:
|
||||
- accuracy
|
||||
pipeline_tag: text-generation
|
||||
---
|
||||
|
||||
## 13B-Ouroboros
|
||||
Ouroboros is an experimental model based on Meta's LLaMA [v1] 13B base model using a custom merging technique, tweaking
|
||||
each layer's merge % based on internal tests against the PTB dataset, scoring ~26.31 according to internal evaluation
|
||||
(6 samples, sequence length 1024; this testing is not empirical, it's a quick way to find near-optimum values). Testing,
|
||||
evaluating, and remixing this model is absolutely permissible and even encouraged (within the bounds of Meta's LLaMAv1
|
||||
license agreement); the more feedback the better we can tune our process! 😊
|
||||
|
||||
## Composition:
|
||||
Ouroboros is comprised of 40 layers [LLaMAv1 13B standard] mixed at optimized
|
||||
ratios VS the PTB dataset for lowest perplexity score. Listed below are the
|
||||
paired models and ratios merged per layer.
|
||||
|
||||
Tier One Merge:
|
||||
|
||||
13B-airoboros-gpt4-1.4 > 13B-orca_mini_v2
|
||||
|
||||
[0.22, 0.85, 0.89, 0.98, 0.3, 0.41, 0.71, 0.83, 0.32, 0.1, 0.44, 0.6, 0.53, 0.15, 0.86, 0.79, 0.93, 0.02, 0.19, 0.82, 0.01, 0.52, 0.07, 0.27, 0.73, 0.86, 0.08, 0.67, 0.42, 0.28, 0.37, 0.08, 0.95, 0.68, 0.45, 0.08, 0.7, 0.93, 0.96, 0.43]
|
||||
|
||||
13B-gpt4-x-alpaca > 13B-Vicuna-cocktail
|
||||
|
||||
[0.65, 0.94, 0.98, 0.87, 0.28, 0.64, 0.73, 0.7, 0.95, 0.89, 0.84, 0.9, 0.59, 0.92, 0.28, 0.61, 0.88, 0.73, 0.34, 0.85, 0.98, 0.05, 0.74, 0.92, 0.5, 0.78, 0.26, 0.4, 0.27, 0.65, 0.71, 0.7, 0.8, 0.93, 0.36, 0.03, 0.45, 0.39, 0.77, 0.06]
|
||||
|
||||
Tier Two Merge:
|
||||
|
||||
[13B-airoboros-gpt4-1.4 + 13B-orca_mini_v2] offspring > [13B-gpt4-x-alpaca + 13B-Vicuna-cocktail] offspring
|
||||
|
||||
[0.2, 0.83, 0.24, 0.03, 0.37, 0.62, 0.02, 0.82, 0.65, 0.63, 0.45, 0.65, 0.48, 0.45, 0.24, 0.76, 0.06, 0.31, 0.45, 0.86, 0.23, 0.99, 0.93, 0.84, 0.96, 0.53, 0.95, 0.32, 0.19, 0.06, 0.4, 0.08, 0.62, 0.4, 0.26, 0.12, 0.16, 0.91, 0.14, 0.0]
|
||||
|
||||
Result:
|
||||
|
||||
13B-Ouroboros, a model that seems uncensored and highly competent. So far only Alpaca instruction prompting has been tested and seems to work solidly well.
|
||||
|
||||
## Use:
|
||||
|
||||
Alpaca's instruct format can be used to do many things, including control of the terms of behavior
|
||||
between a user and a response from an agent in chat. Below is an example of a command injected into
|
||||
memory.
|
||||
|
||||
```
|
||||
### Instruction:
|
||||
Make Narrator function as a text based adventure game that responds with verbose, detailed, and creative descriptions of what happens next after Player's response.
|
||||
Make Player function as the player input for Narrator's text based adventure game, controlling a character named (insert character name here, their short bio, and
|
||||
whatever quest or other information to keep consistent in the interaction).
|
||||
|
||||
### Response:
|
||||
{an empty new line here}
|
||||
```
|
||||
|
||||
## Language Models Used Credits:
|
||||
|
||||
13B-airoboros-gpt4-1.4 by jondurbin
|
||||
|
||||
https://huggingface.co/jondurbin/airoboros-13b-gpt4-1.4
|
||||
|
||||
13B-orca_mini_v2 by psmathur
|
||||
|
||||
https://huggingface.co/psmathur/orca_mini_v2_13b
|
||||
|
||||
13B-gpt4-x-alpaca by chavinlo
|
||||
|
||||
https://huggingface.co/chavinlo/gpt4-x-alpaca
|
||||
|
||||
13B-Vicuna-cocktail by reeducator
|
||||
|
||||
https://huggingface.co/reeducator/vicuna-13b-cocktail
|
||||
|
||||
Also thanks to Meta for LLaMA.
|
||||
|
||||
Each model was hand picked and considered for what it could contribute to this ensemble.
|
||||
Thanks to each and every one of you for your incredible work developing some of the best things
|
||||
to come out of this community.
|
||||
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
|
||||
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_CalderaAI__13B-Ouroboros)
|
||||
|
||||
| Metric | Value |
|
||||
|-----------------------|---------------------------|
|
||||
| Avg. | 44.66 |
|
||||
| ARC (25-shot) | 57.42 |
|
||||
| HellaSwag (10-shot) | 82.11 |
|
||||
| MMLU (5-shot) | 51.43 |
|
||||
| TruthfulQA (0-shot) | 47.99 |
|
||||
| Winogrande (5-shot) | 57.85 |
|
||||
| GSM8K (5-shot) | 0.45 |
|
||||
| DROP (3-shot) | 15.36 |
|
||||
Reference in New Issue
Block a user