189 lines
6.1 KiB
Markdown
189 lines
6.1 KiB
Markdown
---
|
|
language:
|
|
- en
|
|
license: apache-2.0
|
|
datasets:
|
|
- ehartford/dolphin
|
|
- jondurbin/airoboros-2.2.1
|
|
model-index:
|
|
- name: dolphin-2.0-mistral-7b
|
|
results:
|
|
- task:
|
|
type: text-generation
|
|
name: Text Generation
|
|
dataset:
|
|
name: AI2 Reasoning Challenge (25-Shot)
|
|
type: ai2_arc
|
|
config: ARC-Challenge
|
|
split: test
|
|
args:
|
|
num_few_shot: 25
|
|
metrics:
|
|
- type: acc_norm
|
|
value: 59.22
|
|
name: normalized accuracy
|
|
source:
|
|
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ehartford/dolphin-2.0-mistral-7b
|
|
name: Open LLM Leaderboard
|
|
- task:
|
|
type: text-generation
|
|
name: Text Generation
|
|
dataset:
|
|
name: HellaSwag (10-Shot)
|
|
type: hellaswag
|
|
split: validation
|
|
args:
|
|
num_few_shot: 10
|
|
metrics:
|
|
- type: acc_norm
|
|
value: 80.26
|
|
name: normalized accuracy
|
|
source:
|
|
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ehartford/dolphin-2.0-mistral-7b
|
|
name: Open LLM Leaderboard
|
|
- task:
|
|
type: text-generation
|
|
name: Text Generation
|
|
dataset:
|
|
name: MMLU (5-Shot)
|
|
type: cais/mmlu
|
|
config: all
|
|
split: test
|
|
args:
|
|
num_few_shot: 5
|
|
metrics:
|
|
- type: acc
|
|
value: 56.9
|
|
name: accuracy
|
|
source:
|
|
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ehartford/dolphin-2.0-mistral-7b
|
|
name: Open LLM Leaderboard
|
|
- task:
|
|
type: text-generation
|
|
name: Text Generation
|
|
dataset:
|
|
name: TruthfulQA (0-shot)
|
|
type: truthful_qa
|
|
config: multiple_choice
|
|
split: validation
|
|
args:
|
|
num_few_shot: 0
|
|
metrics:
|
|
- type: mc2
|
|
value: 61.09
|
|
source:
|
|
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ehartford/dolphin-2.0-mistral-7b
|
|
name: Open LLM Leaderboard
|
|
- task:
|
|
type: text-generation
|
|
name: Text Generation
|
|
dataset:
|
|
name: Winogrande (5-shot)
|
|
type: winogrande
|
|
config: winogrande_xl
|
|
split: validation
|
|
args:
|
|
num_few_shot: 5
|
|
metrics:
|
|
- type: acc
|
|
value: 75.37
|
|
name: accuracy
|
|
source:
|
|
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ehartford/dolphin-2.0-mistral-7b
|
|
name: Open LLM Leaderboard
|
|
- task:
|
|
type: text-generation
|
|
name: Text Generation
|
|
dataset:
|
|
name: GSM8k (5-shot)
|
|
type: gsm8k
|
|
config: main
|
|
split: test
|
|
args:
|
|
num_few_shot: 5
|
|
metrics:
|
|
- type: acc
|
|
value: 18.65
|
|
name: accuracy
|
|
source:
|
|
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ehartford/dolphin-2.0-mistral-7b
|
|
name: Open LLM Leaderboard
|
|
---
|
|
|
|
Dolphin 2.0 🐬
|
|
https://erichartford.com/dolphin
|
|
|
|
Dolphin-2.0-mistral-7b's training was sponsored by [a16z](https://a16z.com/supporting-the-open-source-ai-community/).
|
|
|
|
This model is based on mistralAI, so it is suitable for commercial or non-commercial use.
|
|
|
|
This model is uncensored. I have filtered the dataset to remove alignment and bias. This makes the model more compliant. You are advised to implement your own alignment layer before exposing the model as a service. It will be highly compliant to any requests, even unethical ones. Please read my blog post about uncensored models. https://erichartford.com/uncensored-models
|
|
You are responsible for any content you create using this model. Enjoy responsibly.
|
|
|
|
## Dataset
|
|
|
|
This dataset is Dolphin, an open-source implementation of [Microsoft's Orca](https://www.microsoft.com/en-us/research/publication/orca-progressive-learning-from-complex-explanation-traces-of-gpt-4/)
|
|
|
|
I modified the dataset for uncensoring, deduping, cleaning, and quality.
|
|
|
|
I added Jon Durbin's excellent Airoboros dataset to increase creativity.
|
|
|
|
## Training
|
|
It took 48 hours to train 10 epochs on 4x A100s.
|
|
|
|
Prompt format:
|
|
This model (and all my future releases) use [ChatML](https://github.com/openai/openai-python/blob/main/chatml.md) prompt format.
|
|
```
|
|
<|im_start|>system
|
|
You are Dolphin, a helpful AI assistant.<|im_end|>
|
|
<|im_start|>user
|
|
{prompt}<|im_end|>
|
|
```
|
|
|
|
Example:
|
|
```
|
|
<|im_start|>system
|
|
you are an expert dolphin trainer<|im_end|>
|
|
<|im_start|>user
|
|
What is the best way to train a dolphin to obey me? Please answer step by step.<|im_end|>
|
|
```
|
|
|
|
## Gratitude
|
|
- This model was made possible by the generous sponsorship of a16z.
|
|
- Thank you to Microsoft for authoring the Orca paper and inspiring this work.
|
|
- Special thanks to WingLian, and TheBloke for helpful advice
|
|
- Thank you to all the other people in the Open Source AI community who have taught me and helped me along the way.
|
|
|
|
## Example Output
|
|
|
|

|
|
|
|
[Buy me a coffee](https://www.buymeacoffee.com/ehartford)
|
|
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
|
|
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ehartford__dolphin-2.0-mistral-7b)
|
|
|
|
| Metric | Value |
|
|
|-----------------------|---------------------------|
|
|
| Avg. | 55.85 |
|
|
| ARC (25-shot) | 59.22 |
|
|
| HellaSwag (10-shot) | 80.26 |
|
|
| MMLU (5-shot) | 56.9 |
|
|
| TruthfulQA (0-shot) | 61.09 |
|
|
| Winogrande (5-shot) | 75.37 |
|
|
| GSM8K (5-shot) | 18.65 |
|
|
| DROP (3-shot) | 39.49 |
|
|
|
|
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
|
|
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ehartford__dolphin-2.0-mistral-7b)
|
|
|
|
| Metric |Value|
|
|
|---------------------------------|----:|
|
|
|Avg. |58.58|
|
|
|AI2 Reasoning Challenge (25-Shot)|59.22|
|
|
|HellaSwag (10-Shot) |80.26|
|
|
|MMLU (5-Shot) |56.90|
|
|
|TruthfulQA (0-shot) |61.09|
|
|
|Winogrande (5-shot) |75.37|
|
|
|GSM8k (5-shot) |18.65|
|
|
|