---
license: apache-2.0
tags:
- merge
- dpo
- conversation
- text-generation-inference
- Kukedlc/NeuTrixOmniBe-7B-model-remix
datasets:
- argilla/OpenHermes2.5-dpo-binarized-alpha
pipeline_tag: text-generation
model-index:
- name: dpo-binarized-NeutrixOmnibe-7B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
    metrics:
    - type: acc_norm
      value: 72.78
      name: normalized accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
    metrics:
    - type: acc_norm
      value: 89.05
      name: normalized accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
    metrics:
    - type: acc
      value: 64.6
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
    metrics:
    - type: acc
      value: 85.08
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
    metrics:
    - type: acc
      value: 69.45
      name: accuracy
---
A DPO fine-tune of Kukedlc/NeuTrixOmniBe-7B-model-remix, trained on argilla/OpenHermes2.5-dpo-binarized-alpha.
The Argilla DPO binarized pairs dataset is built on top of https://huggingface.co/datasets/teknium/OpenHermes-2.5 using https://github.com/argilla-io/distilabel, in case you are interested.
Thanks for the great data sources.
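For readers new to DPO, the sketch below shows the per-pair objective that training on binarized (chosen vs. rejected) preference pairs typically optimizes. It is a minimal illustration and not the training code used for this model: the function name `dpo_loss`, the value `beta=0.1`, and the example log-probabilities are all assumptions made for the example.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss from summed sequence log-probabilities.

    beta is a hypothetical setting; the value used for this model is not stated.
    """
    # Implicit rewards: how much more likely each response is under the
    # policy than under the frozen reference model, scaled by beta.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward

    # -log(sigmoid(margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Illustrative numbers only: the policy has shifted toward the chosen answer.
loss_at_start = dpo_loss(-10.0, -12.0, -10.0, -12.0)   # policy == reference
loss_after_shift = dpo_loss(-9.0, -13.0, -10.0, -12.0)
print(loss_at_start, loss_after_shift)
```

When the policy equals the reference model the margin is zero and the loss is log 2; it falls as the policy assigns relatively more probability to the chosen response than to the rejected one.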
GGUF: https://huggingface.co/eren23/dpo-binarized-NeutrixOmnibe-7B-GGUF
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 76.31 |
| AI2 Reasoning Challenge (25-Shot) | 72.78 |
| HellaSwag (10-Shot) | 89.05 |
| MMLU (5-Shot) | 64.60 |
| TruthfulQA (0-shot) | 76.90 |
| Winogrande (5-shot) | 85.08 |
| GSM8k (5-shot) | 69.45 |
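
The reported average appears to be the unweighted mean of the six benchmark scores. A quick check, using only the values listed above (the dictionary name `scores` is just for illustration):

```python
# Benchmark scores as reported in the results table.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 72.78,
    "HellaSwag (10-Shot)": 89.05,
    "MMLU (5-Shot)": 64.60,
    "TruthfulQA (0-shot)": 76.90,
    "Winogrande (5-shot)": 85.08,
    "GSM8k (5-shot)": 69.45,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Avg. = {average:.2f}")
```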