---
language:
- en
license: cc-by-sa-4.0
library_name: transformers
datasets:
- mlabonne/chatml_dpo_pairs
- ResplendentAI/Synthetic_Soul_1k
model-index:
- name: Flora_DPO_7B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 71.76
      name: normalized accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 88.28
      name: normalized accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 64.13
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 71.08
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 84.53
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.81
      name: accuracy
---
# Flora DPO

Fine-tuned with this DPO dataset: https://huggingface.co/datasets/mlabonne/chatml_dpo_pairs
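
Since the model was tuned on ChatML-formatted preference pairs, ChatML is the natural prompt format at inference. Below is a minimal `transformers` sketch; the repo id `ResplendentAI/Flora_DPO_7B` is inferred from the metadata above, and the ChatML chat template is assumed to ship with the tokenizer:

```python
# Minimal inference sketch. Assumes the Hugging Face repo id is
# ResplendentAI/Flora_DPO_7B and that the tokenizer bundles a ChatML chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ResplendentAI/Flora_DPO_7B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "system", "content": "You are Flora, a helpful assistant."},
    {"role": "user", "content": "Summarize what DPO fine-tuning does."},
]
# Renders the messages with the tokenizer's chat template (ChatML here)
# and appends the assistant prefix so generation starts in the right place.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```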
Quants are available here:

- https://huggingface.co/solidrust/Flora-7B-DPO-AWQ
- https://huggingface.co/Test157t/ResplendentAI-Flora_DPO_7B-5bpw-exl2
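
The AWQ checkpoint is pre-quantized, so recent `transformers` releases can load it directly once the `autoawq` package is installed; a hedged sketch, assuming a CUDA device:

```python
# Sketch: loading the pre-quantized AWQ checkpoint via transformers.
# Requires `pip install autoawq` and a recent transformers release with AWQ support.
from transformers import AutoModelForCausalLM, AutoTokenizer

awq_id = "solidrust/Flora-7B-DPO-AWQ"
tokenizer = AutoTokenizer.from_pretrained(awq_id)
model = AutoModelForCausalLM.from_pretrained(awq_id, device_map="auto")
```

The 5bpw exl2 file is a different format and targets ExLlamaV2-based loaders rather than `transformers`.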
Detailed evaluation results can be found here:
| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 74.26 |
| AI2 Reasoning Challenge (25-Shot) | 71.76 |
| HellaSwag (10-Shot)               | 88.28 |
| MMLU (5-Shot)                     | 64.13 |
| TruthfulQA (0-shot)               | 71.08 |
| Winogrande (5-shot)               | 84.53 |
| GSM8k (5-shot)                    | 65.81 |
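
The Avg. row is simply the unweighted mean of the six benchmark scores, which is easy to sanity-check:

```python
# Sanity check: the leaderboard average is the plain mean of the six scores.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 71.76,
    "HellaSwag (10-Shot)": 88.28,
    "MMLU (5-Shot)": 64.13,
    "TruthfulQA (0-shot)": 71.08,
    "Winogrande (5-shot)": 84.53,
    "GSM8k (5-shot)": 65.81,
}
avg = sum(scores.values()) / len(scores)
print(f"Avg. = {avg:.3f}")  # 74.265, reported as 74.26 in the table
```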