---
license: cc-by-sa-4.0
library_name: transformers
datasets:
- unalignment/toxic-dpo-v0.2
- ResplendentAI/Synthetic_Soul_1k
model-index:
- name: Datura_7B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
    metrics:
    - type: acc_norm
      value: 72.1
      name: normalized accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
    metrics:
    - type: acc_norm
      value: 88.27
      name: normalized accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
    metrics:
    - type: acc
      value: 64.15
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
    metrics:
    - type: acc
      value: 84.53
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
    metrics:
    - type: acc
      value: 65.58
      name: accuracy
---
Datura 7B
Flora with a bit of toxicity.
I've been making progress with my collection of tools, so I thought maybe I'd try something a little more toxic for this space. This should make for a more receptive model with fewer refusals.
Detailed results can be found here
| Metric | Value |
|---|---:|
| Avg. | 74.28 |
| AI2 Reasoning Challenge (25-Shot) | 72.10 |
| HellaSwag (10-Shot) | 88.27 |
| MMLU (5-Shot) | 64.15 |
| TruthfulQA (0-shot) | 71.03 |
| Winogrande (5-shot) | 84.53 |
| GSM8k (5-shot) | 65.58 |
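
As a quick sanity check, the reported average can be reproduced from the six benchmark scores. The sketch below assumes the average is a plain unweighted mean rounded to two decimals. The function name `leaderboard_average` is hypothetical and not part of any library.

```python
from statistics import mean

# Scores copied from the results listed in this card.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 72.10,
    "HellaSwag (10-Shot)": 88.27,
    "MMLU (5-Shot)": 64.15,
    "TruthfulQA (0-shot)": 71.03,
    "Winogrande (5-shot)": 84.53,
    "GSM8k (5-shot)": 65.58,
}


def leaderboard_average(benchmark_scores):
    """Return the unweighted mean of the scores, rounded to two decimals.

    Assumption: every benchmark carries equal weight in the average.
    """
    return round(mean(benchmark_scores.values()), 2)


average = leaderboard_average(scores)
print(f"Avg. {average:.2f}")
```

Under that assumption the computed value matches the 74.28 reported above.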