---
license: apache-2.0
model-index:
- name: sheared-plus-westlake-normal
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
    metrics:
    - type: acc_norm
      value: 39.76
      name: normalized accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
    metrics:
    - type: acc_norm
      value: 70.33
      name: normalized accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
    metrics:
    - type: acc
      value: 26.81
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
    metrics:
    - type: acc
      value: 63.54
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
    metrics:
    - type: acc
      value: 0.0
      name: accuracy
---

Another trial of merging models of different sizes. It is still under testing and should be more stable, but I have no idea whether it improves or degrades the base model.
Recipe:
Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 41.16 |
| AI2 Reasoning Challenge (25-Shot) | 39.76 |
| HellaSwag (10-Shot) | 70.33 |
| MMLU (5-Shot) | 26.81 |
| TruthfulQA (0-shot) | 46.50 |
| Winogrande (5-shot) | 63.54 |
| GSM8k (5-shot) | 0.00 |
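
As a quick sanity check, the reported average appears to be the unweighted mean of the six benchmark scores listed above. A minimal sketch, assuming equal weighting; the `scores` dictionary and the `leaderboard_average` helper are illustrative names, not part of any evaluation tooling:

```python
# Scores copied from the results listed above (percentages).
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 39.76,
    "HellaSwag (10-Shot)": 70.33,
    "MMLU (5-Shot)": 26.81,
    "TruthfulQA (0-shot)": 46.50,
    "Winogrande (5-shot)": 63.54,
    "GSM8k (5-shot)": 0.00,
}


def leaderboard_average(benchmark_scores):
    """Return the unweighted mean of the benchmark scores (assumed weighting)."""
    return sum(benchmark_scores.values()) / len(benchmark_scores)


average = leaderboard_average(scores)
print(f"Avg. {average:.2f}")  # prints "Avg. 41.16"
```

The GSM8k score of 0.00 is included in the mean, which pulls the average well below the other five results.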