---
license: apache-2.0
model-index:
- name: sheared-plus-westlake-nearest-50_75p
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
    metrics:
    - type: acc_norm
      value: 36.18
      name: normalized accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
    metrics:
    - type: acc_norm
      value: 57.54
      name: normalized accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
    metrics:
    - type: acc
      value: 24.2
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
    metrics:
    - type: acc
      value: 56.75
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
    metrics:
    - type: acc
      value: 0.0
      name: accuracy
---
Another trial of merging models of different sizes. It is still under testing and should be more stable, but I have no idea whether it improves or degrades the base model.
In this version I changed the recipe to include more Westlake.
Recipe:
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 36.18 |
| AI2 Reasoning Challenge (25-Shot) | 36.18 |
| HellaSwag (10-Shot) | 57.54 |
| MMLU (5-Shot) | 24.20 |
| TruthfulQA (0-shot) | 42.39 |
| Winogrande (5-shot) | 56.75 |
| GSM8k (5-shot) | 0.00 |
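
The Avg. value appears to be the unweighted mean of the six benchmark scores, and after rounding it happens to equal the ARC score. A minimal Python sketch to check that arithmetic, assuming a plain mean and using the scores copied from the results above (the variable names are illustrative only):

```python
# Benchmark scores copied from the results above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 36.18,
    "HellaSwag (10-Shot)": 57.54,
    "MMLU (5-Shot)": 24.20,
    "TruthfulQA (0-shot)": 42.39,
    "Winogrande (5-shot)": 56.75,
    "GSM8k (5-shot)": 0.00,
}

# Assumption: "Avg." is the plain (unweighted) mean of the six scores.
average = sum(scores.values()) / len(scores)

print(f"Avg. = {average:.2f}")
```

Note that the 0.00 on GSM8k counts toward the mean like any other score, so it lowers the average noticeably.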