base_model, library_name, tags, license, language, model-index
base_model
library_name
tags
license
language
model-index
arcee-ai/Llama-3.1-SuperNova-Lite
deepseek-ai/DeepSeek-R1-Distill-Llama-8B
FuseAI/FuseChat-Llama-3.1-8B-Instruct
transformers
llama3.1
name
results
Llama3.1-SuperDeepFuse
task
dataset
metrics
source
type
name
text-generation
Text Generation
name
type
split
args
IFEval (0-Shot)
wis-k/instruction-following-eval
train
type
value
name
inst_level_strict_acc and prompt_level_strict_acc
77.62
averaged accuracy
task
dataset
metrics
source
type
name
text-generation
Text Generation
name
type
split
args
BBH (3-Shot)
SaylorTwift/bbh
test
type
value
name
acc_norm
29.22
normalized accuracy
task
dataset
metrics
source
type
name
text-generation
Text Generation
name
type
split
args
MATH Lvl 5 (4-Shot)
lighteval/MATH-Hard
test
type
value
name
exact_match
17.75
exact match
task
dataset
metrics
source
type
name
text-generation
Text Generation
name
type
split
args
GPQA (0-shot)
Idavidrein/gpqa
train
type
value
name
acc_norm
3.24
acc_norm
task
dataset
metrics
source
type
name
text-generation
Text Generation
name
type
args
MuSR (0-shot)
TAUR-Lab/MuSR
type
value
name
acc_norm
5.13
acc_norm
task
dataset
metrics
source
type
name
text-generation
Text Generation
name
type
config
split
args
MMLU-PRO (5-shot)
TIGER-Lab/MMLU-Pro
main
test
type
value
name
acc
30.83
accuracy
Llama3.1-SuperDeepFuse
An 8B parameter language model that merges three high-performance distilled models to boost reasoning, instruction-following, and performance in mathematics and coding.
Model Highlights
Key Capabilities
Enhanced multi-task reasoning
Improved mathematical and coding performance
Multilingual support
Performance Notes
Maintains Llama 3.1 safety standards
Suitable for consumer GPU deployment
Balanced performance across diverse tasks
Considerations
Still being benchmarked
Capabilities limited compared to larger model variants
Can give misleading output like all other language models
Outputs should be independently verified
Licensing
Follows standard Llama 3.1 usage terms.
Detailed results can be found here !
Summarized results can be found here !
Metric
Value (%)
Average
27.30
IFEval (0-Shot)
77.62
BBH (3-Shot)
29.22
MATH Lvl 5 (4-Shot)
17.75
GPQA (0-shot)
3.24
MuSR (0-shot)
5.13
MMLU-PRO (5-shot)
30.83