Files

ModelHub XC 7d0399742a 初始化项目，由ModelHub XC社区提供模型

Model: jan-hq/supermario-slerp-v3
Source: Original Platform

2026-05-06 08:51:46 +08:00

6.2 KiB

Raw Permalink Blame History

language, license, model-index

language

license

model-index

apache-2.0

name

results

supermario-slerp-v3

task

dataset

metrics

source

type	name
text-generation	Text Generation

name

type

config

split

args

AI2 Reasoning Challenge (25-Shot)

ai2_arc

ARC-Challenge

test

num_few_shot
25

type	value	name
acc_norm	69.28	normalized accuracy

url	name
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jan-hq/supermario-slerp-v3	Open LLM Leaderboard

task

dataset

metrics

source

type	name
text-generation	Text Generation

name

type

split

args

HellaSwag (10-Shot)

hellaswag

validation

num_few_shot
10

type	value	name
acc_norm	86.71	normalized accuracy

url	name
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jan-hq/supermario-slerp-v3	Open LLM Leaderboard

task

dataset

metrics

source

type	name
text-generation	Text Generation

name

type

config

split

args

MMLU (5-Shot)

cais/mmlu

all

test

num_few_shot
5

type	value	name
acc	65.11	accuracy

url	name
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jan-hq/supermario-slerp-v3	Open LLM Leaderboard

task

dataset

metrics

source

type	name
text-generation	Text Generation

name

type

config

split

args

TruthfulQA (0-shot)

truthful_qa

multiple_choice

validation

num_few_shot
0

type	value
mc2	61.77

url	name
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jan-hq/supermario-slerp-v3	Open LLM Leaderboard

task

dataset

metrics

source

type	name
text-generation	Text Generation

name

type

config

split

args

Winogrande (5-shot)

winogrande

winogrande_xl

validation

num_few_shot
5

type	value	name
acc	80.51	accuracy

url	name
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jan-hq/supermario-slerp-v3	Open LLM Leaderboard

task

dataset

metrics

source

type	name
text-generation	Text Generation

name

type

config

split

args

GSM8k (5-shot)

gsm8k

main

test

num_few_shot
5

type	value	name
acc	69.98	accuracy

url	name
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jan-hq/supermario-slerp-v3	Open LLM Leaderboard

Jan - Discord

Model Description

This model uses the Slerp merge method from our 2 best models in 12th Dec:

base model: supermario-slerp-v2

The yaml config file for this model is here:

slices:
  - sources:
      - model: janhq/supermario-slerp-v2
        layer_range: [0, 32]
      - model: janhq/supermario-v2
        layer_range: [0, 32]
merge_method: slerp
base_model: janhq/supermario-slerp-v2
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16

Run this model

You can run this model using Jan Desktop on Mac, Windows, or Linux.

Jan is an open source, ChatGPT alternative that is:

💻 100% offline on your machine: Your conversations remain confidential, and visible only to you.
🗂️ An Open File Format: Conversations and model settings stay on your computer and can be exported or deleted at any time.
🌐 OpenAI Compatible: Local server on port 1337 with OpenAI compatible endpoints
🌍 Open Source & Free: We build in public; check out our Github

About Jan

Jan believes in the need for an open-source AI ecosystem and is building the infra and tooling to allow open-source AIs to compete on a level playing field with proprietary ones.

Jan's long-term vision is to build a cognitive framework for future robots, who are practical, useful assistants for humans and businesses in everyday life.

Jan Model Merger

This is a test project for merging models.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here.

Metric	Value
Avg.	?
ARC (25-shot)	?
HellaSwag (10-shot)	?
MMLU (5-shot)	?
TruthfulQA (0-shot)	?
Winogrande (5-shot)	?
GSM8K (5-shot)	?

Acknowlegement

SLERP

lm-evaluation-harness

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	72.22
AI2 Reasoning Challenge (25-Shot)	69.28
HellaSwag (10-Shot)	86.71
MMLU (5-Shot)	65.11
TruthfulQA (0-shot)	61.77
Winogrande (5-shot)	80.51
GSM8k (5-shot)	69.98

6.2 KiB Raw Permalink Blame History

Model Description

Run this model

About Jan

Jan Model Merger

Open LLM Leaderboard Evaluation Results

Acknowlegement

Open LLM Leaderboard Evaluation Results

6.2 KiB

Raw Permalink Blame History