Files

ModelHub XC e4eee317b1 初始化项目，由ModelHub XC社区提供模型

Model: nbeerbower/Qwen2.5-Gutenberg-Doppel-14B
Source: Original Platform

2026-05-21 00:11:59 +08:00

4.1 KiB

Raw Blame History

license, library_name, base_model, datasets, language, model-index

license

library_name

base_model

datasets

language

model-index

apache-2.0

transformers

Qwen/Qwen2.5-14B-Instruct

jondurbin/gutenberg-dpo-v0.1

nbeerbower/gutenberg2-dpo

zho

eng

fra

spa

por

deu

ita

rus

jpn

kor

vie

tha

ara

name

results

Qwen2.5-Gutenberg-Doppel-14B

task

dataset

metrics

source

type	name
text-generation	Text Generation

name

type

args

IFEval (0-Shot)

HuggingFaceH4/ifeval

num_few_shot
0

type	value	name
inst_level_strict_acc and prompt_level_strict_acc	80.91	strict accuracy

url	name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=nbeerbower/Qwen2.5-Gutenberg-Doppel-14B	Open LLM Leaderboard

task

dataset

metrics

source

type	name
text-generation	Text Generation

name

type

args

BBH (3-Shot)

BBH

num_few_shot
3

type	value	name
acc_norm	48.24	normalized accuracy

url	name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=nbeerbower/Qwen2.5-Gutenberg-Doppel-14B	Open LLM Leaderboard

task

dataset

metrics

source

type	name
text-generation	Text Generation

name

type

args

MATH Lvl 5 (4-Shot)

hendrycks/competition_math

num_few_shot
4

type	value	name
exact_match	0.0	exact match

url	name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=nbeerbower/Qwen2.5-Gutenberg-Doppel-14B	Open LLM Leaderboard

task

dataset

metrics

source

type	name
text-generation	Text Generation

name

type

args

GPQA (0-shot)

Idavidrein/gpqa

num_few_shot
0

type	value	name
acc_norm	11.07	acc_norm

url	name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=nbeerbower/Qwen2.5-Gutenberg-Doppel-14B	Open LLM Leaderboard

task

dataset

metrics

source

type	name
text-generation	Text Generation

name

type

args

MuSR (0-shot)

TAUR-Lab/MuSR

num_few_shot
0

type	value	name
acc_norm	10.02	acc_norm

url	name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=nbeerbower/Qwen2.5-Gutenberg-Doppel-14B	Open LLM Leaderboard

task

dataset

metrics

source

type	name
text-generation	Text Generation

name

type

config

split

args

MMLU-PRO (5-shot)

TIGER-Lab/MMLU-Pro

main

test

num_few_shot
5

type	value	name
acc	43.57	accuracy

url	name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=nbeerbower/Qwen2.5-Gutenberg-Doppel-14B	Open LLM Leaderboard

Qwen2.5-Gutenberg-Doppel-14B

Qwen/Qwen2.5-14B-Instruct finetuned on jondurbin/gutenberg-dpo-v0.1 and nbeerbower/gutenberg2-dpo.

Method

ORPO tuned with 4x A40 for 3 epochs.

Thank you @ParasiticRogue for sponsoring.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	32.30
IFEval (0-Shot)	80.91
BBH (3-Shot)	48.24
MATH Lvl 5 (4-Shot)	0.00
GPQA (0-shot)	11.07
MuSR (0-shot)	10.02
MMLU-PRO (5-shot)	43.57

4.1 KiB Raw Blame History

Qwen2.5-Gutenberg-Doppel-14B

Method

Open LLM Leaderboard Evaluation Results

4.1 KiB

Raw Blame History