---
license: apache-2.0
library_name: transformers
base_model:
- Qwen/Qwen2.5-14B-Instruct
datasets:
- jondurbin/gutenberg-dpo-v0.1
- nbeerbower/gutenberg2-dpo
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
model-index:
- name: Qwen2.5-Gutenberg-Doppel-14B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 80.91
      name: strict accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
    metrics:
    - type: acc_norm
      value: 48.24
      name: normalized accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
    metrics:
    - type: exact_match
      value: 0.0
      name: exact match
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
    metrics:
    - type: acc_norm
      value: 11.07
      name: acc_norm
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
    metrics:
    - type: acc_norm
      value: 10.02
      name: acc_norm
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
    metrics:
    - type: acc
      value: 43.57
      name: accuracy
---

Qwen2.5-Gutenberg-Doppel-14B
Qwen/Qwen2.5-14B-Instruct fine-tuned on jondurbin/gutenberg-dpo-v0.1 and nbeerbower/gutenberg2-dpo.
Method
ORPO tuned on 4x A40 GPUs for 3 epochs.
Thank you @ParasiticRogue for sponsoring.
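For readers unfamiliar with ORPO, the objective named under Method can be sketched in a few lines. This is a minimal illustration of the published ORPO loss (a supervised term on the chosen response plus a weighted odds-ratio penalty), not the training code used for this model. The function names, the use of length-averaged log-probabilities as inputs, and the default weight `lam=0.1` are all assumptions made for the example; the weight actually used here is not stated.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) where p = exp(avg_logp) and avg_logp < 0."""
    # log1p(-exp(x)) computes log(1 - p) without forming 1 - p directly.
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_logp: float, rejected_logp: float, lam: float = 0.1) -> float:
    """Sketch of the ORPO loss for one preference pair.

    chosen_logp / rejected_logp: length-averaged log-probabilities of the
    chosen and rejected responses under the model (hypothetical inputs).
    lam: weight of the odds-ratio term (hypothetical default).
    """
    # Supervised term: negative log-likelihood of the chosen response.
    nll = -chosen_logp
    # Log odds ratio between the chosen and rejected responses.
    z = log_odds(chosen_logp) - log_odds(rejected_logp)
    # -log(sigmoid(z)) written as a numerically stable softplus(-z).
    odds_ratio_term = max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
    return nll + lam * odds_ratio_term
```

The penalty shrinks as the model assigns higher odds to the chosen response than to the rejected one, so a pair the model already ranks correctly contributes mostly the supervised term.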
Detailed results can be found here
| Metric | Value |
|---|---:|
| Avg. | 32.30 |
| IFEval (0-Shot) | 80.91 |
| BBH (3-Shot) | 48.24 |
| MATH Lvl 5 (4-Shot) | 0.00 |
| GPQA (0-shot) | 11.07 |
| MuSR (0-shot) | 10.02 |
| MMLU-PRO (5-shot) | 43.57 |
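As a quick sanity check, the Avg. row appears to be the unweighted mean of the six benchmark scores. The short sketch below recomputes it from the values in the table; the dictionary and variable names are illustrative only.

```python
# Scores copied from the results table above.
scores = {
    "IFEval (0-Shot)": 80.91,
    "BBH (3-Shot)": 48.24,
    "MATH Lvl 5 (4-Shot)": 0.00,
    "GPQA (0-shot)": 11.07,
    "MuSR (0-shot)": 10.02,
    "MMLU-PRO (5-shot)": 43.57,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 32.30
```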