---
license: apache-2.0
tags:
- safetensors
- mixtral
- not-for-all-audiences
- nsfw
model-index:
- name: InfinityKuno-2x7B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 69.62
      name: normalized accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 87.44
      name: normalized accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 64.49
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 82.72
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 66.34
      name: accuracy
---
# InfinityKuno-2x7B

Experimental model combining Endevor/InfinityRP-v1-7B and SanjiWatsuki/Kunoichi-DPO-v2-7B, merged into a 2x7B Mixture-of-Experts (MoE) model.
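MoE merges of this kind are commonly built with mergekit's MoE mode. The exact recipe for this model is not published, so the config below is only a hypothetical sketch: the gate mode, dtype, and routing prompts are all assumptions, not the settings actually used.

```yaml
# Hypothetical mergekit-moe recipe -- every field here is an assumption,
# not the published configuration for InfinityKuno-2x7B.
base_model: Endevor/InfinityRP-v1-7B
gate_mode: hidden        # route tokens by hidden-state similarity to the prompts below
dtype: bfloat16
experts:
  - source_model: Endevor/InfinityRP-v1-7B
    positive_prompts:
      - "roleplay"
  - source_model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    positive_prompts:
      - "reasoning"
```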
Prompt format:
Alpaca, Extended Alpaca, or Roleplay-Alpaca. (Any Alpaca-based prompt formatting should work fine.)
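As a minimal sketch, the standard Alpaca layout can be filled in like this (the system line is the common Alpaca preamble and can be adjusted for the extended or roleplay variants):

```python
# Minimal Alpaca-style prompt builder. The preamble below is the standard
# Alpaca system line; swap it out for extended/roleplay variants as needed.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Fill the Alpaca template with a user instruction."""
    return ALPACA_TEMPLATE.format(instruction=instruction)

if __name__ == "__main__":
    print(build_prompt("Write a short greeting."))
```

The model then generates its answer after the `### Response:` marker.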
Switch: FP16 - GGUF
Detailed results can be found here
| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 72.32 |
| AI2 Reasoning Challenge (25-Shot) | 69.62 |
| HellaSwag (10-Shot)               | 87.44 |
| MMLU (5-Shot)                     | 64.49 |
| TruthfulQA (0-shot)               | 63.28 |
| Winogrande (5-shot)               | 82.72 |
| GSM8k (5-shot)                    | 66.34 |