| Field | Value |
|---|---|
| license | cc-by-nc-nd-4.0 |
| model-index name | UNA-POLAR-10.7B-InstructMath-v2 |
| task | text-generation (Text Generation) |

| Dataset name | Dataset type | Config | Split | Metric type | Metric name | Value |
|---|---|---|---|---|---|---|
| AI2 Reasoning Challenge (25-Shot) | ai2_arc | ARC-Challenge | test | acc_norm | normalized accuracy | 70.73 |
| HellaSwag (10-Shot) | hellaswag | | validation | acc_norm | normalized accuracy | 88.2 |
| MMLU (5-Shot) | cais/mmlu | all | test | acc | accuracy | 66.03 |
| TruthfulQA (0-shot) | truthful_qa | multiple_choice | validation | | | |
| Winogrande (5-shot) | winogrande | winogrande_xl | validation | acc | accuracy | 82.95 |
| GSM8k (5-shot) | gsm8k | main | test | acc | accuracy | 64.75 |

UNA-POLAR-10.7B-InstructMath-v2
Model description
This is a UNA version of UNA-SOLAR-10.7B-Instruct-1.0, trained with DPO over MathPILE Books.
I used the MathPILE dataset, an outstanding collection of mathematics material, to produce this beautiful model :)
Intended uses & limitations
If your model includes UNA technology, please cite it.
Training and evaluation data
UNA-DPO over the attention and MLP layers.
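
This card does not describe how UNA changes training, so the following is only a sketch of the generic DPO objective for a single preference pair, not the UNA-DPO procedure itself. The function name, the argument names, and the default `beta` are illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Generic DPO loss for one (chosen, rejected) pair of sequence log-probs.

    Illustrative only: beta=0.1 is an assumed value, not one taken from this card.
    """
    # How much more (or less) likely each answer is under the policy
    # than under the frozen reference model.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # loss = -log(sigmoid(margin)), written to avoid overflow in exp().
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss is `log(2)` when the policy and the reference agree, and it shrinks as the policy assigns relatively more probability to the chosen answer than the reference does.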
Framework versions
- PEFT 0.7.1
- Transformers 4.36.2-UNA
- Pytorch 2.1.2+cu121
- Datasets 2.16.0
- Tokenizers 0.15.
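
To compare a local environment against the versions listed above, a small check along these lines may help. It is a sketch under assumptions: the PyPI package names (in particular `torch` for Pytorch) and the helper names are mine, and the comparison is a loose prefix match so that build suffixes such as `+cu121` do not count as mismatches. The `-UNA` suffix on Transformers is left out of the prefix because a stock install will not report it, and the Tokenizers version is kept exactly as listed.

```python
from importlib import metadata

# Versions copied from the list above. Keys are assumed PyPI package names.
PINNED = {
    "peft": "0.7.1",
    "transformers": "4.36.2",  # listed as 4.36.2-UNA (a custom build)
    "torch": "2.1.2",          # listed as 2.1.2+cu121
    "datasets": "2.16.0",
    "tokenizers": "0.15.",     # kept as listed
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version does not start with the pin
    to the version that was actually found (None if not installed)."""
    mismatches = {}
    for package, wanted in pinned.items():
        found = lookup(package)
        if found is None or not found.startswith(wanted):
            mismatches[package] = found
    return mismatches
```

Calling `find_mismatches(PINNED)` returns an empty dict when every listed package matches, and otherwise shows which packages differ and what is installed instead.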
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 74.07 |
| AI2 Reasoning Challenge (25-Shot) | 70.73 |
| HellaSwag (10-Shot) | 88.20 |
| MMLU (5-Shot) | 66.03 |
| TruthfulQA (0-shot) | 71.73 |
| Winogrande (5-shot) | 82.95 |
| GSM8k (5-shot) | 64.75 |
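
The reported average appears to be the unweighted mean of the six benchmark scores. The sketch below reproduces it; the half-up rounding to two decimals is my assumption about how the figure was rounded, and `Decimal` is used so the result does not depend on binary floating point.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores copied from the results table.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": Decimal("70.73"),
    "HellaSwag (10-Shot)": Decimal("88.20"),
    "MMLU (5-Shot)": Decimal("66.03"),
    "TruthfulQA (0-shot)": Decimal("71.73"),
    "Winogrande (5-shot)": Decimal("82.95"),
    "GSM8k (5-shot)": Decimal("64.75"),
}

# Unweighted mean: 444.39 / 6 = 74.065
average = sum(scores.values()) / len(scores)

# Assumed rounding rule: half-up to two decimal places.
rounded = average.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(rounded)  # prints 74.07
```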