license, license_name, license_link, model-index
| license |
license_name |
license_link |
model-index |
| other |
qwen |
LICENSE |
| name |
results |
| Qwen-14B-Llamafied |
| task |
dataset |
metrics |
source |
| type |
name |
| text-generation |
Text Generation |
|
| name |
type |
config |
split |
args |
| AI2 Reasoning Challenge (25-Shot) |
ai2_arc |
ARC-Challenge |
test |
|
|
| type |
value |
name |
| acc_norm |
55.2 |
normalized accuracy |
|
|
|
|
| task |
dataset |
metrics |
source |
| type |
name |
| text-generation |
Text Generation |
|
| name |
type |
split |
args |
| HellaSwag (10-Shot) |
hellaswag |
validation |
|
|
| type |
value |
name |
| acc_norm |
82.31 |
normalized accuracy |
|
|
|
|
| task |
dataset |
metrics |
source |
| type |
name |
| text-generation |
Text Generation |
|
| name |
type |
config |
split |
args |
| MMLU (5-Shot) |
cais/mmlu |
all |
test |
|
|
| type |
value |
name |
| acc |
66.11 |
accuracy |
|
|
|
|
| task |
dataset |
metrics |
source |
| type |
name |
| text-generation |
Text Generation |
|
| name |
type |
config |
split |
args |
| TruthfulQA (0-shot) |
truthful_qa |
multiple_choice |
validation |
|
|
|
|
|
| task |
dataset |
metrics |
source |
| type |
name |
| text-generation |
Text Generation |
|
| name |
type |
config |
split |
args |
| Winogrande (5-shot) |
winogrande |
winogrande_xl |
validation |
|
|
| type |
value |
name |
| acc |
76.56 |
accuracy |
|
|
|
|
| task |
dataset |
metrics |
source |
| type |
name |
| text-generation |
Text Generation |
|
| name |
type |
config |
split |
args |
| GSM8k (5-shot) |
gsm8k |
main |
test |
|
|
| type |
value |
name |
| acc |
52.77 |
accuracy |
|
|
|
|
|
|
|
This is a llamafied qwen-14b for compatibility with the llama software ecosystem.
I used this script to make the model and used the tokenizer of CausalLM, as suggested in the comments of the script.
https://github.com/hiyouga/LLaMA-Factory/blob/main/tests/llamafy_qwen.py
Detailed results can be found here
| Metric |
Value |
| Avg. |
63.09 |
| AI2 Reasoning Challenge (25-Shot) |
55.20 |
| HellaSwag (10-Shot) |
82.31 |
| MMLU (5-Shot) |
66.11 |
| TruthfulQA (0-shot) |
45.60 |
| Winogrande (5-shot) |
76.56 |
| GSM8k (5-shot) |
52.77 |