Qwen-1_8b-EverythingLM/README.md
ModelHub XC b188a66a47 Initialize project; model provided by the ModelHub XC community
Model: KnutJaegersberg/Qwen-1_8b-EverythingLM
Source: Original Platform
2026-06-10 22:41:23 +08:00

---
license: other
license_name: qwen
license_link: LICENSE
model-index:
- name: Qwen-1_8b-EverythingLM
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 38.65
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=KnutJaegersberg/Qwen-1_8b-EverythingLM
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 62.66
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=KnutJaegersberg/Qwen-1_8b-EverythingLM
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 44.94
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=KnutJaegersberg/Qwen-1_8b-EverythingLM
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 38.7
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=KnutJaegersberg/Qwen-1_8b-EverythingLM
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 58.96
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=KnutJaegersberg/Qwen-1_8b-EverythingLM
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 12.74
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=KnutJaegersberg/Qwen-1_8b-EverythingLM
      name: Open LLM Leaderboard
---

Qwen's noncommercial license applies.

### System:
You are an AI assistant. User will give you a task. Your goal is to complete the task as faithfully as you can. While performing the task think step-by-step and justify your steps.
### Instruction: 
How do you fine tune a large language model? 
### Response:
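
The prompt layout above can be assembled programmatically. The following is a minimal sketch, not the author's tooling: `build_prompt` and `DEFAULT_SYSTEM` are hypothetical names introduced here, and the exact whitespace between sections (single newlines, no trailing spaces) is an assumption.

```python
# Hypothetical helper: formats a request in the "### System / ### Instruction /
# ### Response" layout shown in this README. Names and exact whitespace are
# assumptions, not part of the model's official tooling.

DEFAULT_SYSTEM = (
    "You are an AI assistant. User will give you a task. "
    "Your goal is to complete the task as faithfully as you can. "
    "While performing the task think step-by-step and justify your steps."
)


def build_prompt(instruction: str, system: str = DEFAULT_SYSTEM) -> str:
    """Return a prompt string that ends right after the '### Response:' marker."""
    return (
        f"### System:\n{system.strip()}\n"
        f"### Instruction:\n{instruction.strip()}\n"
        "### Response:\n"
    )


if __name__ == "__main__":
    print(build_prompt("How do you fine tune a large language model?"))
```

The returned string ends immediately after the response marker, so the model's generated text would be read as the continuation of that section.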

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 42.77 |
| AI2 Reasoning Challenge (25-Shot) | 38.65 |
| HellaSwag (10-Shot) | 62.66 |
| MMLU (5-Shot) | 44.94 |
| TruthfulQA (0-shot) | 38.70 |
| Winogrande (5-shot) | 58.96 |
| GSM8k (5-shot) | 12.74 |
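
The reported average appears consistent with an unweighted mean of the six benchmark scores. A quick check, assuming that is how it was computed (the `scores` dictionary and `mean_score` helper are illustrative names, not part of any leaderboard API):

```python
# Scores copied from the results listed above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 38.65,
    "HellaSwag (10-Shot)": 62.66,
    "MMLU (5-Shot)": 44.94,
    "TruthfulQA (0-shot)": 38.70,
    "Winogrande (5-shot)": 58.96,
    "GSM8k (5-shot)": 12.74,
}


def mean_score(values):
    """Unweighted arithmetic mean of a sequence of scores."""
    values = list(values)
    return sum(values) / len(values)


if __name__ == "__main__":
    avg = mean_score(scores.values())
    # Compare against the reported "Avg." of 42.77, allowing for rounding.
    print(f"mean = {avg:.3f}, matches reported: {abs(avg - 42.77) < 0.01}")
```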