---
language:
  - en
  - de
license: mit
model-index:
  - name: LexGPT-V3
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 66.47
            name: normalized accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lex-hue/LexGPT-V3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 85.91
            name: normalized accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lex-hue/LexGPT-V3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 64.48
            name: accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lex-hue/LexGPT-V3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 59.98
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lex-hue/LexGPT-V3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 78.53
            name: accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lex-hue/LexGPT-V3
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 61.56
            name: accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lex-hue/LexGPT-V3
          name: Open LLM Leaderboard
---

# LexGPT-V3

This model was just a test train to see how our new training algorithm and data perform.

The model is based on Mistral v0.1.

As this was a test run, we simply evaluated it; the data is below. The model has not improved.
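Since the card gives the repo id `lex-hue/LexGPT-V3` and a Mistral v0.1 base, the weights should load with the standard `transformers` Auto classes. A minimal sketch, assuming standard causal-LM weights; the prompt handling and generation settings are assumptions, not from the card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "lex-hue/LexGPT-V3"  # repo id from this card

def load_lexgpt(device_map: str = "auto"):
    """Return (tokenizer, model); assumes standard Mistral-style causal-LM weights.
    device_map="auto" requires the `accelerate` package."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map=device_map
    )
    return tokenizer, model

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Greedy generation; the plain-text prompt format is an assumption
    (base Mistral v0.1 has no chat template)."""
    tokenizer, model = load_lexgpt()
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```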

| Model | Turn 1 Score | Turn 2 Score | Average Score |
|---|---|---|---|
| gpt-4 | 8.956250 | 9.025000 | 8.990625 |
| gpt-3.5-turbo | 8.075000 | 7.943750 | 7.943750 |
| claude-v1 | 8.150000 | 7.900000 | 8.025000 |
| LexGPT-V3 | 8.143750 | 7.719355 | 7.926667 |
| vicuna-13b-v1.3 | 6.812500 | 5.962500 | 6.387500 |


# Open LLM Leaderboard Evaluation Results

Detailed results can be found on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lex-hue/LexGPT-V3).

| Metric | Value |
|---|---|
| Avg. | 69.49 |
| AI2 Reasoning Challenge (25-Shot) | 66.47 |
| HellaSwag (10-Shot) | 85.91 |
| MMLU (5-Shot) | 64.48 |
| TruthfulQA (0-shot) | 59.98 |
| Winogrande (5-shot) | 78.53 |
| GSM8k (5-shot) | 61.56 |
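The reported average is just the arithmetic mean of the six benchmark scores, which is easy to check:

```python
# Open LLM Leaderboard scores reported above
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 66.47,
    "HellaSwag (10-Shot)": 85.91,
    "MMLU (5-Shot)": 64.48,
    "TruthfulQA (0-shot)": 59.98,
    "Winogrande (5-shot)": 78.53,
    "GSM8k (5-shot)": 61.56,
}
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 69.49, matching the reported Avg.
```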