License: cc-by-4.0
Model index: Dionysus-Mistral-m3-v5
Task: text-generation (Text Generation)
Source for all results: Open LLM Leaderboard, https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=decem/Dionysus-Mistral-m3-v5

Results:

  • AI2 Reasoning Challenge (25-Shot): dataset ai2_arc, config ARC-Challenge, split test, num_few_shot 25, acc_norm (normalized accuracy) 59.56
  • HellaSwag (10-Shot): dataset hellaswag, split validation, num_few_shot 10, acc_norm (normalized accuracy) 80.99
  • MMLU (5-Shot): dataset cais/mmlu, config all, split test, num_few_shot 5, acc (accuracy) 61.18
  • TruthfulQA (0-shot): dataset truthful_qa, config multiple_choice, split validation, num_few_shot 0, mc2 50.93
  • Winogrande (5-shot): dataset winogrande, config winogrande_xl, split validation, num_few_shot 5, acc (accuracy) 75.14
  • GSM8k (5-shot): dataset gsm8k, config main, split test, num_few_shot 5, acc (accuracy) 51.02

decem/Dionysus-Mistral-m3-v5 - A Fine-tuned Language Model

Model Details

  • Developer: DECEM
  • Fine-tuning Method: SFT
  • Language: English

Prompting

Alpaca-style prompt template:

### Instruction:

<prompt> (without the <>)

### Response:
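
The template above can be filled in programmatically. The following is a minimal sketch, not the author's own tooling: the helper name `build_alpaca_prompt` is hypothetical, and the exact whitespace (one blank line on each side of the instruction, a trailing newline after `### Response:`) is an assumption read off the layout shown in this card.

```python
def build_alpaca_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Alpaca-style template from this card.

    The instruction replaces the <prompt> placeholder, without angle brackets.
    """
    # Assumption: blank lines around the instruction and a trailing newline
    # after the response header, mirroring the layout shown in the card.
    return f"### Instruction:\n\n{instruction.strip()}\n\n### Response:\n"


# Hypothetical example instruction, for illustration only.
example_prompt = build_alpaca_prompt("Summarize the plot of Hamlet in two sentences.")
print(example_prompt)
```

The resulting string is what would be passed to the tokenizer or generation call; the model's completion is expected to follow the `### Response:` header.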

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric Value
Avg. 63.14
AI2 Reasoning Challenge (25-Shot) 59.56
HellaSwag (10-Shot) 80.99
MMLU (5-Shot) 61.18
TruthfulQA (0-shot) 50.93
Winogrande (5-shot) 75.14
GSM8k (5-shot) 51.02
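
The reported average can be checked against the six benchmark scores. This is a small sketch that assumes the average is the unweighted arithmetic mean of the listed values, which matches the figure in the table; the variable names are illustrative.

```python
# Scores copied from the table in this card.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 59.56,
    "HellaSwag (10-Shot)": 80.99,
    "MMLU (5-Shot)": 61.18,
    "TruthfulQA (0-shot)": 50.93,
    "Winogrande (5-shot)": 75.14,
    "GSM8k (5-shot)": 51.02,
}

# Assumption: the leaderboard average is the plain mean of the six scores.
average = sum(scores.values()) / len(scores)
print(f"Avg. {average:.2f}")
```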