license: apache-2.0
tags: merge
base_model: teknium/OpenHermes-2.5-Mistral-7B, Intel/neural-chat-7b-v3-3
model-index: OpenHermes-2.5-neural-chat-v3-3-Slerp (task: text-generation)

| Benchmark | Dataset | Config | Split | Few-shot | Metric | Value |
|---|---|---|---|---|---|---|
| AI2 Reasoning Challenge (25-Shot) | ai2_arc | ARC-Challenge | test | 25 | acc_norm (normalized accuracy) | 68.09 |
| HellaSwag (10-Shot) | hellaswag | | validation | 10 | acc_norm (normalized accuracy) | 86.2 |
| MMLU (5-Shot) | cais/mmlu | all | test | 5 | acc (accuracy) | 64.26 |
| TruthfulQA (0-shot) | truthful_qa | multiple_choice | validation | 0 | mc2 | 62.78 |
| Winogrande (5-shot) | winogrande | winogrande_xl | validation | 5 | acc (accuracy) | 79.16 |
| GSM8k (5-shot) | gsm8k | main | test | 5 | acc (accuracy) | 67.78 |

OpenHermes-2.5-neural-chat-v3-3-Slerp

This is the model card for OpenHermes-2.5-neural-chat-v3-3-Slerp, a SLERP merge of teknium/OpenHermes-2.5-Mistral-7B and Intel/neural-chat-7b-v3-3. I used mergekit to merge the models.

Prompt Templates

You can use these prompt templates, but I recommend using ChatML.

ChatML (OpenHermes-2.5-Mistral-7B):

<|im_start|>system
{system}<|im_end|>
<|im_start|>user
{user}<|im_end|>
<|im_start|>assistant
{assistant}<|im_end|>

neural-chat-7b-v3-3:

### System:
{system}
### User:
{user}
### Assistant:
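
The two templates above can be filled in with plain string formatting. The following is a minimal sketch for a single-turn prompt; the function names `build_chatml_prompt` and `build_neural_chat_prompt` are hypothetical helpers introduced here for illustration, and both leave the assistant turn open so the model can generate the reply.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Fill the ChatML template, ending where the assistant reply begins."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def build_neural_chat_prompt(system: str, user: str) -> str:
    """Fill the neural-chat-7b-v3-3 template, ending at the assistant header."""
    return (
        f"### System:\n{system}\n"
        f"### User:\n{user}\n"
        "### Assistant:\n"
    )


if __name__ == "__main__":
    print(build_chatml_prompt("You are a helpful assistant.", "What is SLERP?"))
```

If the tokenizer shipped with the model defines a chat template, using that instead of hand-built strings is likely the safer choice.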

YAML config to reproduce

slices:
  - sources:
      - model: teknium/OpenHermes-2.5-Mistral-7B
        layer_range: [0, 32]
      - model: Intel/neural-chat-7b-v3-3
        layer_range: [0, 32]
merge_method: slerp
base_model: mistralai/Mistral-7B-v0.1
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
dtype: bfloat16
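
To illustrate what the config above asks for, here is a rough sketch of spherical linear interpolation (SLERP) between two weight vectors, plus one way a list-valued `t` could be spread across layers. It assumes the list is interpolated linearly over layer depth; that is my reading of how such gradients behave, not mergekit's actual code. The names `layer_t` and `slerp` are hypothetical, and real merges operate on full tensors rather than short Python lists.

```python
import math
from typing import List, Sequence


def layer_t(gradient: Sequence[float], layer: int, num_layers: int) -> float:
    """Linearly interpolate a gradient list at a layer's relative depth (assumed behavior)."""
    if len(gradient) == 1 or num_layers == 1:
        return float(gradient[0])
    # Map the layer index onto the [0, len(gradient) - 1] range of anchor points.
    pos = layer / (num_layers - 1) * (len(gradient) - 1)
    lo = min(int(math.floor(pos)), len(gradient) - 2)
    frac = pos - lo
    return gradient[lo] * (1 - frac) + gradient[lo + 1] * frac


def slerp(t: float, v0: Sequence[float], v1: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Spherical interpolation: t=0 returns v0, t=1 returns v1."""
    def lerp() -> List[float]:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    if n0 < eps or n1 < eps:
        return lerp()  # a zero vector has no direction
    dot = sum(a * b for a, b in zip(v0, v1)) / (n0 * n1)
    dot = max(-1.0, min(1.0, dot))
    if abs(dot) > 0.9995:
        return lerp()  # nearly colinear: sin(theta) is too small to divide by
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


if __name__ == "__main__":
    self_attn = [0, 0.5, 0.3, 0.7, 1]  # the self_attn gradient from the config
    for layer in (0, 8, 16, 31):
        print(layer, round(layer_t(self_attn, layer, 32), 3))
    print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

Under this reading, the `self_attn` and `mlp` gradients mirror each other: where attention weights lean toward one model, the MLP weights lean toward the other, and all remaining tensors use the 0.5 fallback.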

Quantized versions

Quantized versions of this model are available thanks to TheBloke.

GPTQ
GGUF
AWQ

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 71.38 |
| ARC (25-shot) | 68.09 |
| HellaSwag (10-shot) | 86.2 |
| MMLU (5-shot) | 64.26 |
| TruthfulQA (0-shot) | 62.78 |
| Winogrande (5-shot) | 79.16 |
| GSM8K (5-shot) | 67.78 |
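
As a quick sanity check, the reported average appears to be the unweighted mean of the six benchmark scores. The short sketch below recomputes it from the values listed here; the variable names are illustrative only.

```python
# Scores copied from the results listed above.
scores = {
    "ARC (25-shot)": 68.09,
    "HellaSwag (10-shot)": 86.2,
    "MMLU (5-shot)": 64.26,
    "TruthfulQA (0-shot)": 62.78,
    "Winogrande (5-shot)": 79.16,
    "GSM8K (5-shot)": 67.78,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 71.38
```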

If you would like to support me:

Buy Me a Coffee

Description
Model synced from source: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp