ModelHub XC 34446516ca Initialize project; model provided by the ModelHub XC community

Model: giraffe176/WestMaid_HermesMonarchv0.1
Source: Original Platform

2026-04-17 03:09:49 +08:00

9.2 KiB

base_model:
  - mistralai/Mistral-7B-v0.1
  - argilla/distilabeled-OpenHermes-2.5-Mistral-7B
  - NeverSleep/Noromaid-7B-0.4-DPO
  - senseable/WestLake-7B-v2
  - mlabonne/AlphaMonarch-7B
library_name: transformers
tags:
  - mergekit
  - merge
license: cc-by-nc-4.0
model-index:
  - name: WestLake_Noromaid_OpenHermes_neural-chatv0.1
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: EQ-Bench
          type: eq-bench
          config: EQ-Bench
          split: v2.1
          args:
            num_few_shot: 3
        metrics:
          - type: acc_norm
            value: 77.19
            name: self-reported
        source:
          url: https://github.com/EQ-bench/EQ-Bench
          name: EQ-Bench v2.1
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 70.22
            name: normalized accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/WestMaid_HermesMonarchv0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 87.42
            name: normalized accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/WestMaid_HermesMonarchv0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 64.31
            name: accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/WestMaid_HermesMonarchv0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 61.99
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/WestMaid_HermesMonarchv0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 82.16
            name: accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/WestMaid_HermesMonarchv0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 69.6
            name: accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/WestMaid_HermesMonarchv0.1
          name: Open LLM Leaderboard

WestMaid_HermesMonarchv0.1

This model benchmarks well against other 7B models and posts exceptional MT-Bench and EQ-Bench v2.1 scores. It ranks above ChatGPT-3.5-turbo and Claude-1 on both tests, and above Goliath-120b and a number of 70B models on EQ-Bench.

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the DARE TIES merge method, with mistralai/Mistral-7B-v0.1 as the base. A single density was used for all of the merged models: after testing many values, I settled on 0.58 for each, as it returned the highest EQ-Bench score. Not much testing was done with the weights, but I thought I'd try gradients. Conceptually, WestLake and a distilled version of OpenHermes are weighted more heavily in the initial layers (guiding understanding and thoughts), before Noromaid and AlphaMonarch come in to guide its wants, reasoning, and conversation.
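To make the role of the density parameter concrete, here is a minimal sketch of the two ideas behind DARE TIES on plain Python lists: randomly dropping entries of each model's difference from the base and rescaling the survivors, then merging only the entries that agree with the elected sign. The function names and the toy 8-element "models" are hypothetical, and this is a simplification rather than mergekit's actual implementation (which works on tensors and has further options such as weight normalization).

```python
import random

def dare_delta(base, finetuned, density, rng):
    """Keep each entry of the task vector with probability `density`,
    rescaling kept entries by 1/density; dropped entries become 0."""
    out = []
    for b, f in zip(base, finetuned):
        delta = f - b  # task vector entry
        out.append(delta / density if rng.random() < density else 0.0)
    return out

def sign(x):
    return (x > 0) - (x < 0)

def ties_sign_merge(deltas, weights):
    """Per entry: elect the sign of the weighted sum, then add up only
    the weighted deltas whose sign matches the elected one."""
    merged = []
    for column in zip(*deltas):
        weighted = [w * d for w, d in zip(weights, column)]
        elected = sign(sum(weighted))
        merged.append(sum(v for v in weighted if sign(v) == elected))
    return merged

# Toy example: two hypothetical fine-tunes of an all-zero base.
rng = random.Random(0)
base = [0.0] * 8
finetuned = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(2)]
deltas = [dare_delta(base, f, density=0.58, rng=rng) for f in finetuned]
merged = [b + d for b, d in zip(base, ties_sign_merge(deltas, [0.5, 0.5]))]
```

With a density of 0.58, roughly 58% of each model's differences from the base survive, scaled up by 1/0.58 so that their expected contribution is unchanged.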

Models Merged

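The following models were included in the merge: senseable/WestLake-7B-v2, NeverSleep/Noromaid-7B-0.4-DPO, argilla/distilabeled-OpenHermes-2.5-Mistral-7B, and mlabonne/AlphaMonarch-7B.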

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: mistralai/Mistral-7B-v0.1
    # No parameters necessary for base model
  - model: senseable/WestLake-7B-v2
    parameters:
      density: 0.58
      weight: [0.50, 0.40, 0.25, 0.05]
  - model: NeverSleep/Noromaid-7B-0.4-DPO
    parameters:
      density: 0.58
      weight: [0.05, 0.05, 0.25, 0.40]
  - model: argilla/distilabeled-OpenHermes-2.5-Mistral-7B
    parameters:
      density: 0.58
      weight: [0.40, 0.50, 0.25, 0.05]
  - model: mlabonne/AlphaMonarch-7B
    parameters:
      density: 0.58
      weight: [0.05, 0.05, 0.25, 0.50]
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
dtype: bfloat16
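
The four-element weight lists in the configuration are gradients: mergekit interpolates them across the layers of the model rather than applying one value everywhere. The sketch below shows what that could look like, assuming simple linear interpolation between evenly spaced anchor points and 32 transformer layers for Mistral-7B. The helper layer_value is hypothetical, and mergekit's exact interpolation may differ in detail.

```python
def layer_value(gradient, layer, num_layers):
    """Linearly interpolate a list of anchor values at a given layer index.

    Anchors are assumed to be evenly spaced from the first layer to the last.
    """
    if len(gradient) == 1 or num_layers == 1:
        return gradient[0]
    pos = layer / (num_layers - 1) * (len(gradient) - 1)
    lo = min(int(pos), len(gradient) - 2)  # index of the left anchor
    frac = pos - lo                        # distance toward the right anchor
    return gradient[lo] * (1 - frac) + gradient[lo + 1] * frac

NUM_LAYERS = 32  # assumption: Mistral-7B's transformer layer count

westlake = [0.50, 0.40, 0.25, 0.05]       # heavy in early layers
alphamonarch = [0.05, 0.05, 0.25, 0.50]   # heavy in late layers

for layer in (0, 10, 21, 31):
    w = layer_value(westlake, layer, NUM_LAYERS)
    a = layer_value(alphamonarch, layer, NUM_LAYERS)
    print(f"layer {layer:2d}: WestLake {w:.3f}  AlphaMonarch {a:.3f}")
```

Under these assumptions WestLake's weight falls from 0.50 at the first layer to 0.05 at the last, while AlphaMonarch's rises from 0.05 to 0.50, which matches the intent described under Merge Method.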

Benchmark Testing

MT-Bench

EQ-Bench Leaderboard

Table of Benchmarks

Open LLM Leaderboard

Model	Average	ARC	HellaSwag	MMLU	TruthfulQA	Winogrande	GSM8K
giraffe176/WestMaid_HermesMonarchv0.1	72.62	70.22	87.42	64.31	61.99	82.16	69.6
AlphaMonarch-7B	75.99	73.04	89.18	64.4	77.91	84.69	66.72
senseable/WestLake-7B-v2	74.68	73.04	88.65	64.71	67.06	86.98	67.63
teknium/OpenHermes-2.5-Mistral-7B	61.52	64.93	84.18	63.64	52.24	78.06	26.08
NeverSleep/Noromaid-7B-0.4-DPO	59.08	62.29	84.32	63.2	42.28	76.95	25.47
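
The Average column appears to be the unweighted mean of the six task scores. As a quick check under that assumption, the merged model's row can be recomputed like this:

```python
# Scores for giraffe176/WestMaid_HermesMonarchv0.1 from the table above.
scores = {
    "ARC": 70.22,
    "HellaSwag": 87.42,
    "MMLU": 64.31,
    "TruthfulQA": 61.99,
    "Winogrande": 82.16,
    "GSM8K": 69.6,
}

average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 72.62
```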

Yet Another LLM Leaderboard benchmarks

Model	AGIEval	GPT4All	TruthfulQA	Bigbench	Average
WestMaid_HermesMonarchv0.1	45.34	76.33	61.99	46.02	57.42

Misc. Benchmarks

Model	MT-Bench	EQ-Bench v2.1
giraffe176/WestMaid_HermesMonarchv0.1	8.021875	77.19 (3 Shot, ooba)
AlphaMonarch-7B	7.928125	76.08
senseable/WestLake-7B-v2		78.7
teknium/OpenHermes-2.5-Mistral-7B		66.89
claude-v1	7.900000	76.83
gpt-3.5-turbo	7.943750	71.74
	(Paper)	(Paper) Leaderboard
