Initialize project; model provided by the ModelHub XC community

Model: giraffe176/WestMaid_HermesMonarchv0.1 Source: Original Platform
2026-04-17 03:09:49 +08:00
commit 34446516ca
11 changed files with 91511 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,229 @@
+---
+base_model:
+- mistralai/Mistral-7B-v0.1
+- argilla/distilabeled-OpenHermes-2.5-Mistral-7B
+- NeverSleep/Noromaid-7B-0.4-DPO
+- senseable/WestLake-7B-v2
+- mlabonne/AlphaMonarch-7B
+library_name: transformers
+tags:
+- mergekit
+- merge
+license: cc-by-nc-4.0
+model-index:
+- name: WestLake_Noromaid_OpenHermes_neural-chatv0.1
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: EQ-Bench
+      type: eq-bench
+      config: EQ-Bench
+      split: v2.1
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc_norm
+      value: 77.19
+      name: self-reported
+    source:
+      url: https://github.com/EQ-bench/EQ-Bench
+      name: EQ-Bench v2.1
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: AI2 Reasoning Challenge (25-Shot)
+      type: ai2_arc
+      config: ARC-Challenge
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: acc_norm
+      value: 70.22
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/WestMaid_HermesMonarchv0.1
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HellaSwag (10-Shot)
+      type: hellaswag
+      split: validation
+      args:
+        num_few_shot: 10
+    metrics:
+    - type: acc_norm
+      value: 87.42
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/WestMaid_HermesMonarchv0.1
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU (5-Shot)
+      type: cais/mmlu
+      config: all
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 64.31
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/WestMaid_HermesMonarchv0.1
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: TruthfulQA (0-shot)
+      type: truthful_qa
+      config: multiple_choice
+      split: validation
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: mc2
+      value: 61.99
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/WestMaid_HermesMonarchv0.1
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Winogrande (5-shot)
+      type: winogrande
+      config: winogrande_xl
+      split: validation
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 82.16
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/WestMaid_HermesMonarchv0.1
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GSM8k (5-shot)
+      type: gsm8k
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 69.6
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/WestMaid_HermesMonarchv0.1
+      name: Open LLM Leaderboard
+---
+# WestMaid_HermesMonarchv0.1
+
+<img src="https://cdn-uploads.huggingface.co/production/uploads/655a9883cbbaec115c3fd6b3/YJTMJZF80hKaKnPDu_yMV.png" alt="drawing" width="800"/>
+
+This model benchmarks well against other 7B models and has exceptional [MT-Bench](https://github.com/lm-sys/FastChat/tree/main/fastchat/llm_judge) and [EQ-Bench v2.1](https://github.com/EQ-bench/EQ-Bench) scores. It ranks higher than ChatGPT-3.5-turbo and Claude-1 in both tests, and higher than Goliath-120b and a number of 70B models in the latter.
+
+This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+## Merge Details
+### Merge Method
+
+This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) as the base.
+The same density was used for every model in this merge. After testing many densities, I settled on 0.58 for each model, as it returned the highest EQ-Bench score. Not much testing was done with the weights, but I thought I'd try gradients. Conceptually, WestLake and Argilla's distilabeled OpenHermes are weighted more heavily in the initial layers (guiding understanding and thoughts), before Noromaid and AlphaMonarch come in to guide its wants, reasoning, and conversation.
+
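The role of the density value can be illustrated with a minimal sketch of the drop-and-rescale step described in the DARE paper: each difference between a fine-tuned weight and the base weight is kept with probability `density` and the survivors are divided by `density`, so the expected update is unchanged. This is not mergekit's implementation, and it leaves out the TIES sign-election step entirely; the function name `dare_sparsify` and the toy `delta` values are hypothetical.

```python
import random


def dare_sparsify(delta, density, rng):
    """Keep each entry with probability `density`; rescale survivors by 1/density.

    Illustrative only: `delta` stands in for (fine-tuned weight - base weight).
    """
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    return [d / density if rng.random() < density else 0.0 for d in delta]


rng = random.Random(0)          # fixed seed so the sketch is repeatable
delta = [0.01] * 10_000         # hypothetical task-vector entries
sparse = dare_sparsify(delta, density=0.58, rng=rng)

# Fraction of entries that survived the random drop (close to the density).
kept_fraction = sum(1 for v in sparse if v != 0.0) / len(sparse)
# Ratio of the mean after rescaling to the original mean (close to 1).
mean_ratio = (sum(sparse) / len(sparse)) / 0.01
print(f"kept {kept_fraction:.3f} of entries, mean ratio {mean_ratio:.3f}")
```

With a density of 0.58, roughly 58% of each model's delta entries are retained before the models are combined.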
+
+
+### Models Merged
+
+The following models were included in the merge:
+* [mlabonne/AlphaMonarch-7B](https://huggingface.co/mlabonne/AlphaMonarch-7B)
+* [NeverSleep/Noromaid-7B-0.4-DPO](https://huggingface.co/NeverSleep/Noromaid-7B-0.4-DPO)
+* [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2)
+* [argilla/distilabeled-OpenHermes-2.5-Mistral-7B](https://huggingface.co/argilla/distilabeled-OpenHermes-2.5-Mistral-7B)
+
+### Configuration
+
+The following YAML configuration was used to produce this model:
+
+```yaml
+models:
+  - model: mistralai/Mistral-7B-v0.1
+    # No parameters necessary for base model
+  - model: senseable/WestLake-7B-v2
+    parameters:
+      density: 0.58
+      weight: [0.50, 0.40, 0.25, 0.05]
+  - model: NeverSleep/Noromaid-7B-0.4-DPO
+    parameters:
+      density: 0.58
+      weight: [0.05, 0.05, 0.25, 0.40]
+  - model: argilla/distilabeled-OpenHermes-2.5-Mistral-7B
+    parameters:
+      density: 0.58
+      weight: [0.40, 0.50, 0.25, 0.05]
+  - model: mlabonne/AlphaMonarch-7B
+    parameters:
+      density: 0.58
+      weight: [0.05, 0.05, 0.25, 0.50]
+merge_method: dare_ties
+base_model: mistralai/Mistral-7B-v0.1
+parameters:
+  int8_mask: true
+dtype: bfloat16
+
+```
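The four-element `weight` lists in the configuration above are gradients: mergekit interpolates list-valued parameters across the layers of the model. The sketch below shows one plausible reading, plain linear interpolation between evenly spaced anchor points over 32 transformer layers. Both the interpolation scheme and the helper name `interpolate_gradient` are assumptions made for illustration, not mergekit's actual code.

```python
def interpolate_gradient(anchors, num_layers):
    """Linearly interpolate evenly spaced anchor values across `num_layers` layers.

    Assumption: anchors are spread uniformly from the first layer to the last.
    """
    if len(anchors) < 2 or num_layers < 2:
        raise ValueError("need at least two anchors and two layers")
    last = len(anchors) - 1
    values = []
    for layer in range(num_layers):
        pos = layer / (num_layers - 1) * last   # position on the anchor axis
        lo = min(int(pos), last - 1)            # index of the left anchor
        frac = pos - lo                         # distance toward the right anchor
        values.append(anchors[lo] * (1 - frac) + anchors[lo + 1] * frac)
    return values


NUM_LAYERS = 32  # assumed layer count for a Mistral-7B-style model

westlake = interpolate_gradient([0.50, 0.40, 0.25, 0.05], NUM_LAYERS)
alphamonarch = interpolate_gradient([0.05, 0.05, 0.25, 0.50], NUM_LAYERS)

for layer in (0, 15, 31):
    print(layer, round(westlake[layer], 3), round(alphamonarch[layer], 3))
```

Under this reading, WestLake's weight falls from 0.50 at the first layer to 0.05 at the last, while AlphaMonarch's rises from 0.05 to 0.50, which matches the early-layer and late-layer roles described under Merge Method.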
+## Benchmark Testing
+### MT-Bench 
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/655a9883cbbaec115c3fd6b3/H2BLoovTbLg8d8mtFSKYB.png)
+
+### EQ-Bench Leaderboard
+
+<img src="https://cdn-uploads.huggingface.co/production/uploads/655a9883cbbaec115c3fd6b3/0Z6AIhaqCiKREf0fQEVqr.png" alt="drawing" width="800"/>
+
+
+### Table of Benchmarks
+
+#### Open LLM Leaderboard
+
+|                                                         | Average | ARC   | HellaSwag | MMLU  | TruthfulQA | Winogrande | GSM8K |
+|---------------------------------------------------------|---------|-------|-----------|-------|------------|------------|-------|
+| giraffe176/WestMaid_HermesMonarchv0.1                   | 72.62   | 70.22 | 87.42     | 64.31 | 61.99      | 82.16      | 69.6  |
+| AlphaMonarch-7B                                         | 75.99   | 73.04 | 89.18     | 64.4  | 77.91      | 84.69      | 66.72 |
+| senseable/WestLake-7B-v2                                | 74.68   | 73.04 | 88.65     | 64.71 | 67.06      | 86.98      | 67.63 |
+| teknium/OpenHermes-2.5-Mistral-7B                       | 61.52   | 64.93 | 84.18     | 63.64 | 52.24      | 78.06      | 26.08 |
+| NeverSleep/Noromaid-7B-0.4-DPO                          | 59.08   | 62.29 | 84.32     | 63.2  | 42.28      | 76.95      | 25.47 |
+
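The Average column appears to be the unweighted mean of the six benchmark scores. A quick check for this model's row, using only the values from the table (the variable names are illustrative):

```python
# Scores for giraffe176/WestMaid_HermesMonarchv0.1, copied from the table above.
scores = {
    "ARC": 70.22,
    "HellaSwag": 87.42,
    "MMLU": 64.31,
    "TruthfulQA": 61.99,
    "Winogrande": 82.16,
    "GSM8K": 69.6,
}

# Unweighted mean across the six benchmarks, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)
print(average)
```

The result matches the 72.62 reported in the Average column.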
+
+
+#### Yet Another LLM Leaderboard benchmarks
+
+|                                          Model                                           |AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
+|------------------------------------------------------------------------------------------|------:|------:|---------:|-------:|------:|
+|[WestMaid_HermesMonarchv0.1](https://huggingface.co/giraffe176/WestMaid_HermesMonarchv0.1)|  45.34|  76.33|     61.99|   46.02|  57.42|
+
+#### Misc. Benchmarks
+
+|                                                         | MT-Bench                                    | EQ-Bench v2.1                                                                   |
+|---------------------------------------------------------|---------------------------------------------|---------------------------------------------------------------------------------|
+| giraffe176/WestMaid_HermesMonarchv0.1                   | 8.021875                                    | 77.19 (3 Shot, ooba)                                                            |
+| AlphaMonarch-7B                                         | 7.928125                                    | 76.08                                                                           |
+| senseable/WestLake-7B-v2                                |                                             | 78.7                                                                            |
+| teknium/OpenHermes-2.5-Mistral-7B                       |                                             | 66.89                                                                           |
+| claude-v1                                               | 7.900000                                    | 76.83                                                                           |
+| gpt-3.5-turbo                                           | 7.943750                                    | 71.74                                                                           |
+|                                                         | [(Paper)](https://arxiv.org/abs/2306.05685) | [(Paper)](https://arxiv.org/abs/2312.06281) [Leaderboard](https://eqbench.com/) |
+