Initialize project; model provided by the ModelHub XC community

Model: sethuiyer/OpenDolphinHermes_Llama2_7B
Source: Original Platform
ModelHub XC
2026-06-15 07:05:17 +08:00
commit 7a55af502c
17 changed files with 93797 additions and 0 deletions

35
.gitattributes vendored Normal file

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

229
README.md Normal file

@@ -0,0 +1,229 @@
---
language:
- en
license: llama2
library_name: transformers
tags:
- merge
- mergekit
- lazymergekit
datasets:
- teknium/openhermes
- cognitivecomputations/dolphin
base_model:
- cognitivecomputations/dolphin-llama2-7b
- Tensoic/Llama-2-openhermes
pipeline_tag: text-generation
model-index:
- name: OpenDolphinHermes_Llama2_7B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 55.03
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/OpenDolphinHermes_Llama2_7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 78.74
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/OpenDolphinHermes_Llama2_7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 52.25
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/OpenDolphinHermes_Llama2_7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 46.1
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/OpenDolphinHermes_Llama2_7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 73.16
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/OpenDolphinHermes_Llama2_7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 20.17
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/OpenDolphinHermes_Llama2_7B
      name: Open LLM Leaderboard
---
# OpenDolphinHermes_Llama2_7B
<p align="center">
<img src="https://huggingface.co/sethuiyer/OpenDolphinHermes_Llama2_7B/resolve/main/dolphin_hermes.webp" height="256px" alt="SynthIQ">
</p>
A mergekit SLERP merge of the following two models:
* [cognitivecomputations/dolphin-llama2-7b](https://huggingface.co/cognitivecomputations/dolphin-llama2-7b)
* [Tensoic/Llama-2-openhermes](https://huggingface.co/Tensoic/Llama-2-openhermes)
## 🧩 Configuration
```yaml
slices:
  - sources:
      - model: cognitivecomputations/dolphin-llama2-7b
        layer_range: [0, 32]
      - model: Tensoic/Llama-2-openhermes
        layer_range: [0, 32]
merge_method: slerp
base_model: Tensoic/Llama-2-openhermes
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
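As a rough illustration of what this configuration asks for, here is a minimal pure-Python sketch of SLERP on two small vectors, together with one plausible way a list such as `[0, 0.5, 0.3, 0.7, 1]` can be spread across 32 layers by piecewise-linear interpolation. The helper names `slerp` and `layer_t` are hypothetical, and the layer-interpolation rule is an assumption; mergekit itself works on full weight tensors and may differ in detail.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        # A zero vector has no direction: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    # Cosine of the angle between the two directions, clamped for acos.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    theta = math.acos(max(-1.0, min(1.0, dot)))
    if math.sin(theta) < eps:
        # Nearly colinear vectors: SLERP degenerates to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / math.sin(theta)
    w1 = math.sin(t * theta) / math.sin(theta)
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def layer_t(anchors, layer_idx, num_layers):
    """Assumed rule: interpolate the anchor list linearly over the layer stack."""
    if num_layers == 1:
        return anchors[0]
    pos = layer_idx / (num_layers - 1) * (len(anchors) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(anchors) - 1)
    frac = pos - lo
    return (1 - frac) * anchors[lo] + frac * anchors[hi]


self_attn_anchors = [0, 0.5, 0.3, 0.7, 1]
schedule = [layer_t(self_attn_anchors, i, 32) for i in range(32)]
midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
print(schedule[0], schedule[-1], midpoint)
```

As I understand the mergekit documentation, `t = 0` returns the `base_model` weights and `t = 1` returns the other model, so the two filters above weight the attention and MLP blocks in opposite directions along the depth of the network, while all remaining tensors use `t = 0.5`.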
# Prompt Template (ChatML)
```text
<|im_start|>system
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.
Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content.
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct.
If you don't know the answer to a question, please don't share false information.
<|im_end|>
<|im_start|>user
{ .Prompt}
<|im_end|>
<|im_start|>assistant
```
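For readers who want to build this prompt without loading the tokenizer, the following is a minimal sketch that mirrors the `chat_template` string shipped in `tokenizer_config.json`. The function name `build_chatml_prompt` is hypothetical. Note that the shipped template places `<|im_end|>` directly after the message content, whereas the layout above shows it on its own line.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = []
    for message in messages:
        # One block per message: opening tag, role, newline, content, closing tag.
        parts.append(
            "<|im_start|>" + message["role"] + "\n"
            + message["content"] + "<|im_end|>" + "\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


example_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is a large language model?"},
]
prompt = build_chatml_prompt(example_messages)
print(prompt)
```

In normal use, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` should produce the same string, so this helper is mainly useful for inspecting or testing the format.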
# OpenLLM Leaderboard
| T | Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
|---|--------------------------------------------|---------|------|-----------|-------|------------|------------|-------|
| 0 | meta-llama/llama-2-13b-hf | 55.69 | 59.39 | 82.13 | 55.77 | 37.38 | 76.64 | 22.82 |
| 1 | sethuiyer/OpenDolphinHermes_Llama2_7B | 54.24 | 55.03| 78.74 | 52.25 | 46.1 | 73.16 | 20.17 |
| 2 | togethercomputer/Llama-2-7B-32K-Instruct | 50.02 | 51.11| 78.51 | 46.11 | 44.86 | 73.88 | 5.69 |
| 3 | togethercomputer/LLaMa-2-7B-32K | 47.07 | 47.53| 76.14 | 43.33 | 39.23 | 71.9 | 4.32 |
## Why?
I wanted a Llama 2 7B model that performs as well as the base Llama 2 13B model.
## 💻 Usage
```python
# In a notebook, install the dependencies first:
#   !pip install -qU transformers accelerate
from transformers import AutoTokenizer
import transformers
import torch

model = "sethuiyer/OpenDolphinHermes_Llama2_7B"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Render the conversation with the tokenizer's ChatML chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in half precision and let accelerate place it on the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample up to 256 new tokens and print the prompt plus completion.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
Output:
```text
A large language model is a type of artificial intelligence system that has been trained on a massive amount of data, often millions or even billions of words, to learn the patterns and relationships between words and phrases.
These models can then be used to generate new text, understand and translate languages, and perform various natural language processing tasks.
They have become increasingly popular in recent years due to advances in machine learning technology and their ability to achieve high levels of accuracy and performance on natural language processing tasks.
Examples of large language models include GPT-2, BERT, and T5.
```
## Thanks
Thanks to Google Colab for the compute.
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_sethuiyer__OpenDolphinHermes_Llama2_7B).
| Metric |Value|
|---------------------------------|----:|
|Avg. |54.24|
|AI2 Reasoning Challenge (25-Shot)|55.03|
|HellaSwag (10-Shot) |78.74|
|MMLU (5-Shot) |52.25|
|TruthfulQA (0-shot) |46.10|
|Winogrande (5-shot) |73.16|
|GSM8k (5-shot) |20.17|
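
The `Avg.` row is consistent with a plain arithmetic mean of the six benchmark scores. A minimal check, assuming equal weighting of all benchmarks:

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 55.03,
    "HellaSwag (10-Shot)": 78.74,
    "MMLU (5-Shot)": 52.25,
    "TruthfulQA (0-shot)": 46.10,
    "Winogrande (5-shot)": 73.16,
    "GSM8k (5-shot)": 20.17,
}

# Unweighted mean over all six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 54.24
```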

27
config.json Normal file

@@ -0,0 +1,27 @@
{
  "_name_or_path": "Tensoic/Llama-2-openhermes",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 11008,
  "max_position_embeddings": 4096,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 32,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.35.2",
  "use_cache": false,
  "vocab_size": 32000
}
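
The values in this config are enough to estimate the parameter count. The sketch below assumes the standard Llama layout (four attention projections, three MLP projections, and two RMSNorm weight vectors per layer, no biases since `attention_bias` is false, and a separate output head since `tie_word_embeddings` is false). The function name `count_llama_parameters` is hypothetical.

```python
# Relevant fields copied from config.json above.
config = {
    "hidden_size": 4096,
    "intermediate_size": 11008,
    "num_attention_heads": 32,
    "num_hidden_layers": 32,
    "num_key_value_heads": 32,
    "vocab_size": 32000,
    "tie_word_embeddings": False,
}


def count_llama_parameters(cfg):
    """Estimate the parameter count of a bias-free Llama-style decoder."""
    hidden = cfg["hidden_size"]
    head_dim = hidden // cfg["num_attention_heads"]
    kv_dim = head_dim * cfg["num_key_value_heads"]
    # q_proj and o_proj are hidden x hidden; k_proj and v_proj are hidden x kv_dim.
    attention = 2 * hidden * hidden + 2 * hidden * kv_dim
    # gate_proj, up_proj and down_proj.
    mlp = 3 * hidden * cfg["intermediate_size"]
    # Input and post-attention RMSNorm weights.
    norms = 2 * hidden
    per_layer = attention + mlp + norms
    embeddings = cfg["vocab_size"] * hidden
    lm_head = 0 if cfg["tie_word_embeddings"] else embeddings
    # The trailing `hidden` accounts for the final RMSNorm.
    return cfg["num_hidden_layers"] * per_layer + embeddings + lm_head + hidden


total = count_llama_parameters(config)
print(f"{total:,} parameters")             # 6,738,415,616 parameters
print(f"{2 * total:,} bytes in bfloat16")  # 13,476,831,232 bytes in bfloat16
```

At two bytes per parameter this comes to roughly 13.48 GB, which is in line with the combined size of the seven Git LFS objects listed later in this commit, presumably the weight shards.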

BIN
dolphin_hermes.webp Normal file

Binary file not shown. Size: 169 KiB

17
mergekit_config.yml Normal file

@@ -0,0 +1,17 @@
slices:
  - sources:
      - model: cognitivecomputations/dolphin-llama2-7b
        layer_range: [0, 32]
      - model: Tensoic/Llama-2-openhermes
        layer_range: [0, 32]
merge_method: slerp
base_model: Tensoic/Llama-2-openhermes
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9d7a337902882fe8bbbb99887dd4bea3cbcc272888e8c21d1a301a9ca3940336
size 1933652888


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:872d7cfca09c98920e0488078145ad323847d9cc2d9da27628579a2f7261b893
size 1933661192


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4b6bd3afeaed202d5e8fe4ca046fe3c41f70277759c9efe58df230b4eb4372c9
size 1933661176


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c07e8cc672de9cbb1e88d1cca6a54ad784001a282a6149142c112ab8dfe1d8fd
size 1990275936


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d0ed06ac8bbfcf679c5974a23cc07040e29d3d800ea4ac459aa9340448b0446c
size 1923175144


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1974e89153f4785fa1dd07fcf5fd29ad379b45aeb9738e3718a0dba063e081f4
size 1963003976


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:88f02c4da6533cf0b8e1b39ef03476f258737149ae90335868d05f6617813e52
size 1799434672
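
Each of the seven entries above is a Git LFS pointer: a three-line text stub holding the spec version, a SHA-256 object ID, and the size in bytes of the real file. The sketch below parses one pointer and totals the sizes listed in this commit. The function name `parse_lfs_pointer` is hypothetical, and the parser handles only the three keys shown here.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash and size fields."""
    # Each line is "<key> <value>", separated by a single space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:9d7a337902882fe8bbbb99887dd4bea3cbcc272888e8c21d1a301a9ca3940336
size 1933652888
"""
info = parse_lfs_pointer(pointer)

# Sizes of the seven LFS objects, in the order they appear above.
sizes = [
    1933652888,
    1933661192,
    1933661176,
    1990275936,
    1923175144,
    1963003976,
    1799434672,
]
total_bytes = sum(sizes)
print(info["algorithm"], info["size"])
print(f"{total_bytes:,} bytes in total")
```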

File diff suppressed because one or more lines are too long

30
special_tokens_map.json Normal file

@@ -0,0 +1,30 @@
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}

93391
tokenizer.json Normal file

File diff suppressed because it is too large

BIN
tokenizer.model (Stored with Git LFS) Normal file

Binary file not shown.

43
tokenizer_config.json Normal file

@@ -0,0 +1,43 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<s>",
  "chat_template": "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}",
  "clean_up_tokenization_spaces": false,
  "eos_token": "</s>",
  "legacy": false,
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": "</s>",
  "padding_side": "right",
  "sp_model_kwargs": {},
  "spaces_between_special_tokens": false,
  "tokenizer_class": "LlamaTokenizer",
  "trust_remote_code": false,
  "unk_token": "<unk>",
  "use_default_system_prompt": true,
  "use_fast": true
}
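
The `model_max_length` value above is not a real context limit. It equals `int(1e30)`, which appears to be the placeholder `transformers` writes when no explicit limit is configured. A short check, assuming the usable context is instead bounded by `max_position_embeddings` from `config.json`:

```python
# Value copied from tokenizer_config.json.
model_max_length = 1000000000000000019884624838656

# The odd-looking digits are just the float 1e30 converted to an integer.
is_placeholder = model_max_length == int(1e30)

# Value copied from config.json; assumed here to be the practical limit.
max_position_embeddings = 4096
effective_limit = min(model_max_length, max_position_embeddings)
print(is_placeholder, effective_limit)  # True 4096
```

When tokenizing long inputs, passing an explicit `max_length` along with `truncation=True` avoids relying on the placeholder.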