Initialize project; model provided by the ModelHub XC community

Model: louisbrulenaudet/Pearl-7B-slerp
Source: Original Platform
ModelHub XC
2026-05-31 12:22:12 +08:00
commit e5d21e6848
18 changed files with 337 additions and 0 deletions

58
.gitattributes vendored Normal file

@@ -0,0 +1,58 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bin.* filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zstandard filter=lfs diff=lfs merge=lfs -text
*.tfevents* filter=lfs diff=lfs merge=lfs -text
*.db* filter=lfs diff=lfs merge=lfs -text
*.ark* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*data* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.meta filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.index filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.gguf* filter=lfs diff=lfs merge=lfs -text
*.ggml filter=lfs diff=lfs merge=lfs -text
*.llamafile* filter=lfs diff=lfs merge=lfs -text
*.pt2 filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
model-00005-of-00008.safetensors filter=lfs diff=lfs merge=lfs -text
model-00007-of-00008.safetensors filter=lfs diff=lfs merge=lfs -text
model-00002-of-00008.safetensors filter=lfs diff=lfs merge=lfs -text
model-00006-of-00008.safetensors filter=lfs diff=lfs merge=lfs -text
model-00001-of-00008.safetensors filter=lfs diff=lfs merge=lfs -text
model-00008-of-00008.safetensors filter=lfs diff=lfs merge=lfs -text
model-00003-of-00008.safetensors filter=lfs diff=lfs merge=lfs -text
tokenizer.model filter=lfs diff=lfs merge=lfs -text
model-00004-of-00008.safetensors filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text

142
README.md Normal file

@@ -0,0 +1,142 @@
---
tags:
- merge
- mergekit
- Maths
- Mistral
base_model:
- mlabonne/OmniBeagle-7B
- WizardLM/WizardMath-7B-V1.1
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
model-index:
- name: Pearl-7B-slerp
  results:
  - task:
      type: text-generation
    metrics:
    - name: Average
      type: Average
      value: 72.75
    - name: ARC
      type: ARC
      value: 68.00
    - name: GSM8K
      type: GSM8K
      value: 73.62
    - name: Winogrande
      type: Winogrande
      value: 81.29
    - name: TruthfulQA
      type: TruthfulQA
      value: 62.35
    - name: HellaSwag
      type: HellaSwag
      value: 87.16
    source:
      name: Open LLM Leaderboard
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
---
<center><img src='https://i.imgur.com/0xFTuAX.png' width='450px'></center>
# Pearl-7B-slerp, an xtraordinary 7B model for maths
**03-22-2024 - To date, louisbrulenaudet/Pearl-34B-ties is the "Best 🤝 base merges and moerges model of around 30B" on the Open LLM Leaderboard.**
Pearl-7B-slerp is a merge of the following models:
* [mlabonne/OmniBeagle-7B](https://huggingface.co/mlabonne/OmniBeagle-7B)
* [WizardLM/WizardMath-7B-V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1)
### Evaluation
The evaluation was performed using the Hugging Face Open LLM Leaderboard.
| Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | Params (B) |
|-------------------------------------------|------------|-------|-----------|-------|------------|------------|-------|--------------|
| **louisbrulenaudet/Pearl-7B-slerp** |**72.75** | 68.00 | 87.16 | 64.04 | 62.35 | 81.29 |**73.62**| 7.24 |
| mistralai/Mixtral-8x7B-Instruct-v0.1 | 72.62 | 70.22 | 87.63 | 71.16 | 64.58 | 81.37 | 60.73 | 46.7 |
| microsoft/phi-2 | 61.33 | 61.09 | 75.11 | 58.11 | 44.47 | 74.35 | 54.81 | 2.78 |
| microsoft/Orca-2-13b | 58.64 | 60.67 | 79.81 | 60.37 | 56.41 | 76.64 | 17.97 | 13 |
| mistralai/Mistral-7B-Instruct-v0.1 | 54.96 | 54.52 | 75.63 | 55.38 | 56.28 | 73.72 | 14.25 | 7.24 |
| meta-llama/Llama-2-7b-hf | 50.97 | 53.07 | 78.59 | 46.87 | 38.76 | 74.03 | 14.48 | 6.74 |
Spherical Linear Interpolation (SLERP) interpolates between two vectors along the arc that joins them, maintaining a constant rate of change and preserving the geometric properties of the spherical space in which the vectors lie.
SLERP is preferred over plain linear interpolation for two reasons. Linearly interpolating between two vectors in a high-dimensional space can shrink the magnitude of the result, reducing the scale of the weights. In addition, a change in the direction of the weights often carries more meaningful information (for example, about learned features and representations) than a change in their magnitude.
$$\operatorname{slerp}(p_{0}, p_{1}; t) = \frac{\sin[(1-t)\Omega]}{\sin \Omega}\, p_{0} + \frac{\sin[t\Omega]}{\sin \Omega}\, p_{1}$$
Here Ω is the angle between p0 and p1, and t in [0, 1] is the interpolation factor.
The implementation of SLERP involves the following steps:
- Normalize the input vectors to unit length, ensuring they signify directions rather than magnitudes.
- Calculate the angle between these vectors using their dot product.
- If the vectors are nearly collinear, sin Ω approaches zero and the SLERP weights become numerically unstable, so the method falls back to linear interpolation. Otherwise, SLERP calculates scale factors from the interpolation factor t (where t=0 corresponds to 100% of the first vector, and t=1 corresponds to 100% of the second vector) and the angle between the vectors.
- Use these factors to weight the original vectors, then sum them to obtain the interpolated vector.
In essence, SLERP provides a robust mechanism for interpolating vectors, offering advantages in preserving directional information and mitigating issues associated with linear interpolation in high-dimensional spaces.
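The steps above can be sketched in a few lines of plain Python. This is a minimal illustration operating on lists of floats, not the code used to produce this merge; the function name `slerp`, the `dot_threshold` value of 0.9995, and the `eps` guard are assumptions chosen for the example.

```python
import math


def slerp(t, v0, v1, dot_threshold=0.9995, eps=1e-8):
    """Spherically interpolate between two equal-length vectors (lists of floats)."""
    # Step 1: normalize copies so the angle depends only on direction.
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    u0 = [x / (n0 + eps) for x in v0]
    u1 = [x / (n1 + eps) for x in v1]

    # Step 2: cosine of the angle between the two directions, clamped for acos.
    dot = sum(a * b for a, b in zip(u0, u1))
    dot = max(-1.0, min(1.0, dot))

    # Step 3: nearly collinear vectors make sin(omega) ~ 0, so use plain lerp.
    # (dot_threshold is an illustrative value, not taken from the README.)
    if abs(dot) > dot_threshold:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    omega = math.acos(dot)
    sin_omega = math.sin(omega)
    s0 = math.sin((1 - t) * omega) / sin_omega
    s1 = math.sin(t * omega) / sin_omega

    # Step 4: weight the original (unnormalized) vectors and sum them.
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]


# Halfway between two orthogonal unit vectors: the result keeps unit length,
# whereas linear interpolation would give [0.5, 0.5] with length ~0.707.
print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

In a real merge the same operation would be applied tensor by tensor to flattened weight matrices, typically with a tensor library rather than Python lists.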
## Configuration
```yaml
slices:
  - sources:
      - model: mlabonne/OmniBeagle-7B
        layer_range: [0, 32]
      - model: WizardLM/WizardMath-7B-V1.1
        layer_range: [0, 32]
merge_method: slerp
base_model: mlabonne/OmniBeagle-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
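The five-element `value` lists in the configuration act as anchor points that are spread across the 32 layers. The sketch below shows one plausible reading, assuming the anchors are linearly interpolated over relative layer depth; the helper `gradient_value` is hypothetical and the exact mapping used by mergekit may differ.

```python
def gradient_value(anchors, layer_idx, num_layers):
    """Linearly interpolate a list of anchor values at a layer's relative depth.

    Assumption: anchors are evenly spaced from the first to the last layer.
    """
    if len(anchors) == 1 or num_layers == 1:
        return anchors[0]
    # Position of this layer on the anchor axis, from 0 to len(anchors) - 1.
    pos = layer_idx / (num_layers - 1) * (len(anchors) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(anchors) - 1)
    frac = pos - lo
    return (1 - frac) * anchors[lo] + frac * anchors[hi]


self_attn_t = [0, 0.5, 0.3, 0.7, 1]  # from the configuration above
mlp_t = [1, 0.5, 0.7, 0.3, 0]        # from the configuration above

for layer in (0, 8, 16, 24, 31):
    print(
        layer,
        round(gradient_value(self_attn_t, layer, 32), 3),
        round(gradient_value(mlp_t, layer, 32), 3),
    )
```

Under this reading, and with t=0 meaning 100% of the first vector as described earlier, the self-attention weights start close to one parent in the early layers and end close to the other in the late layers, while the MLP weights follow the opposite schedule. Tensors matched by neither filter use the constant 0.5.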
## Usage
```python
# Install dependencies first (in a notebook: !pip install -qU transformers accelerate)
from transformers import AutoTokenizer
import transformers
import torch

model = "louisbrulenaudet/Pearl-7B-slerp"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Build the prompt string from the tokenizer's chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Create a text-generation pipeline; device_map="auto" places the weights automatically
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample up to 256 new tokens and print the generated text
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
## Citing & Authors
If you use this code in your research, please use the following BibTeX entry.
```BibTeX
@misc{louisbrulenaudet2023,
  author = {Louis Brulé Naudet},
  title = {Pearl-7B-slerp, an xtraordinary 7B model for maths},
  year = {2023},
  howpublished = {\url{https://huggingface.co/louisbrulenaudet/Pearl-7B-slerp}},
}
```
## Feedback
If you have any feedback, please reach out at [louisbrulenaudet@icloud.com](mailto:louisbrulenaudet@icloud.com).

25
config.json Normal file

@@ -0,0 +1,25 @@
{
"_name_or_path": "mlabonne/OmniBeagle-7B",
"architectures": [
"MistralForCausalLM"
],
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 10000.0,
"sliding_window": 4096,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.35.2",
"use_cache": true,
"vocab_size": 32000
}
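
As a sanity check, the values in this config.json can be turned into a parameter count. The sketch below assumes the usual Mistral layer layout (bias-free q/k/v/o projections, a gated MLP with three matrices, two RMSNorm weights per layer, plus a final norm); the helper `count_params` is hypothetical and written only for this illustration.

```python
# Values copied from config.json above
cfg = {
    "hidden_size": 4096,
    "intermediate_size": 14336,
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "vocab_size": 32000,
}


def count_params(cfg):
    """Count parameters, assuming a standard Mistral decoder layout."""
    h = cfg["hidden_size"]
    head_dim = h // cfg["num_attention_heads"]
    kv_dim = head_dim * cfg["num_key_value_heads"]  # grouped-query attention
    attn = 2 * h * h + 2 * h * kv_dim               # q_proj, o_proj + k_proj, v_proj
    mlp = 3 * h * cfg["intermediate_size"]          # gate_proj, up_proj, down_proj
    norms = 2 * h                                   # two RMSNorm weights per layer
    per_layer = attn + mlp + norms
    embeddings = 2 * cfg["vocab_size"] * h          # untied input and output embeddings
    return cfg["num_hidden_layers"] * per_layer + embeddings + h  # + final norm


total = count_params(cfg)
print(total)      # 7241732096
print(total * 2)  # bytes at 2 bytes per bfloat16 weight: 14483464192
```

The result, about 7.24 billion parameters, matches the "Params (B)" column of the README table, and the bfloat16 size of roughly 14.48 GB is in line with the combined size of the eight Git LFS pointers in this commit.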

1
configuration.json Normal file

@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}

17
mergekit_config.yml Normal file

@@ -0,0 +1,17 @@
slices:
  - sources:
      - model: mlabonne/OmniBeagle-7B
        layer_range: [0, 32]
      - model: WizardLM/WizardMath-7B-V1.1
        layer_range: [0, 32]
merge_method: slerp
base_model: mlabonne/OmniBeagle-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8c7de0e3b841f1a7dc5094086cb4aff27c227a0a94b8a36b4fb074b65b6294a3
size 1912681584

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6c3a1ab45b2f3872b49f5707d4005c0ebdf95147fe4335ddaa8634c55ce51711
size 1979781456

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ff34561db95ab77f54309734fdbcad73cef54d7dc82c63124464460f47709db4
size 1946243968

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cbc1440d8325fb7efedbf3ebe85c30d6b299ee67bc50e28850cc2f9704e38c41
size 1979781416

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:874a2392c16eef3cd55ef9786dadc7f256ab504ee8d48b7183e00c7629106504
size 1862349080

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4782c99c7ddc5d3fa78a4a68648226ac13f1a926fcb227f70b3963372c7d82f3
size 1916866512

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ab362fbe94d94a9c40dedf6411290a381dc22d38c199e283cb1d252cf2d1bee6
size 1979781456

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:47b08baf485bdef3ff9a7255250b10cb5459193e0872a18dbac91bc7253f5902
size 906012472

File diff suppressed because one or more lines are too long

23
special_tokens_map.json Normal file

@@ -0,0 +1,23 @@
{
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

BIN
tokenizer.json (Stored with Git LFS) Normal file

Binary file not shown.

BIN
tokenizer.model (Stored with Git LFS) Normal file

Binary file not shown.

40
tokenizer_config.json Normal file

@@ -0,0 +1,40 @@
{
"added_tokens_decoder": {
"0": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"additional_special_tokens": [],
"bos_token": "<s>",
"clean_up_tokenization_spaces": false,
"eos_token": "</s>",
"legacy": true,
"model_max_length": 1000000000000000019884624838656,
"pad_token": null,
"sp_model_kwargs": {},
"spaces_between_special_tokens": false,
"tokenizer_class": "LlamaTokenizer",
"unk_token": "<unk>",
"use_default_system_prompt": false
}