Initialize project; model provided by the ModelHub XC community

Model: RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-04-11 19:18:04 +08:00
commit 93455e5a5d
21 changed files with 522 additions and 0 deletions

.gitattributes vendored Normal file

@@ -0,0 +1,54 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q3_K_S.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q3_K.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q3_K_L.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.IQ4_XS.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.IQ4_NL.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q4_K.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q4_1.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q5_0.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q5_K.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q5_1.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:72ad47fd49cb4f43227d1f504d86a931345d2245f2c00e50b3bae74ec1ca96bf
size 6141602112


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0d84a0a9aa478f4be20d647abc0693b4c8cbc2d4d2c34c121ce1f0a706595789
size 5827651648


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b73f391555611e31e4f44b36ed55de2e47d856c1f35e6a7595932aab035a470f
size 4003242432


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5766e52f6b7f020b9b9679edec108189bd67539c087890b9ef3193e481c13b97
size 2483012864


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8c191608c7a46ccb772f841db2b4f4aeed641aeb6c583e9d8f193db62e996d5d
size 5650760960


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c3bc2cd072bef83af7672607713df6ffcfd8f41519fc09066a8496a98c1ac4a9
size 5195678976


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:892f9d76050b1c9edcecb2b84cbd9b332d0e2a8d0ec38d5b037051a6083874c3
size 4664575232


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0ab96bdb5aa43e08f7493a41384bb4f299511ce2ee51936ae54be91be1e93f79
size 6072396096


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:08b7d71409483b1a44bd34885e0b1e040a90e4ee166b51304f1752d0e94907ec
size 6734900032


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0f1f08da772d5787c63ebcda888a17c2ba652c3a8d512ba56efa2dbc24a6e37e
size 6461679936


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0f1f08da772d5787c63ebcda888a17c2ba652c3a8d512ba56efa2dbc24a6e37e
size 6461679936


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:565305238960a5438355465a0e09449d155249075924b85bb39a43a237cc81bd
size 6118533440


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f69168b46ec7a267852054094614637fcf27712f9f48b20700ccd870cade46ef
size 7397403968


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8e2bddd526f1b5279d7aa8a5e77defa24ccac281b44dd79f60d24f3c623d0372
size 8059907904


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8cbe2d1795a009bc3ea0b5fd3f6c90ad084bf9fb998cb399b4ac09afbfe5454a
size 7597944128


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8cbe2d1795a009bc3ea0b5fd3f6c90ad084bf9fb998cb399b4ac09afbfe5454a
size 7597944128


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a47a0ccb8f4b9d5649f4f8050d11e5ba929b0866aa28a9303542d4d23393aa63
size 7397403968


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:41a43dc5c8372534019f7fafbd0ab53ec47eeb44fd1850bf3cecd12bd24dd96f
size 8805224832


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b159b74c780a6da513df33ad2a6e868b25f1600b450bbc2be94b6946fd830b20
size 11404173568
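Each block above is a Git LFS pointer: three `key value` lines giving the spec version, the sha256 object id of the real payload, and its size in bytes. A minimal parser sketch (a hypothetical helper, not part of this repo's tooling):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields."""
    # Each line is "<key> <value>"; split on the first space only.
    fields = dict(line.partition(" ")[::2] for line in text.strip().splitlines())
    fields["size"] = int(fields["size"])  # payload size in bytes
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:b159b74c780a6da513df33ad2a6e868b25f1600b450bbc2be94b6946fd830b20
size 11404173568"""

info = parse_lfs_pointer(pointer)
assert info["size"] == 11404173568
```

Git stores only these small pointers in the repository; the actual multi-gigabyte `.gguf` payloads are fetched from the LFS object store on checkout.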

README.md Normal file

@@ -0,0 +1,411 @@
Quantization made by Richard Erkhov.
[Github](https://github.com/RichardErkhov)
[Discord](https://discord.gg/pvy7H8DZMG)
[Request more models](https://github.com/RichardErkhov/quant_request)
OpenChat-3.5-0106_10.7B_48Layers-Interleaved - GGUF
- Model creator: https://huggingface.co/Pretergeek/
- Original model: https://huggingface.co/Pretergeek/OpenChat-3.5-0106_10.7B_48Layers-Interleaved/
| Name | Quant method | Size |
| ---- | ---- | ---- |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q2_K.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q2_K.gguf) | Q2_K | 3.73GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q3_K_S.gguf) | Q3_K_S | 4.34GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q3_K.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q3_K.gguf) | Q3_K | 2.31GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q3_K_M.gguf) | Q3_K_M | 4.84GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q3_K_L.gguf) | Q3_K_L | 5.26GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.IQ4_XS.gguf) | IQ4_XS | 5.43GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q4_0.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q4_0.gguf) | Q4_0 | 5.66GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.IQ4_NL.gguf) | IQ4_NL | 5.72GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q4_K_S.gguf) | Q4_K_S | 5.7GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q4_K.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q4_K.gguf) | Q4_K | 6.02GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q4_K_M.gguf) | Q4_K_M | 6.02GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q4_1.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q4_1.gguf) | Q4_1 | 6.27GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q5_0.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q5_0.gguf) | Q5_0 | 6.89GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q5_K_S.gguf) | Q5_K_S | 6.89GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q5_K.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q5_K.gguf) | Q5_K | 7.08GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q5_K_M.gguf) | Q5_K_M | 7.08GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q5_1.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q5_1.gguf) | Q5_1 | 7.51GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q6_K.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q6_K.gguf) | Q6_K | 8.2GB |
| [OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q8_0.gguf](https://huggingface.co/RichardErkhov/Pretergeek_-_OpenChat-3.5-0106_10.7B_48Layers-Interleaved-gguf/blob/main/OpenChat-3.5-0106_10.7B_48Layers-Interleaved.Q8_0.gguf) | Q8_0 | 10.62GB |
Original model description:
---
license: apache-2.0
library_name: transformers
tags:
- mergekit
- merge
base_model:
- openchat/openchat-3.5-0106
model-index:
- name: OpenChat-3.5-0106_10.7B_48Layers-Interleaved
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: IFEval (0-Shot)
type: HuggingFaceH4/ifeval
args:
num_few_shot: 0
metrics:
- type: inst_level_strict_acc and prompt_level_strict_acc
value: 59.61
name: strict accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Pretergeek/OpenChat-3.5-0106_10.7B_48Layers-Interleaved
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: BBH (3-Shot)
type: BBH
args:
num_few_shot: 3
metrics:
- type: acc_norm
value: 24.06
name: normalized accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Pretergeek/OpenChat-3.5-0106_10.7B_48Layers-Interleaved
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MATH Lvl 5 (4-Shot)
type: hendrycks/competition_math
args:
num_few_shot: 4
metrics:
- type: exact_match
value: 6.8
name: exact match
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Pretergeek/OpenChat-3.5-0106_10.7B_48Layers-Interleaved
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: GPQA (0-shot)
type: Idavidrein/gpqa
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 7.27
name: acc_norm
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Pretergeek/OpenChat-3.5-0106_10.7B_48Layers-Interleaved
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MuSR (0-shot)
type: TAUR-Lab/MuSR
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 11.78
name: acc_norm
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Pretergeek/OpenChat-3.5-0106_10.7B_48Layers-Interleaved
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU-PRO (5-shot)
type: TIGER-Lab/MMLU-Pro
config: main
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 25.54
name: accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Pretergeek/OpenChat-3.5-0106_10.7B_48Layers-Interleaved
name: Open LLM Leaderboard
---
<p align="center">
<a href="https://ko-fi.com/pretergeek">Buy me a Ko-Fi</a>
<a href="https://patreon.com/Pretergeek">Support my work using Patreon</a>
</p>
# OpenChat-3.5-0106_10.7B_48Layers-Interleaved
This is NOT your usual frankenmerge created using [mergekit](https://github.com/cg123/mergekit).
## Merge Details
### Merge Method
This model was merged using the passthrough merge method, but employing the Block Expansion method described in the paper [LLaMA Pro: Progressive LLaMA with Block Expansion](https://arxiv.org/abs/2401.02415).
The paper's authors added new layers interleaved between the original layers of the model, initializing the parameters of each new layer's o_proj and down_proj projections to zero. Because of the residual connections, these new layers simply output their input (as if they were "transparent"), so the model remains functional even without further training. The new layers can then be targeted during training or fine-tuning without risking catastrophic forgetting, provided you follow the authors' method of freezing the original layers and training only the new ones.
This model has not yet received additional training, so it should perform close to the original model.
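The "transparent" behavior can be checked with a toy block (a minimal sketch with norms and softmax omitted, not the actual model code): in a residual block whose attention path ends in o_proj and whose MLP path ends in down_proj, zeroing those two projections makes the whole block the identity.

```python
import numpy as np

def expanded_block(x, W_qkv, W_o, W_up, W_down):
    # Toy residual transformer block: the attention path ends in o_proj (W_o),
    # the MLP path ends in down_proj (W_down); each sub-layer adds a residual.
    h = x + (x @ W_qkv) @ W_o                     # attention sub-layer + residual
    return h + np.maximum(h @ W_up, 0) @ W_down   # MLP sub-layer + residual

rng = np.random.default_rng(0)
d, d_ff = 8, 16
x = rng.normal(size=(4, d))

# Block Expansion initialization: o_proj and down_proj are zero matrices,
# while the remaining weights keep arbitrary (copied) values.
W_qkv = rng.normal(size=(d, d))
W_up = rng.normal(size=(d, d_ff))
W_o = np.zeros((d, d))
W_down = np.zeros((d_ff, d))

y = expanded_block(x, W_qkv, W_o, W_up, W_down)
assert np.allclose(y, x)  # the new block passes its input through unchanged
```

Since each sub-layer's contribution is multiplied by a zero matrix before the residual add, the block reduces to `y = x`, which is exactly why the expanded model still works before any training.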
### Models Merged
The following models were included in the merge:
* [openchat/openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
slices:
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [0, 2]
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [1, 2]
parameters:
scale:
- filter: o_proj
value: 0.0
- filter: down_proj
value: 0.0
- value: 1.0
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [2, 4]
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [3, 4]
parameters:
scale:
- filter: o_proj
value: 0.0
- filter: down_proj
value: 0.0
- value: 1.0
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [4, 6]
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [5, 6]
parameters:
scale:
- filter: o_proj
value: 0.0
- filter: down_proj
value: 0.0
- value: 1.0
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [6, 8]
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [7, 8]
parameters:
scale:
- filter: o_proj
value: 0.0
- filter: down_proj
value: 0.0
- value: 1.0
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [8, 10]
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [9, 10]
parameters:
scale:
- filter: o_proj
value: 0.0
- filter: down_proj
value: 0.0
- value: 1.0
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [10, 12]
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [11, 12]
parameters:
scale:
- filter: o_proj
value: 0.0
- filter: down_proj
value: 0.0
- value: 1.0
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [12, 14]
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [13, 14]
parameters:
scale:
- filter: o_proj
value: 0.0
- filter: down_proj
value: 0.0
- value: 1.0
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [14, 16]
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [15, 16]
parameters:
scale:
- filter: o_proj
value: 0.0
- filter: down_proj
value: 0.0
- value: 1.0
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [16, 18]
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [17, 18]
parameters:
scale:
- filter: o_proj
value: 0.0
- filter: down_proj
value: 0.0
- value: 1.0
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [18, 20]
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [19, 20]
parameters:
scale:
- filter: o_proj
value: 0.0
- filter: down_proj
value: 0.0
- value: 1.0
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [20, 22]
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [21, 22]
parameters:
scale:
- filter: o_proj
value: 0.0
- filter: down_proj
value: 0.0
- value: 1.0
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [22, 24]
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [23, 24]
parameters:
scale:
- filter: o_proj
value: 0.0
- filter: down_proj
value: 0.0
- value: 1.0
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [24, 26]
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [25, 26]
parameters:
scale:
- filter: o_proj
value: 0.0
- filter: down_proj
value: 0.0
- value: 1.0
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [26, 28]
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [27, 28]
parameters:
scale:
- filter: o_proj
value: 0.0
- filter: down_proj
value: 0.0
- value: 1.0
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [28, 30]
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [29, 30]
parameters:
scale:
- filter: o_proj
value: 0.0
- filter: down_proj
value: 0.0
- value: 1.0
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [30, 32]
- sources:
- model: openchat/openchat-3.5-0106
layer_range: [31, 32]
parameters:
scale:
- filter: o_proj
value: 0.0
- filter: down_proj
value: 0.0
- value: 1.0
merge_method: passthrough
dtype: bfloat16
```
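The configuration above is highly regular, so it can be generated programmatically. The sketch below (my own illustration, not the author's actual tooling; mergekit itself consumes YAML, so the resulting dict would still need to be serialized) appends, after every pair of original layers, a zero-scaled copy of the pair's second layer:

```python
def make_slices(model="openchat/openchat-3.5-0106", n_layers=32):
    """Build the interleaved slice list: after each pair of original layers
    [2i, 2i+2), duplicate layer 2i+1 with o_proj/down_proj scaled to zero."""
    slices = []
    for start in range(0, n_layers, 2):
        # two original layers, passed through unchanged
        slices.append({"sources": [{"model": model,
                                    "layer_range": [start, start + 2]}]})
        # one duplicated layer with zeroed o_proj/down_proj (Block Expansion)
        slices.append({"sources": [{
            "model": model,
            "layer_range": [start + 1, start + 2],
            "parameters": {"scale": [
                {"filter": "o_proj", "value": 0.0},
                {"filter": "down_proj", "value": 0.0},
                {"value": 1.0},
            ]},
        }]})
    return slices

config = {"slices": make_slices(),
          "merge_method": "passthrough",
          "dtype": "bfloat16"}

# 16 original pairs + 16 duplicated layers = 48 layers in the merged model.
n_layers_out = sum(s["sources"][0]["layer_range"][1] -
                   s["sources"][0]["layer_range"][0]
                   for s in config["slices"])
assert n_layers_out == 48
```

This also makes the layer count in the model name explicit: 32 original layers plus 16 interleaved copies gives the 48 layers (roughly 10.7B parameters) of the merged model.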
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Pretergeek__OpenChat-3.5-0106_10.7B_48Layers-Interleaved)
| Metric |Value|
|-------------------|----:|
|Avg. |22.51|
|IFEval (0-Shot) |59.61|
|BBH (3-Shot) |24.06|
|MATH Lvl 5 (4-Shot)| 6.80|
|GPQA (0-shot) | 7.27|
|MuSR (0-shot) |11.78|
|MMLU-PRO (5-shot) |25.54|
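The reported average is the plain arithmetic mean of the six benchmark scores, which is easy to verify:

```python
# Leaderboard scores from the table above.
scores = {"IFEval (0-Shot)": 59.61, "BBH (3-Shot)": 24.06,
          "MATH Lvl 5 (4-Shot)": 6.80, "GPQA (0-shot)": 7.27,
          "MuSR (0-shot)": 11.78, "MMLU-PRO (5-shot)": 25.54}

avg = sum(scores.values()) / len(scores)
assert abs(avg - 22.51) < 0.005  # matches the reported Avg. of 22.51
```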
## Citation
```
@misc{wu2024llamaproprogressivellama,
title={LLaMA Pro: Progressive LLaMA with Block Expansion},
author={Chengyue Wu and Yukang Gan and Yixiao Ge and Zeyu Lu and Jiahao Wang and Ye Feng and Ying Shan and Ping Luo},
year={2024},
eprint={2401.02415},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2401.02415},
}
```