初始化项目,由ModelHub XC社区提供模型

Model: Undi95/MLewd-ReMM-L2-Chat-20B
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-08 02:07:08 +08:00
commit a9fe42c2a0
13 changed files with 93658 additions and 0 deletions

35
.gitattributes vendored Normal file
View File

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

130
README.md Normal file
View File

@@ -0,0 +1,130 @@
---
license: cc-by-nc-4.0
tags:
- not-for-all-audiences
- nsfw
---
First :
```shell
layer_slices:
- model: Undi95/MLewd-L2-Chat-13B
start: 0
end: 16
- model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
start: 8
end: 20
- model: Undi95/MLewd-L2-Chat-13B
start: 17
end: 32
- model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
start: 21
end: 40
```
Inverted:
```shell
layer_slices:
- model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
start: 0
end: 16
- model: Undi95/MLewd-L2-Chat-13B
start: 8
end: 20
- model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
start: 17
end: 32
- model: Undi95/MLewd-L2-Chat-13B
start: 21
end: 40
```
Precise:
```shell
layer_slices:
- model: Undi95/MLewd-L2-Chat-13B
start: 0
end: 8
- model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
start: 4
end: 12
- model: Undi95/MLewd-L2-Chat-13B
start: 9
end: 16
- model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
start: 13
end: 22
- model: Undi95/MLewd-L2-Chat-13B
start: 17
end: 24
- model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
start: 23
end: 32
- model: Undi95/MLewd-L2-Chat-13B
start: 25
end: 32
- model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
start: 33
end: 40
```
PreciseInverted:
```shell
layer_slices:
- model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
start: 0
end: 8
- model: Undi95/MLewd-L2-Chat-13B
start: 4
end: 12
- model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
start: 9
end: 16
- model: Undi95/MLewd-L2-Chat-13B
start: 13
end: 22
- model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
start: 17
end: 24
- model: Undi95/MLewd-L2-Chat-13B
start: 23
end: 32
- model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
start: 25
end: 32
- model: Undi95/MLewd-L2-Chat-13B
start: 33
end: 40
```
Part1 = ReMM v2.1 merged /w MLewd low weight to keep consistency. I call this "dilution" and result show consistency and coherency without repeat/loop beside the small amount of duplicated datas.
The goal is to find the best way to interlace layers the best way possible to have a sweetspot between 13B and +30B.
Normal/Inverted is by chunk of 16 layers and Precise/PreciseInverted is by chunk of 8 layers.
All the models are made of 64(+1) layers. Need testing.
## Prompt template: Alpaca
```
Below is an instruction that describes a task. Write a response that completes the request.
### Instruction:
{prompt}
### Response:
```
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Undi95__MLewd-ReMM-L2-Chat-20B)
| Metric | Value |
|-----------------------|---------------------------|
| Avg. | 53.33 |
| ARC (25-shot) | 62.46 |
| HellaSwag (10-shot) | 85.62 |
| MMLU (5-shot) | 59.13 |
| TruthfulQA (0-shot) | 55.63 |
| Winogrande (5-shot) | 77.19 |
| GSM8K (5-shot) | 10.92 |
| DROP (3-shot) | 22.33 |

26
config.json Normal file
View File

@@ -0,0 +1,26 @@
{
"_name_or_path": "Undi95/MLewd-L2-Chat-13B",
"architectures": [
"LlamaForCausalLM"
],
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 5120,
"initializer_range": 0.02,
"intermediate_size": 13824,
"max_position_embeddings": 4096,
"model_type": "llama",
"num_attention_heads": 40,
"num_hidden_layers": 62,
"num_key_value_heads": 40,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float16",
"transformers_version": "4.33.2",
"use_cache": true,
"vocab_size": 32000
}

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0eed3a372e11f225e959ed6f98e41fb5d3f539ae6cd0afc29849b3e87627ff48
size 9985398280

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8248722ec84c0889fb239917a6e3cee380a002cac71504f4f8da416fdbab0aa8
size 9956563000

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7d59662aed960916a584e0d077286d5ba513d92698eaf6ab8885f173018fda4a
size 9993273312

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6a67c9735625d38115bfebc80be8af32654aa478729de9d0ddfaf4aec38ffd94
size 9725876168

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:63c585b4b9e15ed700bf6b26188072ab269c3ce98dd193a080b69a4ca689cdcb
size 655360128

File diff suppressed because one or more lines are too long

23
special_tokens_map.json Normal file
View File

@@ -0,0 +1,23 @@
{
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

93391
tokenizer.json Normal file

File diff suppressed because it is too large Load Diff

3
tokenizer.model Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
size 499723

34
tokenizer_config.json Normal file
View File

@@ -0,0 +1,34 @@
{
"bos_token": {
"__type": "AddedToken",
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"clean_up_tokenization_spaces": false,
"eos_token": {
"__type": "AddedToken",
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"legacy": false,
"model_max_length": 1000000000000000019884624838656,
"pad_token": null,
"padding_side": "right",
"sp_model_kwargs": {},
"tokenizer_class": "LlamaTokenizer",
"unk_token": {
"__type": "AddedToken",
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"use_default_system_prompt": true
}