Initialize project; model provided by the ModelHub XC community

Model: MrRikyz/FusionPulse-24B
Source: Original Platform
ModelHub XC
2026-04-10 19:34:55 +08:00
commit bd47d5e126
19 changed files with 10836 additions and 0 deletions

.gitattributes vendored Normal file

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text

README.md Normal file

@@ -0,0 +1,135 @@
---
base_model:
- TheDrummer/Magidonia-24B-v4.3
- Ateron/Sketch-Cydonia
- OddTheGreat/Rotor_24B_V.1
- DarkArtsForge/Magistaroth-24B-v1.1
- MrRikyz/Rei-Pulse-24B
- sophosympatheia/Magistry-24B-v1.0
- TheDrummer/Cydonia-24B-v4.3
library_name: transformers
tags:
- mergekit
- merge
- RP
- NSFW
- roleplay
---
<div style="width: 100%; text-align: center; margin-bottom: 20px;">
<h1 style="font-size: 2.5em; font-weight: bold; color: #4A90E2;">FusionPulse-24B</h1>
<hr style="border: 0; height: 1px; background: linear-gradient(to right, transparent, #4A90E2, transparent); margin: 20px 0;">
</div>
## 🌟 Overview
**FusionPulse-24B** is a merge built on top of `Magidonia-24B-v4.3`.
The model was merged using the **TIES** method.
### 🧩 Models Merged
This model results from a merge between:
* **TheDrummer/Magidonia-24B-v4.3** (Base)
* **Ateron/Sketch-Cydonia**
* **OddTheGreat/Rotor_24B_V.1**
* **DarkArtsForge/Magistaroth-24B-v1.1**
* **MrRikyz/Rei-Pulse-24B**
* **sophosympatheia/Magistry-24B-v1.0**
* **TheDrummer/Cydonia-24B-v4.3**
---
## 🛠️ Merge Details
### Method: TIES
The merge was performed using `mergekit` with the following parameters:
- **Base Model:** TheDrummer/Magidonia-24B-v4.3
- **Dtype:** float32
- **Out Dtype:** bfloat16
- **Tokenizer Source:** base
### ⚙️ Configuration
<details>
<summary><b>View Full Mergekit YAML</b></summary>
```yaml
base_model: TheDrummer/Magidonia-24B-v4.3
dtype: float32
merge_method: ties
modules:
default:
slices:
- sources:
- layer_range: [0, 40]
model: Ateron/Sketch-Cydonia
parameters:
density: 0.55
weight: 0.18
- layer_range: [0, 40]
model: OddTheGreat/Rotor_24B_V.1
parameters:
density: 0.65
weight: 0.22
- layer_range: [0, 40]
model: DarkArtsForge/Magistaroth-24B-v1.1
parameters:
density: 0.7
weight: 0.27
- layer_range: [0, 40]
model: MrRikyz/Rei-Pulse-24B
parameters:
density: 0.6
weight: 0.19
- layer_range: [0, 40]
model: sophosympatheia/Magistry-24B-v1.0
parameters:
density: 0.44
weight: 0.23
- layer_range: [0, 40]
model: TheDrummer/Cydonia-24B-v4.3
parameters:
density: 0.25
weight: 0.18
- layer_range: [0, 40]
model: TheDrummer/Magidonia-24B-v4.3
base_model_alpha: 0.85
ties:
merge_strategy: sum
normalize: true
sparsity: 0.17
rescale: true
layer_wise:
- filter: "layers.0-8.*"
scale: 0.75
- filter: "layers.9-20.*"
scale: 1.05
- filter: "layers.21-31.*"
scale: 1.15
tensor_factors:
attention: 1.1
mlp: 1.2
post:
normalize: true
clamp: 2.5
out_dtype: bfloat16
tokenizer:
source: base
```
</details>
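With `normalize: true`, the TIES step rescales the donor weights so their contributions sum to one. As an illustrative sketch (not part of the repo, and ignoring the density sparsification and `base_model_alpha` that the full merge also applies), the raw weights above normalize as follows:

```python
# Donor weights copied from the mergekit YAML above.
weights = {
    "Ateron/Sketch-Cydonia": 0.18,
    "OddTheGreat/Rotor_24B_V.1": 0.22,
    "DarkArtsForge/Magistaroth-24B-v1.1": 0.27,
    "MrRikyz/Rei-Pulse-24B": 0.19,
    "sophosympatheia/Magistry-24B-v1.0": 0.23,
    "TheDrummer/Cydonia-24B-v4.3": 0.18,
}

total = sum(weights.values())  # raw weights sum to 1.27, not 1.0
normalized = {name: w / total for name, w in weights.items()}

print(f"raw sum = {total:.2f}")
for name, share in normalized.items():
    print(f"{name}: {share:.3f}")
```

Under this reading, `Magistaroth-24B-v1.1` carries the largest normalized share (about 0.21) of the donor contribution.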
## ✨ Acknowledgements
Thanks to the authors of the original models for their incredible work:
- Ateron for `Sketch-Cydonia`
- OddTheGreat for `Rotor_24B_V.1`
- DarkArtsForge for `Magistaroth-24B-v1.1`
- sophosympatheia for `Magistry-24B-v1.0`
- TheDrummer for `Cydonia-24B-v4.3` and `Magidonia-24B-v4.3`

chat_template.jinja Normal file

@@ -0,0 +1,112 @@
{%- set default_system_message = 'First draft your thinking process (inner monologue) until you arrive at a response. Format your response using Markdown, and use LaTeX for any mathematical equations. Write both your thoughts and the response in the same language as the input.\n\nYour thinking process must follow the template below:[THINK]Your thoughts or/and draft, like working through an exercise on scratch paper. Be as casual and as long as you want until you are confident to generate the response. Use the same language as the input.[/THINK]Here, provide a self-contained response.' %}
{{- bos_token }}
{#- Extract system message if present -#}
{%- if messages[0]['role'] == 'system' %}
{%- if messages[0]['content'] is string %}
{%- set raw_system_message = messages[0]['content'] %}
{%- else %}
{%- set raw_system_message = messages[0]['content'][0]['text'] %}
{%- endif %}
{%- set loop_messages = messages[1:] %}
{%- else %}
{%- set raw_system_message = "" %}
{%- set loop_messages = messages %}
{%- endif %}
{#- Detect THINK flag by searching for exact phrase "/think" -#}
{%- if "/think" in raw_system_message %}
{%- set THINK = True %}
{%- else %}
{%- set THINK = False %}
{%- endif %}
{#- Apply logic depending on THINK flag -#}
{%- if THINK %}
{%- if raw_system_message|length > 0 %}
{%- set system_message = default_system_message + "\n\n" + raw_system_message %}
{%- else %}
{%- set system_message = default_system_message %}
{%- endif %}
{{- '[SYSTEM_PROMPT]' + system_message + '[/SYSTEM_PROMPT]' }}
{%- else %}
{%- if raw_system_message|length > 0 %}
{{- '[SYSTEM_PROMPT]' + raw_system_message + '[/SYSTEM_PROMPT]' }}
{%- endif %}
{%- endif %}
{#- Tool description appended ONLY to last user message. Edits made by Unsloth #}
{%- set tools_description = "" %}
{%- set has_tools = false %}
{%- if tools is defined and tools is not none and tools|length > 0 %}
{%- set has_tools = true %}
{%- set tools_description = "[AVAILABLE_TOOLS]" + (tools | tojson) + "[/AVAILABLE_TOOLS]" %}
{{- tools_description }}
{%- endif %}
{%- for message in loop_messages %}
{%- if message['role'] == 'user' %}
{%- if message['content'] is string %}
{{- '[INST]' + message['content'] + '[/INST]' }}
{%- else %}
{{- '[INST]' }}
{%- for block in message['content'] %}
{%- if block['type'] == 'text' %}
{%- if block['text'] is defined %}
{{- block['text'] }}
{%- else %}
{{- block['content'] }}
{%- endif %}
{%- elif block['type'] in ['image', 'image_url'] %}
{{- '[IMG]' }}
{%- else %}
{{- raise_exception('Only text and image blocks are supported in message content!') }}
{%- endif %}
{%- endfor %}
{{- '[/INST]' }}
{%- endif %}
{%- elif message['role'] == 'system' %}
{%- if message['content'] is string %}
{{- '[SYSTEM_PROMPT]' + message['content'] + '[/SYSTEM_PROMPT]' }}
{%- else %}
{{- '[SYSTEM_PROMPT]' + message['content'][0]['text'] + '[/SYSTEM_PROMPT]' }}
{%- endif %}
{%- elif message['role'] == 'assistant' %}
{%- if message['content'] is string %}
{{- message['content'] }}
{%- elif message['content'] is iterable %}
{{- message['content'][0]['text'] }}
{%- endif %}
{%- if message['tool_calls'] is defined and message['tool_calls'] is not none %}
{%- for tool in message['tool_calls'] %}
{%- set arguments = tool['function']['arguments'] %}
{%- if arguments is not string %}
{%- set arguments = arguments|tojson %}
{%- endif %}
{{- "[TOOL_CALLS]" + tool['function']['name'] + "[ARGS]" + arguments }}
{%- endfor %}
{%- endif %}
{{- eos_token }}
{%- elif message["role"] == "tool_results" or message["role"] == "tool" %}
{%- if message.content is defined and message.content.content is defined %}
{%- set content = message.content.content %}
{%- else %}
{%- set content = message.content %}
{%- endif %}
{{- "[TOOL_RESULTS]" + content|string + "[/TOOL_RESULTS]" }}
{%- else %}
{{- raise_exception('Only user, system, assistant and tool roles are supported!') }}
{%- endif %}
{%- endfor %}
{#- Licensed under the Apache License, Version 2.0 (the "License") #}
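The template's reasoning toggle is driven purely by the literal substring `/think` appearing anywhere in the system message. A minimal Python sketch of that branch (illustrative only; `DEFAULT_SYSTEM` stands in for the full default system message quoted at the top of the template):

```python
# Abbreviated stand-in for the template's default_system_message.
DEFAULT_SYSTEM = "First draft your thinking process (inner monologue)..."

def build_system_prompt(raw_system_message: str) -> str:
    """Mirror the template's THINK-flag branch: if '/think' appears
    anywhere in the system message, prepend the default reasoning
    instructions; otherwise pass the message through unchanged."""
    if "/think" in raw_system_message:
        body = (DEFAULT_SYSTEM + "\n\n" + raw_system_message
                if raw_system_message else DEFAULT_SYSTEM)
        return "[SYSTEM_PROMPT]" + body + "[/SYSTEM_PROMPT]"
    if raw_system_message:
        return "[SYSTEM_PROMPT]" + raw_system_message + "[/SYSTEM_PROMPT]"
    return ""  # no system message, no prompt block emitted
```

Note that the check is a plain substring match, so `/think` embedded inside any word of the system message will also enable the reasoning preamble.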

config.json Normal file

@@ -0,0 +1,27 @@
{
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"dtype": "bfloat16",
"eos_token_id": 2,
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 5120,
"initializer_range": 0.02,
"intermediate_size": 32768,
"max_position_embeddings": 131072,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 40,
"num_key_value_heads": 8,
"pad_token_id": 11,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"transformers_version": "4.57.3",
"use_cache": false,
"vocab_size": 131072
}
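The architecture fields above are enough to estimate the parameter count. The quick sketch below (assuming untied embeddings per `tie_word_embeddings: false`, and GQA with 8 KV heads) arrives at about 23.6B parameters, which at 2 bytes each in bfloat16 matches the 47,144,806,400-byte `total_size` reported in the safetensors index:

```python
# Values taken from config.json.
vocab, hidden, inter = 131072, 5120, 32768
layers, heads, kv_heads, head_dim = 40, 32, 8, 128

embed = vocab * hidden    # model.embed_tokens
lm_head = vocab * hidden  # separate lm_head (embeddings are untied)

attn = hidden * heads * head_dim * 2      # q_proj + o_proj
attn += hidden * kv_heads * head_dim * 2  # k_proj + v_proj (GQA)
mlp = hidden * inter * 3                  # gate_proj, up_proj, down_proj
norms = hidden * 2                        # two RMSNorms per layer
per_layer = attn + mlp + norms

total = embed + lm_head + layers * per_layer + hidden  # + final norm
print(total)      # parameter count, ~23.6B
print(total * 2)  # bytes at 2 bytes/param (bfloat16)
```

The last figure reproduces the checkpoint size exactly, which is a handy sanity check that the shards are complete.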

mergekit_config.yml Normal file

@@ -0,0 +1,69 @@
base_model: TheDrummer/Magidonia-24B-v4.3
dtype: float32
merge_method: ties
modules:
default:
slices:
- sources:
- layer_range: [0, 40]
model: Ateron/Sketch-Cydonia
parameters:
density: 0.55
weight: 0.18
- layer_range: [0, 40]
model: OddTheGreat/Rotor_24B_V.1
parameters:
density: 0.65
weight: 0.22
- layer_range: [0, 40]
model: DarkArtsForge/Magistaroth-24B-v1.1
parameters:
density: 0.7
weight: 0.27
- layer_range: [0, 40]
model: MrRikyz/Rei-Pulse-24B
parameters:
density: 0.6
weight: 0.19
- layer_range: [0, 40]
model: sophosympatheia/Magistry-24B-v1.0
parameters:
density: 0.44
weight: 0.23
- layer_range: [0, 40]
model: TheDrummer/Cydonia-24B-v4.3
parameters:
density: 0.25
weight: 0.18
- layer_range: [0, 40]
model: TheDrummer/Magidonia-24B-v4.3
base_model_alpha: 0.85
ties:
merge_strategy: sum
normalize: true
sparsity: 0.17
rescale: true
layer_wise:
- filter: "layers.0-8.*"
scale: 0.75
- filter: "layers.9-20.*"
scale: 1.05
- filter: "layers.21-31.*"
scale: 1.15
tensor_factors:
attention: 1.1
mlp: 1.2
post:
normalize: true
clamp: 2.5
out_dtype: bfloat16
tokenizer:
source: base


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9af47b1e80f69f03b663cf2b699098d1f65e8c32c9ba0088fc2d6c37a75dbf91
size 4907389312


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e99fd584feec4cad7e4e96f59d34a3309959dc265f0e65c7769584fe573bc420
size 4781592832


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2363eb561d25a762f758cb616f305c37a30a8cdf9363bd91a9aff8cac8972c56
size 4781592816


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6b92a387cb99cb01f13620527e50f950edfd72e32283a6bd8deff0aaf37b9403
size 4886471592


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:850a488866829077b595a7b25f0c2fc29a3672c9ec2fcafb6c69984df14454a4
size 4781592832


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:837302dc5fd40e7fbbd8484675ee7ae697cd75a98a7f471633ba8ea12974e8b4
size 4781592816


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:91f054b29c710d3e94f4e0a6e4795a91a418d60b846b1abac5dc7f46d73feab8
size 4886471592


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:142b09c70d9bda3a0ee63d7a5a91b4638a698b2eefaec657f68cbca8897161a7
size 4781592832


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:160b02f7e0f9f8b32f7e6ad8e7803bd71542181700f79abc67350cdc57ebcd31
size 4781592800


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:322e8fd60e0ca1a8d36e8c6412d68f0eb9f22305737b9ea917d0434caaff3e11
size 3774959456


@@ -0,0 +1,371 @@
{
"metadata": {
"total_size": 47144806400,
"mergekit_version": "0.1.4"
},
"weight_map": {
"lm_head.weight": "model-00001-of-00010.safetensors",
"model.embed_tokens.weight": "model-00001-of-00010.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00010.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00010.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00010.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00010.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00010.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00010.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00010.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00010.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00010.safetensors",
"model.layers.1.input_layernorm.weight": "model-00001-of-00010.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00010.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00010.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00010.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00010.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00010.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00010.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00010.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00010.safetensors",
"model.layers.10.input_layernorm.weight": "model-00001-of-00010.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00002-of-00010.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.11.input_layernorm.weight": "model-00002-of-00010.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00002-of-00010.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.12.input_layernorm.weight": "model-00002-of-00010.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00002-of-00010.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.13.input_layernorm.weight": "model-00002-of-00010.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00002-of-00010.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.14.input_layernorm.weight": "model-00002-of-00010.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00003-of-00010.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.15.input_layernorm.weight": "model-00003-of-00010.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00003-of-00010.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.16.input_layernorm.weight": "model-00003-of-00010.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00003-of-00010.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.17.input_layernorm.weight": "model-00003-of-00010.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00003-of-00010.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.18.input_layernorm.weight": "model-00003-of-00010.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00004-of-00010.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.19.input_layernorm.weight": "model-00004-of-00010.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00004-of-00010.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.2.input_layernorm.weight": "model-00004-of-00010.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00004-of-00010.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.20.input_layernorm.weight": "model-00004-of-00010.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00004-of-00010.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.21.input_layernorm.weight": "model-00004-of-00010.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00004-of-00010.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00004-of-00010.safetensors",
"model.layers.22.input_layernorm.weight": "model-00004-of-00010.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00005-of-00010.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.23.input_layernorm.weight": "model-00005-of-00010.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00005-of-00010.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.24.input_layernorm.weight": "model-00005-of-00010.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00005-of-00010.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.25.input_layernorm.weight": "model-00005-of-00010.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00005-of-00010.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.26.input_layernorm.weight": "model-00005-of-00010.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00005-of-00010.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00006-of-00010.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.27.input_layernorm.weight": "model-00006-of-00010.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00006-of-00010.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.28.input_layernorm.weight": "model-00006-of-00010.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00006-of-00010.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.29.input_layernorm.weight": "model-00006-of-00010.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00006-of-00010.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.3.input_layernorm.weight": "model-00006-of-00010.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00006-of-00010.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00007-of-00010.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.30.input_layernorm.weight": "model-00007-of-00010.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00007-of-00010.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.31.input_layernorm.weight": "model-00007-of-00010.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00007-of-00010.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.32.input_layernorm.weight": "model-00007-of-00010.safetensors",
"model.layers.32.mlp.down_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.32.mlp.gate_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.32.mlp.up_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.32.post_attention_layernorm.weight": "model-00007-of-00010.safetensors",
"model.layers.32.self_attn.k_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.32.self_attn.o_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.32.self_attn.q_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.32.self_attn.v_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.33.input_layernorm.weight": "model-00007-of-00010.safetensors",
"model.layers.33.mlp.down_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.33.mlp.gate_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.33.mlp.up_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.33.post_attention_layernorm.weight": "model-00007-of-00010.safetensors",
"model.layers.33.self_attn.k_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.33.self_attn.o_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.33.self_attn.q_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.33.self_attn.v_proj.weight": "model-00007-of-00010.safetensors",
"model.layers.34.input_layernorm.weight": "model-00007-of-00010.safetensors",
"model.layers.34.mlp.down_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.34.mlp.gate_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.34.mlp.up_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.34.post_attention_layernorm.weight": "model-00008-of-00010.safetensors",
"model.layers.34.self_attn.k_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.34.self_attn.o_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.34.self_attn.q_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.34.self_attn.v_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.35.input_layernorm.weight": "model-00008-of-00010.safetensors",
"model.layers.35.mlp.down_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.35.mlp.gate_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.35.mlp.up_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.35.post_attention_layernorm.weight": "model-00008-of-00010.safetensors",
"model.layers.35.self_attn.k_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.35.self_attn.o_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.35.self_attn.q_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.35.self_attn.v_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.36.input_layernorm.weight": "model-00008-of-00010.safetensors",
"model.layers.36.mlp.down_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.36.mlp.gate_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.36.mlp.up_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.36.post_attention_layernorm.weight": "model-00008-of-00010.safetensors",
"model.layers.36.self_attn.k_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.36.self_attn.o_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.36.self_attn.q_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.36.self_attn.v_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.37.input_layernorm.weight": "model-00008-of-00010.safetensors",
"model.layers.37.mlp.down_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.37.mlp.gate_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.37.mlp.up_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.37.post_attention_layernorm.weight": "model-00008-of-00010.safetensors",
"model.layers.37.self_attn.k_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.37.self_attn.o_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.37.self_attn.q_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.37.self_attn.v_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.38.input_layernorm.weight": "model-00008-of-00010.safetensors",
"model.layers.38.mlp.down_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.38.mlp.gate_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.38.mlp.up_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.38.post_attention_layernorm.weight": "model-00009-of-00010.safetensors",
"model.layers.38.self_attn.k_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.38.self_attn.o_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.38.self_attn.q_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.38.self_attn.v_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.39.input_layernorm.weight": "model-00009-of-00010.safetensors",
"model.layers.39.mlp.down_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.39.mlp.gate_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.39.mlp.up_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.39.post_attention_layernorm.weight": "model-00009-of-00010.safetensors",
"model.layers.39.self_attn.k_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.39.self_attn.o_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.39.self_attn.q_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.39.self_attn.v_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.4.input_layernorm.weight": "model-00009-of-00010.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00009-of-00010.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.5.input_layernorm.weight": "model-00009-of-00010.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00009-of-00010.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.6.input_layernorm.weight": "model-00009-of-00010.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00010-of-00010.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.7.input_layernorm.weight": "model-00010-of-00010.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00010-of-00010.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.8.input_layernorm.weight": "model-00010-of-00010.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00010-of-00010.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.9.input_layernorm.weight": "model-00010-of-00010.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00010-of-00010.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00010-of-00010.safetensors",
"model.norm.weight": "model-00010-of-00010.safetensors"
}
}
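The `weight_map` above assigns each tensor name to the shard file that stores it; a layer's tensors can straddle a shard boundary (layer 38's `down_proj` sits in shard 8 while its `gate_proj` is in shard 9). A minimal sketch of resolving a tensor to its shard, using a few entries copied from the map (the loader logic is an illustration, not part of this repo):

```python
from collections import Counter

# A handful of entries copied from the weight_map above.
index = {
    "weight_map": {
        "model.layers.38.mlp.down_proj.weight": "model-00008-of-00010.safetensors",
        "model.layers.38.mlp.gate_proj.weight": "model-00009-of-00010.safetensors",
        "model.layers.9.self_attn.v_proj.weight": "model-00010-of-00010.safetensors",
        "model.norm.weight": "model-00010-of-00010.safetensors",
    }
}

def shard_for(name: str) -> str:
    """Return the shard file that stores the named tensor."""
    return index["weight_map"][name]

# How many tensors each shard file holds (for this subset).
per_shard = Counter(index["weight_map"].values())

print(shard_for("model.norm.weight"))  # model-00010-of-00010.safetensors
```

In practice the resolved shard can then be opened with the `safetensors` library's `safe_open`, which reads just the requested tensor rather than the whole file.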

1032
special_tokens_map.json Normal file

File diff suppressed because it is too large

3
tokenizer.json Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:158915c7c928301e0d32f1da363e45c294d907bd4c64a8e855f0bd1a47b9e870
size 17078001
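The three-line stub above is a Git LFS pointer: the real ~17 MB `tokenizer.json` lives in LFS storage, and only its hash and size are committed to Git. Each pointer line is a space-separated key/value pair, so parsing one is straightforward (a stdlib-only sketch):

```python
# Parse a Git LFS pointer file into its fields (version, oid, size).
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:158915c7c928301e0d32f1da363e45c294d907bd4c64a8e855f0bd1a47b9e870
size 17078001
"""

fields = dict(line.split(" ", 1) for line in pointer.strip().splitlines())
algo, digest = fields["oid"].split(":", 1)  # hash algorithm and hex digest
size_bytes = int(fields["size"])            # size of the real file in bytes

print(algo, size_bytes)  # sha256 17078001
```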

9021
tokenizer_config.json Normal file

File diff suppressed because it is too large