Initialize project; model provided by the ModelHub XC community
Model: ShyliaSafetensors/NeutralWeirdness-V1-24B-Heretic Source: Original Platform
37
.gitattributes
vendored
Normal file
@@ -0,0 +1,37 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
yatta.gif filter=lfs diff=lfs merge=lfs -text
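The patterns above decide which files in this commit are stored as Git LFS pointers rather than as regular blobs. The following is a rough sketch of that decision in Python, using only a handful of the listed patterns. It only approximates gitattributes matching with `fnmatch` (whose `*` also crosses `/`, unlike git); the `is_lfs_tracked` helper and the example paths are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A subset of the patterns listed in the .gitattributes above.
LFS_PATTERNS = [
    "*.safetensors",
    "*.bin",
    "*.tar.*",
    "*tfevents*",
    "tokenizer.json",
    "saved_model/**/*",
]


def is_lfs_tracked(path, patterns=LFS_PATTERNS):
    """Approximate check: patterns without '/' match the basename,
    patterns with '/' match the whole path."""
    name = PurePosixPath(path).name
    for pattern in patterns:
        if "/" in pattern:
            # Also try '/**/' collapsed to '/', so it can match zero directories.
            candidates = [pattern, pattern.replace("/**/", "/")]
            if any(fnmatchcase(path, p) for p in candidates):
                return True
        elif fnmatchcase(name, pattern):
            return True
    return False


print(is_lfs_tracked("model-00001-of-00010.safetensors"))
print(is_lfs_tracked("config.json"))
```

Under these assumptions the weight shards and `tokenizer.json` are LFS-tracked, while `config.json` and `README.md` are stored as plain text.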
66
README.md
Normal file
@@ -0,0 +1,66 @@
---
base_model:
- FlareRebellion/WeirdCompound-v1.7-24b
- OddTheGreat/NeutralGear_24B_V.2
tags:
- mistral
- 24b
- merge
- heretic
- abliterated
- uncensored
- roleplay
- conversational
license: other
language:
- en
pipeline_tag: text-generation
---

# NeutralWeirdness v1 24B Heretic

A merge of WeirdCompound v1.7 and NeutralGear V.2, then abliterated
using Heretic for uncensored output.

## Description of This Model

This is my first published model. I merged WeirdCompound v1.7 and NeutralGear V.2 with a 50/50
SLERP merge, then removed the refusal direction using Heretic v1.2.0.

I haven't tested the model much, but it shows stable and creative role-playing capability.

## Merge Recipe

- Method: SLERP 50/50
- Tool: mergekit
- Model A: FlareRebellion/WeirdCompound-v1.7-24b
- Model B: OddTheGreat/NeutralGear_24B_V.2

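For anyone new to SLERP, here is a minimal sketch of the interpolation at `t = 0.5`, written in plain Python on flat lists. It is only an illustration, not mergekit's code: the `slerp` function, its `eps` threshold, and the fallback to linear interpolation for near-parallel vectors are assumptions made for this example, and both inputs are assumed to be non-zero.

```python
import math


def slerp(a, b, t=0.5, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    # Cosine of the angle between the vectors, clamped so acos stays defined.
    cos_theta = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    if math.sin(theta) < eps:
        # Nearly parallel vectors: fall back to linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    weight_a = math.sin((1 - t) * theta) / math.sin(theta)
    weight_b = math.sin(t * theta) / math.sin(theta)
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


# Toy stand-ins for one tensor from Model A and Model B.
merged = slerp([1.0, 0.0], [0.0, 1.0], t=0.5)
print(merged)
```

With `t = 0.5` both models contribute equally, which is what "50/50" means in the recipe above.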
## Abliteration

- Tool: [Heretic v1.2.0](https://github.com/p-e-w/heretic)
- Trial used: Trial 161 (refusals reduced from 100/100 to 8/100, KL divergence 0.0471)
- Quantization during process: bnb_4bit
- Device: NVIDIA RTX 3090 24GB

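The general idea of removing a direction from a weight matrix can be sketched as a projection, `W' = W - r (rᵀ W)`, where `r` is a unit vector in the layer's output space. This is a generic illustration in plain Python, not Heretic's actual implementation; the `ablate_direction` helper and the toy matrix are hypothetical.

```python
import math


def ablate_direction(weight, direction):
    """Remove the component along `direction` from the outputs of `weight`.

    weight: list of rows with shape (out_features, in_features).
    direction: list of length out_features (normalized internally).
    """
    norm = math.sqrt(sum(d * d for d in direction))
    r = [d / norm for d in direction]
    n_out, n_in = len(weight), len(weight[0])
    # projection[j] = sum_i r[i] * W[i][j], i.e. the row vector r^T W.
    projection = [sum(r[i] * weight[i][j] for i in range(n_out)) for j in range(n_in)]
    # W' = W - r (r^T W): outputs of W' have no component along r.
    return [[weight[i][j] - r[i] * projection[j] for j in range(n_in)] for i in range(n_out)]


toy_weight = [[1.0, 2.0], [3.0, 4.0]]
ablated = ablate_direction(toy_weight, [1.0, 0.0])
print(ablated)
```

After this edit, nothing the matrix writes can point along `r`. The trial figures above (refusal count and KL divergence) are how the effect and the side effects of such an edit were measured.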
## Recommended Settings

- Template: Mistral V7 Tekken
- Sampler: top-p 0.8, min-p 0.1, temperature 0.75
- Context: up to 32k

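To show what these sampler values do, here is a small sketch that applies temperature, min-p, and top-p to a list of logits in plain Python. The order of the filters and the `filter_probs` name are assumptions for this example; real backends may order or define these samplers differently.

```python
import math


def filter_probs(logits, temperature=0.75, top_p=0.8, min_p=0.1):
    """Return renormalized token probabilities after temperature, min-p and top-p."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # min-p: drop tokens below min_p times the probability of the top token.
    threshold = min_p * max(probs)
    kept = [i for i, p in enumerate(probs) if p >= threshold]

    # top-p: keep the most likely tokens until their mass reaches top_p.
    kept.sort(key=lambda i: probs[i], reverse=True)
    nucleus, mass = [], 0.0
    for i in kept:
        nucleus.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    kept_mass = sum(probs[i] for i in nucleus)
    return {i: probs[i] / kept_mass for i in nucleus}


print(filter_probs([2.0, 1.0, 0.0, -5.0]))
```

Lower temperature sharpens the distribution, min-p removes very unlikely tokens relative to the best one, and top-p trims the remaining tail.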
## Hardware Used

- Merge: Ryzen 5 3600, 32GB RAM (CPU only)
- Abliteration: RTX 3090 24GB

## Credits

- [mradermacher](https://huggingface.co/mradermacher/NeutralWeirdness-V1-24B-Heretic-GGUF) for quantization
- [FlareRebellion](https://huggingface.co/FlareRebellion) for WeirdCompound
- [OddTheGreat](https://huggingface.co/OddTheGreat) for NeutralGear
- [FailSpy](https://github.com/FailSpy) for abliterator
- [p-e-w](https://github.com/p-e-w/heretic) for Heretic
- [arcee-ai](https://github.com/arcee-ai/mergekit) for mergekit
51
chat_template.jinja
Normal file
@@ -0,0 +1,51 @@
{%- set today = strftime_now("%Y-%m-%d") %}
{%- set default_system_message = "You are Mistral Small 3, a Large Language Model (LLM) created by Mistral AI, a French startup headquartered in Paris.\nYour knowledge base was last updated on 2023-10-01. The current date is " + today + ".\n\nWhen you're not sure about some information, you say that you don't have the information and don't make up anything.\nIf the user's question is not clear, ambiguous, or does not provide enough context for you to accurately answer the question, you do not try to answer it right away and you rather ask the user to clarify their request (e.g. \"What are some good restaurants around me?\" => \"Where are you?\" or \"When is the next flight to Tokyo\" => \"Where do you travel from?\")" %}

{{- bos_token }}

{%- if messages[0]['role'] == 'system' %}
    {%- if messages[0]['content'] is string %}
        {%- set system_message = messages[0]['content'] %}
    {%- else %}
        {%- set system_message = messages[0]['content'][0]['text'] %}
    {%- endif %}
    {%- set loop_messages = messages[1:] %}
{%- else %}
    {%- set system_message = default_system_message %}
    {%- set loop_messages = messages %}
{%- endif %}
{{- '[SYSTEM_PROMPT]' + system_message + '[/SYSTEM_PROMPT]' }}

{%- for message in loop_messages %}
    {%- if message['role'] == 'user' %}
        {%- if message['content'] is string %}
            {{- '[INST]' + message['content'] + '[/INST]' }}
        {%- else %}
            {{- '[INST]' }}
            {%- for block in message['content'] %}
                {%- if block['type'] == 'text' %}
                    {{- block['text'] }}
                {%- elif block['type'] in ['image', 'image_url'] %}
                    {{- '[IMG]' }}
                {%- else %}
                    {{- raise_exception('Only text and image blocks are supported in message content!') }}
                {%- endif %}
            {%- endfor %}
            {{- '[/INST]' }}
        {%- endif %}
    {%- elif message['role'] == 'system' %}
        {%- if message['content'] is string %}
            {{- '[SYSTEM_PROMPT]' + message['content'] + '[/SYSTEM_PROMPT]' }}
        {%- else %}
            {{- '[SYSTEM_PROMPT]' + message['content'][0]['text'] + '[/SYSTEM_PROMPT]' }}
        {%- endif %}
    {%- elif message['role'] == 'assistant' %}
        {%- if message['content'] is string %}
            {{- message['content'] + eos_token }}
        {%- else %}
            {{- message['content'][0]['text'] + eos_token }}
        {%- endif %}
    {%- else %}
        {{- raise_exception('Only user, system and assistant roles are supported!') }}
    {%- endif %}
{%- endfor %}
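As a reading aid, the text-only path of the template above can be mirrored in a few lines of Python. This is a sketch, not a replacement for the Jinja template: it handles only string message contents, the `render_prompt` name is hypothetical, and the `<s>` and `</s>` token strings are assumptions, since this commit excerpt shows only the token ids.

```python
def render_prompt(messages, default_system, bos_token="<s>", eos_token="</s>"):
    """Mirror the template's text-only path for messages with string content."""
    if messages and messages[0]["role"] == "system":
        system_message, rest = messages[0]["content"], messages[1:]
    else:
        # No leading system message: the template falls back to its default.
        system_message, rest = default_system, messages

    parts = [bos_token, f"[SYSTEM_PROMPT]{system_message}[/SYSTEM_PROMPT]"]
    for message in rest:
        role, content = message["role"], message["content"]
        if role == "user":
            parts.append(f"[INST]{content}[/INST]")
        elif role == "system":
            parts.append(f"[SYSTEM_PROMPT]{content}[/SYSTEM_PROMPT]")
        elif role == "assistant":
            # Assistant turns are closed with the end-of-sequence token.
            parts.append(content + eos_token)
        else:
            raise ValueError("Only user, system and assistant roles are supported!")
    return "".join(parts)


conversation = [
    {"role": "system", "content": "You are a narrator."},
    {"role": "user", "content": "Begin the story."},
]
print(render_prompt(conversation, default_system="(default system message)"))
```

Note that the template emits no separators between turns; the bracketed control tokens alone delimit system, user, and assistant text.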
30
config.json
Normal file
@@ -0,0 +1,30 @@
{
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "dtype": "bfloat16",
  "eos_token_id": 2,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 5120,
  "initializer_range": 0.02,
  "intermediate_size": 32768,
  "max_position_embeddings": 131072,
  "model_type": "mistral",
  "num_attention_heads": 32,
  "num_hidden_layers": 40,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-05,
  "rope_parameters": {
    "rope_theta": 1000000000.0,
    "rope_type": "default"
  },
  "rope_theta": 10000.0,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "transformers_version": "4.57.6",
  "use_cache": true,
  "vocab_size": 131072
}
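The shape fields in this config are enough to recompute the parameter count reported as `total_parameters` in `model.safetensors.index.json`. The sketch below does that in Python, under the assumptions that the projections have no bias terms and that the model has one final norm weight of size `hidden_size` (that tensor is not visible in this excerpt). The helper names are hypothetical.

```python
# Shape-related fields copied from the config.json above.
config = {
    "hidden_size": 5120,
    "intermediate_size": 32768,
    "num_hidden_layers": 40,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "head_dim": 128,
    "vocab_size": 131072,
}


def count_parameters(cfg):
    """Count weights, assuming bias-free projections and one final norm."""
    hidden, inter = cfg["hidden_size"], cfg["intermediate_size"]
    q_dim = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_dim = cfg["num_key_value_heads"] * cfg["head_dim"]
    attention = hidden * q_dim + 2 * hidden * kv_dim + q_dim * hidden  # q, k, v, o
    mlp = 3 * hidden * inter  # gate, up, down
    norms = 2 * hidden  # input and post-attention layer norms
    per_layer = attention + mlp + norms
    embeddings = 2 * cfg["vocab_size"] * hidden  # embed_tokens plus untied lm_head
    return cfg["num_hidden_layers"] * per_layer + embeddings + hidden


def kv_cache_bytes_per_token(cfg, bytes_per_value=2):
    """Keys and values for every layer, at bytes_per_value bytes each."""
    return (2 * cfg["num_hidden_layers"] * cfg["num_key_value_heads"]
            * cfg["head_dim"] * bytes_per_value)


print(count_parameters(config))
print(kv_cache_bytes_per_token(config))
```

At 2 bytes per parameter (bfloat16) the count also reproduces the index's `total_size`. With a 2-byte cache, 32768 tokens of context would need 5 GiB for keys and values under these assumptions.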
6
generation_config.json
Normal file
@@ -0,0 +1,6 @@
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "transformers_version": "4.57.6"
}
3
model-00001-of-00010.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cc5a4cc47dbaddf47fcde486570651eddffad2c73d49d554b8a3a9b8b4cd4ba5
size 4781571736
3
model-00002-of-00010.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a8b47d3ffee47c1753ad546ad773a82dd206880372d512f25eeffc7f66179164
size 4781592784
3
model-00003-of-00010.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f70167a1a1c15fbf08796cdfb2e0dc33562dd72ce8bd12c2b4b88d2531aed983
size 4781592800
3
model-00004-of-00010.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f062dacc64465e676030c1ef1e892360108b9a272ee13fb4c6fb8eedf230222b
size 4886471600
3
model-00005-of-00010.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:426d31fa80367d405e183997c3f4b85334484db2fd6a1f35241ff7f1698c9b14
size 4781592824
3
model-00006-of-00010.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e61b42c8f22ab53f4d2a2902f2ffcaa182b1511819912595dc8d51f49cfd9249
size 4781592816
3
model-00007-of-00010.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a01784978a31b77c45a77ce31754eb09e6c6c8cdfdcae4d8f842dcbac2e4d513
size 4886471600
3
model-00008-of-00010.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fc2cbd3434f79015ba0b268b2c6dcd092215da9cfe0e927e4a0ab0309f2f0314
size 4781592824
3
model-00009-of-00010.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:df8f636ec5329b864a3685ca06bca7e7c5d5396215c525feb80dd51d7ad345ab
size 4781592816
3
model-00010-of-00010.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8a18b4c0e7e11f0668a934c02b4d5dead30e4acd9a52666b8aba461a0480c43f
size 3900777072
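Each `.safetensors` entry above is a three-line Git LFS pointer rather than the weights themselves. A minimal Python sketch for reading one such pointer follows; the `parse_lfs_pointer` helper is hypothetical and assumes exactly the `version`, `oid`, and `size` keys shown above.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its fields."""
    # Each line is "<key> <value>", separated by the first space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer content of model-00010-of-00010.safetensors, copied from above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:8a18b4c0e7e11f0668a934c02b4d5dead30e4acd9a52666b8aba461a0480c43f
size 3900777072
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

The `size` field is the byte length of the real shard, and the digest can be compared against a SHA-256 of the downloaded file to verify it.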
371
model.safetensors.index.json
Normal file
@@ -0,0 +1,371 @@
{
  "metadata": {
    "total_parameters": 23572403200,
    "total_size": 47144806400
  },
  "weight_map": {
    "lm_head.weight": "model-00010-of-00010.safetensors",
    "model.embed_tokens.weight": "model-00001-of-00010.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00010.safetensors",
    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.0.mlp.up_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00010.safetensors",
    "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.1.input_layernorm.weight": "model-00001-of-00010.safetensors",
    "model.layers.1.mlp.down_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.1.mlp.up_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00010.safetensors",
    "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.10.input_layernorm.weight": "model-00003-of-00010.safetensors",
    "model.layers.10.mlp.down_proj.weight": "model-00003-of-00010.safetensors",
    "model.layers.10.mlp.gate_proj.weight": "model-00003-of-00010.safetensors",
    "model.layers.10.mlp.up_proj.weight": "model-00003-of-00010.safetensors",
    "model.layers.10.post_attention_layernorm.weight": "model-00003-of-00010.safetensors",
    "model.layers.10.self_attn.k_proj.weight": "model-00003-of-00010.safetensors",
    "model.layers.10.self_attn.o_proj.weight": "model-00003-of-00010.safetensors",
    "model.layers.10.self_attn.q_proj.weight": "model-00003-of-00010.safetensors",
    "model.layers.10.self_attn.v_proj.weight": "model-00003-of-00010.safetensors",
    "model.layers.11.input_layernorm.weight": "model-00004-of-00010.safetensors",
    "model.layers.11.mlp.down_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.11.mlp.gate_proj.weight": "model-00003-of-00010.safetensors",
    "model.layers.11.mlp.up_proj.weight": "model-00003-of-00010.safetensors",
    "model.layers.11.post_attention_layernorm.weight": "model-00004-of-00010.safetensors",
    "model.layers.11.self_attn.k_proj.weight": "model-00003-of-00010.safetensors",
    "model.layers.11.self_attn.o_proj.weight": "model-00003-of-00010.safetensors",
    "model.layers.11.self_attn.q_proj.weight": "model-00003-of-00010.safetensors",
    "model.layers.11.self_attn.v_proj.weight": "model-00003-of-00010.safetensors",
    "model.layers.12.input_layernorm.weight": "model-00004-of-00010.safetensors",
    "model.layers.12.mlp.down_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.12.mlp.gate_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.12.mlp.up_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.12.post_attention_layernorm.weight": "model-00004-of-00010.safetensors",
    "model.layers.12.self_attn.k_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.12.self_attn.o_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.12.self_attn.q_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.12.self_attn.v_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.13.input_layernorm.weight": "model-00004-of-00010.safetensors",
    "model.layers.13.mlp.down_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.13.mlp.gate_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.13.mlp.up_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.13.post_attention_layernorm.weight": "model-00004-of-00010.safetensors",
    "model.layers.13.self_attn.k_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.13.self_attn.o_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.13.self_attn.q_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.13.self_attn.v_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.14.input_layernorm.weight": "model-00004-of-00010.safetensors",
    "model.layers.14.mlp.down_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.14.mlp.gate_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.14.mlp.up_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.14.post_attention_layernorm.weight": "model-00004-of-00010.safetensors",
    "model.layers.14.self_attn.k_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.14.self_attn.o_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.14.self_attn.q_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.14.self_attn.v_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.15.input_layernorm.weight": "model-00004-of-00010.safetensors",
    "model.layers.15.mlp.down_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.15.mlp.gate_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.15.mlp.up_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.15.post_attention_layernorm.weight": "model-00004-of-00010.safetensors",
    "model.layers.15.self_attn.k_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.15.self_attn.o_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.15.self_attn.q_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.15.self_attn.v_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.16.input_layernorm.weight": "model-00005-of-00010.safetensors",
    "model.layers.16.mlp.down_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.16.mlp.gate_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.16.mlp.up_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.16.post_attention_layernorm.weight": "model-00005-of-00010.safetensors",
    "model.layers.16.self_attn.k_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.16.self_attn.o_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.16.self_attn.q_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.16.self_attn.v_proj.weight": "model-00004-of-00010.safetensors",
    "model.layers.17.input_layernorm.weight": "model-00005-of-00010.safetensors",
    "model.layers.17.mlp.down_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.17.mlp.gate_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.17.mlp.up_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.17.post_attention_layernorm.weight": "model-00005-of-00010.safetensors",
    "model.layers.17.self_attn.k_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.17.self_attn.o_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.17.self_attn.q_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.17.self_attn.v_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.18.input_layernorm.weight": "model-00005-of-00010.safetensors",
    "model.layers.18.mlp.down_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.18.mlp.gate_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.18.mlp.up_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.18.post_attention_layernorm.weight": "model-00005-of-00010.safetensors",
    "model.layers.18.self_attn.k_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.18.self_attn.o_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.18.self_attn.q_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.18.self_attn.v_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.19.input_layernorm.weight": "model-00005-of-00010.safetensors",
    "model.layers.19.mlp.down_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.19.mlp.gate_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.19.mlp.up_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.19.post_attention_layernorm.weight": "model-00005-of-00010.safetensors",
    "model.layers.19.self_attn.k_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.19.self_attn.o_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.19.self_attn.q_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.19.self_attn.v_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.2.input_layernorm.weight": "model-00001-of-00010.safetensors",
    "model.layers.2.mlp.down_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.2.mlp.up_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00010.safetensors",
    "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00010.safetensors",
    "model.layers.20.input_layernorm.weight": "model-00006-of-00010.safetensors",
    "model.layers.20.mlp.down_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.20.mlp.gate_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.20.mlp.up_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.20.post_attention_layernorm.weight": "model-00006-of-00010.safetensors",
    "model.layers.20.self_attn.k_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.20.self_attn.o_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.20.self_attn.q_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.20.self_attn.v_proj.weight": "model-00005-of-00010.safetensors",
    "model.layers.21.input_layernorm.weight": "model-00006-of-00010.safetensors",
    "model.layers.21.mlp.down_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.21.mlp.gate_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.21.mlp.up_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.21.post_attention_layernorm.weight": "model-00006-of-00010.safetensors",
    "model.layers.21.self_attn.k_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.21.self_attn.o_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.21.self_attn.q_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.21.self_attn.v_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.22.input_layernorm.weight": "model-00006-of-00010.safetensors",
    "model.layers.22.mlp.down_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.22.mlp.gate_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.22.mlp.up_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.22.post_attention_layernorm.weight": "model-00006-of-00010.safetensors",
    "model.layers.22.self_attn.k_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.22.self_attn.o_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.22.self_attn.q_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.22.self_attn.v_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.23.input_layernorm.weight": "model-00006-of-00010.safetensors",
    "model.layers.23.mlp.down_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.23.mlp.gate_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.23.mlp.up_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.23.post_attention_layernorm.weight": "model-00006-of-00010.safetensors",
    "model.layers.23.self_attn.k_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.23.self_attn.o_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.23.self_attn.q_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.23.self_attn.v_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.24.input_layernorm.weight": "model-00007-of-00010.safetensors",
    "model.layers.24.mlp.down_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.24.mlp.gate_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.24.mlp.up_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.24.post_attention_layernorm.weight": "model-00007-of-00010.safetensors",
    "model.layers.24.self_attn.k_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.24.self_attn.o_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.24.self_attn.q_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.24.self_attn.v_proj.weight": "model-00006-of-00010.safetensors",
    "model.layers.25.input_layernorm.weight": "model-00007-of-00010.safetensors",
    "model.layers.25.mlp.down_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.25.mlp.gate_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.25.mlp.up_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.25.post_attention_layernorm.weight": "model-00007-of-00010.safetensors",
    "model.layers.25.self_attn.k_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.25.self_attn.o_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.25.self_attn.q_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.25.self_attn.v_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.26.input_layernorm.weight": "model-00007-of-00010.safetensors",
    "model.layers.26.mlp.down_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.26.mlp.gate_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.26.mlp.up_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.26.post_attention_layernorm.weight": "model-00007-of-00010.safetensors",
    "model.layers.26.self_attn.k_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.26.self_attn.o_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.26.self_attn.q_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.26.self_attn.v_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.27.input_layernorm.weight": "model-00007-of-00010.safetensors",
    "model.layers.27.mlp.down_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.27.mlp.gate_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.27.mlp.up_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.27.post_attention_layernorm.weight": "model-00007-of-00010.safetensors",
    "model.layers.27.self_attn.k_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.27.self_attn.o_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.27.self_attn.q_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.27.self_attn.v_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.28.input_layernorm.weight": "model-00007-of-00010.safetensors",
    "model.layers.28.mlp.down_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.28.mlp.gate_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.28.mlp.up_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.28.post_attention_layernorm.weight": "model-00007-of-00010.safetensors",
    "model.layers.28.self_attn.k_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.28.self_attn.o_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.28.self_attn.q_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.28.self_attn.v_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.29.input_layernorm.weight": "model-00008-of-00010.safetensors",
    "model.layers.29.mlp.down_proj.weight": "model-00008-of-00010.safetensors",
    "model.layers.29.mlp.gate_proj.weight": "model-00008-of-00010.safetensors",
    "model.layers.29.mlp.up_proj.weight": "model-00008-of-00010.safetensors",
    "model.layers.29.post_attention_layernorm.weight": "model-00008-of-00010.safetensors",
    "model.layers.29.self_attn.k_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.29.self_attn.o_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.29.self_attn.q_proj.weight": "model-00007-of-00010.safetensors",
    "model.layers.29.self_attn.v_proj.weight": "model-00007-of-00010.safetensors",
|
||||
"model.layers.3.input_layernorm.weight": "model-00002-of-00010.safetensors",
|
||||
"model.layers.3.mlp.down_proj.weight": "model-00002-of-00010.safetensors",
|
||||
"model.layers.3.mlp.gate_proj.weight": "model-00002-of-00010.safetensors",
|
||||
"model.layers.3.mlp.up_proj.weight": "model-00002-of-00010.safetensors",
|
||||
"model.layers.3.post_attention_layernorm.weight": "model-00002-of-00010.safetensors",
|
||||
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00010.safetensors",
|
||||
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00010.safetensors",
|
||||
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00010.safetensors",
|
||||
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00010.safetensors",
|
||||
"model.layers.30.input_layernorm.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.30.mlp.down_proj.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.30.mlp.gate_proj.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.30.mlp.up_proj.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.30.post_attention_layernorm.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.30.self_attn.k_proj.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.30.self_attn.o_proj.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.30.self_attn.q_proj.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.30.self_attn.v_proj.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.31.input_layernorm.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.31.mlp.down_proj.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.31.mlp.gate_proj.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.31.mlp.up_proj.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.31.post_attention_layernorm.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.31.self_attn.k_proj.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.31.self_attn.o_proj.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.31.self_attn.q_proj.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.31.self_attn.v_proj.weight": "model-00008-of-00010.safetensors",
|
||||
"model.layers.32.input_layernorm.weight": "model-00008-of-00010.safetensors",
"model.layers.32.mlp.down_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.32.mlp.gate_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.32.mlp.up_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.32.post_attention_layernorm.weight": "model-00008-of-00010.safetensors",
"model.layers.32.self_attn.k_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.32.self_attn.o_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.32.self_attn.q_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.32.self_attn.v_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.33.input_layernorm.weight": "model-00009-of-00010.safetensors",
"model.layers.33.mlp.down_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.33.mlp.gate_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.33.mlp.up_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.33.post_attention_layernorm.weight": "model-00009-of-00010.safetensors",
"model.layers.33.self_attn.k_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.33.self_attn.o_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.33.self_attn.q_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.33.self_attn.v_proj.weight": "model-00008-of-00010.safetensors",
"model.layers.34.input_layernorm.weight": "model-00009-of-00010.safetensors",
"model.layers.34.mlp.down_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.34.mlp.gate_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.34.mlp.up_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.34.post_attention_layernorm.weight": "model-00009-of-00010.safetensors",
"model.layers.34.self_attn.k_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.34.self_attn.o_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.34.self_attn.q_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.34.self_attn.v_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.35.input_layernorm.weight": "model-00009-of-00010.safetensors",
"model.layers.35.mlp.down_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.35.mlp.gate_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.35.mlp.up_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.35.post_attention_layernorm.weight": "model-00009-of-00010.safetensors",
"model.layers.35.self_attn.k_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.35.self_attn.o_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.35.self_attn.q_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.35.self_attn.v_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.36.input_layernorm.weight": "model-00009-of-00010.safetensors",
"model.layers.36.mlp.down_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.36.mlp.gate_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.36.mlp.up_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.36.post_attention_layernorm.weight": "model-00009-of-00010.safetensors",
"model.layers.36.self_attn.k_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.36.self_attn.o_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.36.self_attn.q_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.36.self_attn.v_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.37.input_layernorm.weight": "model-00010-of-00010.safetensors",
"model.layers.37.mlp.down_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.37.mlp.gate_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.37.mlp.up_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.37.post_attention_layernorm.weight": "model-00010-of-00010.safetensors",
"model.layers.37.self_attn.k_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.37.self_attn.o_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.37.self_attn.q_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.37.self_attn.v_proj.weight": "model-00009-of-00010.safetensors",
"model.layers.38.input_layernorm.weight": "model-00010-of-00010.safetensors",
"model.layers.38.mlp.down_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.38.mlp.gate_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.38.mlp.up_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.38.post_attention_layernorm.weight": "model-00010-of-00010.safetensors",
"model.layers.38.self_attn.k_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.38.self_attn.o_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.38.self_attn.q_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.38.self_attn.v_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.39.input_layernorm.weight": "model-00010-of-00010.safetensors",
"model.layers.39.mlp.down_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.39.mlp.gate_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.39.mlp.up_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.39.post_attention_layernorm.weight": "model-00010-of-00010.safetensors",
"model.layers.39.self_attn.k_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.39.self_attn.o_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.39.self_attn.q_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.39.self_attn.v_proj.weight": "model-00010-of-00010.safetensors",
"model.layers.4.input_layernorm.weight": "model-00002-of-00010.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00002-of-00010.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.5.input_layernorm.weight": "model-00002-of-00010.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00002-of-00010.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.6.input_layernorm.weight": "model-00002-of-00010.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00002-of-00010.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.7.input_layernorm.weight": "model-00003-of-00010.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00003-of-00010.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00002-of-00010.safetensors",
"model.layers.8.input_layernorm.weight": "model-00003-of-00010.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00003-of-00010.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.9.input_layernorm.weight": "model-00003-of-00010.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00003-of-00010.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00003-of-00010.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00003-of-00010.safetensors",
"model.norm.weight": "model-00010-of-00010.safetensors"
}
}
1032
special_tokens_map.json
Normal file
File diff suppressed because it is too large
3
tokenizer.json
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a95570f76bd1f5f97d83e83b221f5b2f3042e574e1a317b5c377852368af04c2
size 17078192
9019
tokenizer_config.json
Normal file
File diff suppressed because it is too large