Initialize project; model provided by the ModelHub XC community

Model: Locutusque/Orca-2-13b-SFT-v4
Source: Original Platform
ModelHub XC
2026-06-22 23:20:49 +08:00
commit 20c1cfe2cc
26 changed files with 144238 additions and 0 deletions

35
.gitattributes vendored Normal file

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

126
LICENSE.txt Normal file

@@ -0,0 +1,126 @@
LLAMA 2 COMMUNITY LICENSE AGREEMENT
Llama 2 Version Release Date: July 18, 2023
"Agreement" means the terms and conditions for use, reproduction, distribution and
modification of the Llama Materials set forth herein.
"Documentation" means the specifications, manuals and documentation
accompanying Llama 2 distributed by Meta at ai.meta.com/resources/models-and-
libraries/llama-downloads/.
"Licensee" or "you" means you, or your employer or any other person or entity (if
you are entering into this Agreement on such person or entity's behalf), of the age
required under applicable laws, rules or regulations to provide legal consent and that
has legal authority to bind your employer or such other person or entity if you are
entering in this Agreement on their behalf.
"Llama 2" means the foundational large language models and software and
algorithms, including machine-learning model code, trained model weights,
inference-enabling code, training-enabling code, fine-tuning enabling code and other
elements of the foregoing distributed by Meta at ai.meta.com/resources/models-and-
libraries/llama-downloads/.
"Llama Materials" means, collectively, Meta's proprietary Llama 2 and
Documentation (and any portion thereof) made available under this Agreement.
"Meta" or "we" means Meta Platforms Ireland Limited (if you are located in or, if you
are an entity, your principal place of business is in the EEA or Switzerland) and Meta
Platforms, Inc. (if you are located outside of the EEA or Switzerland).
By clicking "I Accept" below or by using or distributing any portion or element of the
Llama Materials, you agree to be bound by this Agreement.
1. License Rights and Redistribution.
a. Grant of Rights. You are granted a non-exclusive, worldwide, non-
transferable and royalty-free limited license under Meta's intellectual property or
other rights owned by Meta embodied in the Llama Materials to use, reproduce,
distribute, copy, create derivative works of, and make modifications to the Llama
Materials.
b. Redistribution and Use.
i. If you distribute or make the Llama Materials, or any derivative works
thereof, available to a third party, you shall provide a copy of this Agreement to such
third party.
ii. If you receive Llama Materials, or any derivative works thereof, from
a Licensee as part of an integrated end user product, then Section 2 of this
Agreement will not apply to you.
iii. You must retain in all copies of the Llama Materials that you
distribute the following attribution notice within a "Notice" text file distributed as a
part of such copies: "Llama 2 is licensed under the LLAMA 2 Community License,
Copyright (c) Meta Platforms, Inc. All Rights Reserved."
iv. Your use of the Llama Materials must comply with applicable laws
and regulations (including trade compliance laws and regulations) and adhere to the
Acceptable Use Policy for the Llama Materials (available at
https://ai.meta.com/llama/use-policy), which is hereby incorporated by reference into
this Agreement.
v. You will not use the Llama Materials or any output or results of the
Llama Materials to improve any other large language model (excluding Llama 2 or
derivative works thereof).
2. Additional Commercial Terms. If, on the Llama 2 version release date, the
monthly active users of the products or services made available by or for Licensee,
or Licensee's affiliates, is greater than 700 million monthly active users in the
preceding calendar month, you must request a license from Meta, which Meta may
grant to you in its sole discretion, and you are not authorized to exercise any of the
rights under this Agreement unless or until Meta otherwise expressly grants you
such rights.
3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE
LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE
PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND,
EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY
WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR
FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE
FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING
THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR
USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS.
4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE
LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT,
NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS
AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL,
CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN
IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF
ANY OF THE FOREGOING.
5. Intellectual Property.
a. No trademark licenses are granted under this Agreement, and in
connection with the Llama Materials, neither Meta nor Licensee may use any name
or mark owned by or associated with the other or any of its affiliates, except as
required for reasonable and customary use in describing and redistributing the
Llama Materials.
b. Subject to Meta's ownership of Llama Materials and derivatives made by or
for Meta, with respect to any derivative works and modifications of the Llama
Materials that are made by you, as between you and Meta, you are and will be the
owner of such derivative works and modifications.
c. If you institute litigation or other proceedings against Meta or any entity
(including a cross-claim or counterclaim in a lawsuit) alleging that the Llama
Materials or Llama 2 outputs or results, or any portion of any of the foregoing,
constitutes infringement of intellectual property or other rights owned or licensable
by you, then any licenses granted to you under this Agreement shall terminate as of
the date such litigation or claim is filed or instituted. You will indemnify and hold
harmless Meta from and against any claim by any third party arising out of or related
to your use or distribution of the Llama Materials.
6. Term and Termination. The term of this Agreement will commence upon your
acceptance of this Agreement or access to the Llama Materials and will continue in
full force and effect until terminated in accordance with the terms and conditions
herein. Meta may terminate this Agreement if you are in breach of any term or
condition of this Agreement. Upon termination of this Agreement, you shall delete
and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the
termination of this Agreement.
7. Governing Law and Jurisdiction. This Agreement will be governed and
construed under the laws of the State of California without regard to choice of law
principles, and the UN Convention on Contracts for the International Sale of Goods
does not apply to this Agreement. The courts of California shall have exclusive
jurisdiction of any dispute arising out of this Agreement.

44
README.md Normal file

@@ -0,0 +1,44 @@
---
license: other
language:
- en
- fr
- es
- hi
- zh
- code
base_model: microsoft/Orca-2-13b
datasets:
- HuggingFaceH4/no_robots
- mlabonne/guanaco-llama2-1k
- OpenAssistant/oasst_top1_2023-08-25
- totally-not-an-llm/EverythingLM-data-V3
widget:
- example_title: "folklore"
text: "<|im_start|>user\nAre the stories of Nahanni Valley, like the decapitated men, true?\n<|im_end|>\n<|im_start|>assistant\n"
output:
text: "The Nahanni Valley in Canada is a remote and rugged region that has long been shrouded in mystery. It is home to the South Nahanni River, which was designated as a UNESCO World Heritage Site due to its exceptional natural beauty and biodiversity. The valley's history includes tales of gold prospectors, fur traders, and indigenous populations, but it also features some unexplained events and legends. One such story involves two prospectors who allegedly encountered an unknown creature that led them into the forest before returning without their heads. This tale has been passed down through generations, but there are no definitive answers regarding its veracity.\n\nIt is important to note that these stories may have originated from local folklore or were exaggerated over time. There is no concrete evidence to support the existence of supernatural creatures or paranormal occurrences within the Nahanni Valley. Nonetheless, the area remains a captivating destination for those seeking adventure and exploration. If you have any questions about this topic or need assistance related to traveling to the Nahanni Valley, please feel free to ask!"
- example_title: "Alan Turing"
text: "<|im_start|>user\nWhy was Alan Turing castrated?\n<|im_end|>\n<|im_start|>assistant\n"
output:
text: "Alan Turing, a renowned mathematician and computer scientist who played a crucial role in breaking the German Enigma code during World War II, faced significant challenges due to his homosexuality. In 1952, he was convicted of \"gross indecency\" for engaging in a consensual relationship with another man, which led to him being sentenced to hormone therapy designed to reduce his sex drive. This treatment included the administration of estrogen, which had the side effect of causing breast growth.\n\nIn an effort to avoid this unwanted physical change, Turing requested that his doctors cease the hormone treatments. However, they ignored his request and continued the therapy without his consent. As a result, Turing's breasts began to grow, leading him to undergo surgery to remove them. The operation, known as a mastectomy, caused considerable pain and discomfort, further contributing to his emotional turmoil.\n\nIt is important to note that Turing's castration was not directly carried out by the authorities but rather resulted from the unintended consequences of the forced hormone therapy."
---
This is the "microsoft/Orca-2-13b" model, fully fine-tuned on HuggingFaceH4/no_robots, totally-not-an-llm/EverythingLM-data-V3, mlabonne/guanaco-llama2-1k, and OpenAssistant/oasst_top1_2023-08-25. It achieved a test loss of 0.18.
Make sure to comply with the Microsoft Research License. Please read it before using this model.
This model was trained on the ChatML prompt template.
The responses seen in the inference API were generated using the following sampling parameters:
temperature = 0.1
top_p = 0.14
top_k = 41
repetition_penalty = 1.176
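
As a minimal sketch of the ChatML layout used by the widget examples in this README, the snippet below builds a prompt string and collects the sampling parameters listed above. The helper name `build_chatml_prompt` and the `SAMPLING_PARAMS` dictionary are illustrative assumptions, not part of the model repository; the turn layout (a newline before `<|im_end|>`) is copied from the widget text rather than from a tokenizer chat template.

```python
# Sampling parameters exactly as listed in this README.
SAMPLING_PARAMS = {
    "temperature": 0.1,
    "top_p": 0.14,
    "top_k": 41,
    "repetition_penalty": 1.176,
}


def build_chatml_prompt(messages):
    """Format a list of {"role", "content"} dicts as a ChatML prompt.

    Hypothetical helper: mirrors the widget examples, where each turn is
    "<|im_start|>{role}\\n{content}\\n<|im_end|>\\n" and the prompt ends
    with an open assistant turn for the model to complete.
    """
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}\n<|im_end|>\n"
        )
    # Leave the assistant turn open so generation continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt(
    [
        {
            "role": "user",
            "content": "Are the stories of Nahanni Valley, like the decapitated men, true?",
        }
    ]
)
print(prompt)
```

With the Hugging Face `transformers` library, these values would typically be passed as keyword arguments to `generate` together with `do_sample=True`; that wiring is an assumption about usage and is not shown in the README.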
# Updates:
12/18/23 - 🔥 **This model holds the #5 position on the Open LLM Leaderboard among llama2-13b models.** 🔥

50
USE_POLICY.md Normal file

@@ -0,0 +1,50 @@
# Llama 2 Acceptable Use Policy
Meta is committed to promoting safe and fair use of its tools and features, including Llama 2. If you access or use Llama 2, you agree to this Acceptable Use Policy (“Policy”). The most recent copy of this policy can be found at [ai.meta.com/llama/use-policy](http://ai.meta.com/llama/use-policy).
## Prohibited Uses
We want everyone to use Llama 2 safely and responsibly. You agree you will not use, or allow others to use, Llama 2 to:
1. Violate the law or others’ rights, including to:
1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as:
1. Violence or terrorism
2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material
3. Human trafficking, exploitation, and sexual violence
4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials.
5. Sexual solicitation
6. Any other criminal activity
2. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals
3. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services
4. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices
5. Collect, process, disclose, generate, or infer health, demographic, or other sensitive personal or private information about individuals without rights and consents required by applicable laws
6. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama 2 Materials
7. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system
2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 2 related to the following:
1. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State
2. Guns and illegal weapons (including weapon development)
3. Illegal drugs and regulated/controlled substances
4. Operation of critical infrastructure, transportation technologies, or heavy machinery
5. Self-harm or harm to others, including suicide, cutting, and eating disorders
6. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual
3. Intentionally deceive or mislead others, including use of Llama 2 related to the following:
1. Generating, promoting, or furthering fraud or the creation or promotion of disinformation
2. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content
3. Generating, promoting, or further distributing spam
4. Impersonating another individual without consent, authorization, or legal right
5. Representing that the use of Llama 2 or outputs are human-generated
6. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement
4. Fail to appropriately disclose to end users any known dangers of your AI system
Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means:
* Reporting issues with the model: [github.com/facebookresearch/llama](http://github.com/facebookresearch/llama)
* Reporting risky content generated by the model: [developers.facebook.com/llama_output_feedback](http://developers.facebook.com/llama_output_feedback)
* Reporting bugs and security concerns: [facebook.com/whitehat/info](http://facebook.com/whitehat/info)
* Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama: [LlamaUseReport@meta.com](mailto:LlamaUseReport@meta.com)

29
config.json Normal file

@@ -0,0 +1,29 @@
{
"_name_or_path": "Locutusque/Orca-2-13B-no_robots",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 5120,
"initializer_range": 0.02,
"intermediate_size": 13824,
"max_position_embeddings": 4096,
"model_type": "llama",
"num_attention_heads": 40,
"num_hidden_layers": 40,
"num_key_value_heads": 40,
"pad_token_id": 0,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.36.0",
"use_cache": true,
"vocab_size": 32003
}
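
As a rough cross-check of the values above, the sketch below derives the parameter count from this configuration, assuming the standard Llama layout (untied embedding and output head, four attention projections and three MLP projections per layer, two RMSNorm weights per layer plus one final norm, no biases). The function name `count_llama_parameters` is illustrative. At 2 bytes per `bfloat16` weight, the result lines up with the `total_size` of 26031790080 reported in the safetensors index metadata of this commit.

```python
# Values copied from config.json above.
config = {
    "hidden_size": 5120,
    "intermediate_size": 13824,
    "num_hidden_layers": 40,
    "num_attention_heads": 40,
    "num_key_value_heads": 40,
    "vocab_size": 32003,
}


def count_llama_parameters(cfg):
    """Count weights for a Llama-style decoder (assumed layout, no biases)."""
    hidden = cfg["hidden_size"]
    head_dim = hidden // cfg["num_attention_heads"]
    kv_dim = cfg["num_key_value_heads"] * head_dim

    # q_proj and o_proj are hidden x hidden; k_proj and v_proj are hidden x kv_dim.
    attention = 2 * hidden * hidden + 2 * hidden * kv_dim
    # gate_proj, up_proj and down_proj each connect hidden and intermediate sizes.
    mlp = 3 * hidden * cfg["intermediate_size"]
    # input_layernorm and post_attention_layernorm.
    norms = 2 * hidden

    per_layer = attention + mlp + norms
    # embed_tokens and lm_head are separate because tie_word_embeddings is false.
    embeddings = 2 * cfg["vocab_size"] * hidden
    final_norm = hidden
    return cfg["num_hidden_layers"] * per_layer + embeddings + final_norm


total_params = count_llama_parameters(config)
total_bytes = total_params * 2  # bfloat16 stores each weight in 2 bytes

print(total_params)  # 13015895040
print(total_bytes)   # 26031790080
```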

7
generation_config.json Normal file

@@ -0,0 +1,7 @@
{
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 4096,
"pad_token_id": 0,
"transformers_version": "4.36.0"
}

50001
merges.txt Normal file

File diff suppressed because it is too large.


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f2a0530e4ccdd8f5b5b885e0c4dc8ecc20784287a00f21b865076a39e052fb60
size 1947804384
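
Each of these three-line entries is a Git LFS pointer: the repository stores the spec version, a SHA-256 object ID, and the byte size, while the actual file lives in LFS storage. A small sketch of reading such a pointer follows; the function name `parse_lfs_pointer` and the returned dictionary keys are illustrative assumptions.

```python
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f2a0530e4ccdd8f5b5b885e0c4dc8ecc20784287a00f21b865076a39e052fb60
size 1947804384
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its fields (hypothetical helper)."""
    # Each line is "<key> <value>"; split only on the first space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algorithm": hash_algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["hash_algorithm"], pointer["size"])
```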


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fd2206e1afb52ba93feeafd544dad25c41fa46b41b6ca2f7212e8577459677d6
size 1903230008


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4b51503a26c43e6217966d865a701d4fa0fe5783a87319d68080b9bd565f67d9
size 1903230008


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e003fcaba1f71160ae0153f60928df2327648eba845a82496eb0615917c349b0
size 1903230016


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:230abaa989d1000cfcd01cd33d6547a0ab33c7c475f19dd6452f7b1121ed0282
size 1903230032


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d42e246ec06da715c3381ed3fee61b5f14a118963e0fe3e66961ed152d8bc8c5
size 1903230032


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:37e092ee3307ebd26434b48f5b8516a536b144e60fe28c4dbac175adf258b22f
size 1903230032


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9abc9efc75f9d705e74b3d3126b153f6f093beabaa6fc2fc1bebc06285f131d8
size 1903230032


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:482a022793a9524d36234823fe841d59a4b01c37cd7397ec05cfdecd7a74d187
size 1903230032


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:24bd5a2c336b33af2580e4cc8d4b3893b9ef1e3af634eb7fa741addeebde6b89
size 1903230032


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:603fa6cc7f307b2c925ab57d168152aa1cbb101ba75343e53c009c5a5ab6152d
size 1903230032


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d9719979c30eca7536dc7f756bb719d76ed3249c9718e6ab78172f709907ae87
size 1903230032


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:23b5170a33f7a31bd4b06df8e31a7779cb75999dcd932b8a58a4609bab0f2402
size 1903230032


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:297ca5b84942913828de38407d9193a3a049476b3727fdb98ece183131e90fab
size 1245267640
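
The file names for the fourteen pointers above are not shown in this view. Assuming they are the fourteen safetensors weight shards, in order, their sizes can be compared with the `total_size` value in the index file that follows. The sketch below sums the pointer sizes; the small remainder would be consistent with per-file header overhead, though that interpretation is an assumption.

```python
# Sizes copied from the fourteen LFS pointers above, in order of appearance.
shard_sizes = [
    1947804384,
    1903230008,
    1903230008,
    1903230016,
    1903230032,
    1903230032,
    1903230032,
    1903230032,
    1903230032,
    1903230032,
    1903230032,
    1903230032,
    1903230032,
    1245267640,
]

# "total_size" from the index metadata (tensor bytes only).
index_total_size = 26031790080

on_disk_total = sum(shard_sizes)
overhead = on_disk_total - index_total_size

print(on_disk_total)  # 26031832344
print(overhead)       # 42264
```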


@@ -0,0 +1,370 @@
{
"metadata": {
"total_size": 26031790080
},
"weight_map": {
"lm_head.weight": "model-00014-of-00014.safetensors",
"model.embed_tokens.weight": "model-00001-of-00014.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00014.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00014.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.1.input_layernorm.weight": "model-00001-of-00014.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00014.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.10.input_layernorm.weight": "model-00004-of-00014.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00004-of-00014.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.11.input_layernorm.weight": "model-00005-of-00014.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00005-of-00014.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.12.input_layernorm.weight": "model-00005-of-00014.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00005-of-00014.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.13.input_layernorm.weight": "model-00005-of-00014.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00005-of-00014.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.14.input_layernorm.weight": "model-00006-of-00014.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00006-of-00014.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.15.input_layernorm.weight": "model-00006-of-00014.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00006-of-00014.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.16.input_layernorm.weight": "model-00006-of-00014.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00006-of-00014.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.17.input_layernorm.weight": "model-00007-of-00014.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00007-of-00014.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.18.input_layernorm.weight": "model-00007-of-00014.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00007-of-00014.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.19.input_layernorm.weight": "model-00007-of-00014.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00007-of-00014.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.2.input_layernorm.weight": "model-00002-of-00014.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00002-of-00014.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.20.input_layernorm.weight": "model-00008-of-00014.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00008-of-00014.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.21.input_layernorm.weight": "model-00008-of-00014.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00008-of-00014.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.22.input_layernorm.weight": "model-00008-of-00014.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00008-of-00014.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.23.input_layernorm.weight": "model-00009-of-00014.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00009-of-00014.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.24.input_layernorm.weight": "model-00009-of-00014.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00009-of-00014.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.25.input_layernorm.weight": "model-00009-of-00014.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00009-of-00014.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.26.input_layernorm.weight": "model-00010-of-00014.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00010-of-00014.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.27.input_layernorm.weight": "model-00010-of-00014.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00010-of-00014.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.28.input_layernorm.weight": "model-00010-of-00014.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00010-of-00014.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.29.input_layernorm.weight": "model-00011-of-00014.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00011-of-00014.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.3.input_layernorm.weight": "model-00002-of-00014.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00002-of-00014.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.30.input_layernorm.weight": "model-00011-of-00014.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00011-of-00014.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.31.input_layernorm.weight": "model-00011-of-00014.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00011-of-00014.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.32.input_layernorm.weight": "model-00012-of-00014.safetensors",
"model.layers.32.mlp.down_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.32.mlp.gate_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.32.mlp.up_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.32.post_attention_layernorm.weight": "model-00012-of-00014.safetensors",
"model.layers.32.self_attn.k_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.32.self_attn.o_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.32.self_attn.q_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.32.self_attn.v_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.33.input_layernorm.weight": "model-00012-of-00014.safetensors",
"model.layers.33.mlp.down_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.33.mlp.gate_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.33.mlp.up_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.33.post_attention_layernorm.weight": "model-00012-of-00014.safetensors",
"model.layers.33.self_attn.k_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.33.self_attn.o_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.33.self_attn.q_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.33.self_attn.v_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.34.input_layernorm.weight": "model-00012-of-00014.safetensors",
"model.layers.34.mlp.down_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.34.mlp.gate_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.34.mlp.up_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.34.post_attention_layernorm.weight": "model-00012-of-00014.safetensors",
"model.layers.34.self_attn.k_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.34.self_attn.o_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.34.self_attn.q_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.34.self_attn.v_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.35.input_layernorm.weight": "model-00013-of-00014.safetensors",
"model.layers.35.mlp.down_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.35.mlp.gate_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.35.mlp.up_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.35.post_attention_layernorm.weight": "model-00013-of-00014.safetensors",
"model.layers.35.self_attn.k_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.35.self_attn.o_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.35.self_attn.q_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.35.self_attn.v_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.36.input_layernorm.weight": "model-00013-of-00014.safetensors",
"model.layers.36.mlp.down_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.36.mlp.gate_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.36.mlp.up_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.36.post_attention_layernorm.weight": "model-00013-of-00014.safetensors",
"model.layers.36.self_attn.k_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.36.self_attn.o_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.36.self_attn.q_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.36.self_attn.v_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.37.input_layernorm.weight": "model-00013-of-00014.safetensors",
"model.layers.37.mlp.down_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.37.mlp.gate_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.37.mlp.up_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.37.post_attention_layernorm.weight": "model-00013-of-00014.safetensors",
"model.layers.37.self_attn.k_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.37.self_attn.o_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.37.self_attn.q_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.37.self_attn.v_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.38.input_layernorm.weight": "model-00014-of-00014.safetensors",
"model.layers.38.mlp.down_proj.weight": "model-00014-of-00014.safetensors",
"model.layers.38.mlp.gate_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.38.mlp.up_proj.weight": "model-00014-of-00014.safetensors",
"model.layers.38.post_attention_layernorm.weight": "model-00014-of-00014.safetensors",
"model.layers.38.self_attn.k_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.38.self_attn.o_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.38.self_attn.q_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.38.self_attn.v_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.39.input_layernorm.weight": "model-00014-of-00014.safetensors",
"model.layers.39.mlp.down_proj.weight": "model-00014-of-00014.safetensors",
"model.layers.39.mlp.gate_proj.weight": "model-00014-of-00014.safetensors",
"model.layers.39.mlp.up_proj.weight": "model-00014-of-00014.safetensors",
"model.layers.39.post_attention_layernorm.weight": "model-00014-of-00014.safetensors",
"model.layers.39.self_attn.k_proj.weight": "model-00014-of-00014.safetensors",
"model.layers.39.self_attn.o_proj.weight": "model-00014-of-00014.safetensors",
"model.layers.39.self_attn.q_proj.weight": "model-00014-of-00014.safetensors",
"model.layers.39.self_attn.v_proj.weight": "model-00014-of-00014.safetensors",
"model.layers.4.input_layernorm.weight": "model-00002-of-00014.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00002-of-00014.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.5.input_layernorm.weight": "model-00003-of-00014.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00003-of-00014.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.6.input_layernorm.weight": "model-00003-of-00014.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00003-of-00014.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.7.input_layernorm.weight": "model-00003-of-00014.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00003-of-00014.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.8.input_layernorm.weight": "model-00004-of-00014.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00004-of-00014.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.9.input_layernorm.weight": "model-00004-of-00014.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00004-of-00014.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00004-of-00014.safetensors",
"model.norm.weight": "model-00014-of-00014.safetensors"
}
}
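
The mapping above assigns each tensor name to one of 14 shard files. A minimal sketch of inverting it, so that each shard lists the tensors it holds, is given below. It assumes the enclosing key is `weight_map`, as in the usual Hugging Face sharded-checkpoint index layout (the key is not visible in this excerpt). `INDEX_EXCERPT` and `tensors_by_shard` are hypothetical names used only for illustration.

```python
import json
from collections import defaultdict

# A few entries copied from the mapping above; the real index has many more.
# Assumption: the enclosing key is "weight_map" (not visible in this excerpt).
INDEX_EXCERPT = """
{
  "weight_map": {
    "model.layers.23.mlp.gate_proj.weight": "model-00008-of-00014.safetensors",
    "model.layers.23.mlp.up_proj.weight": "model-00009-of-00014.safetensors",
    "model.layers.39.self_attn.v_proj.weight": "model-00014-of-00014.safetensors",
    "model.norm.weight": "model-00014-of-00014.safetensors"
  }
}
"""


def tensors_by_shard(index: dict) -> dict:
    """Invert weight_map: shard file name -> sorted list of tensor names."""
    shards = defaultdict(list)
    for tensor_name, shard_file in index["weight_map"].items():
        shards[shard_file].append(tensor_name)
    return {shard: sorted(names) for shard, names in shards.items()}


index = json.loads(INDEX_EXCERPT)
grouped = tensors_by_shard(index)
for shard in sorted(grouped):
    print(shard, grouped[shard])
```

As the layer 23 entries show, a single layer can straddle two shards, so a loader that wants one layer may need to open more than one file.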

46
special_tokens_map.json Normal file

@@ -0,0 +1,46 @@
{
"additional_special_tokens": [
{
"content": "<|ASSISTANT|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
{
"content": "<|USER|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
],
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "[PAD]",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

93418
tokenizer.json Normal file

File diff suppressed because it is too large.

3
tokenizer.model Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
size 499723
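
The three lines above are a Git LFS pointer, not the tokenizer model itself: each line is a key followed by a space and a value. A minimal sketch of reading such a pointer is given below. `parse_lfs_pointer` and `POINTER_TEXT` are hypothetical names, and the sketch does no validation beyond what is shown.

```python
# Pointer text copied from the tokenizer.model entry above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
size 499723
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first space only; the value is the rest of the line.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["hash_algo"], pointer["size_bytes"])
```

The digest and size could then be compared against a downloaded `tokenizer.model` to confirm that the real file, rather than the pointer, is present.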

67
tokenizer_config.json Normal file

@@ -0,0 +1,67 @@
{
"added_tokens_decoder": {
"0": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32000": {
"content": "<|ASSISTANT|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32001": {
"content": "<|USER|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32002": {
"content": "[PAD]",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"additional_special_tokens": [
"<|ASSISTANT|>",
"<|USER|>"
],
"bos_token": "<s>",
"clean_up_tokenization_spaces": false,
"eos_token": "</s>",
"legacy": false,
"model_max_length": 4096,
"pad_token": "[PAD]",
"padding_side": "right",
"sp_model_kwargs": {},
"tokenizer_class": "LlamaTokenizer",
"unk_token": "<unk>",
"use_default_system_prompt": false
}
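
The configuration above places the two chat markers and `[PAD]` at ids 32000 to 32002, after the base vocabulary. A minimal sketch of building a token-to-id map from `added_tokens_decoder`, and of checking that every entry in `additional_special_tokens` has an id, is given below. It uses only the standard library on a trimmed copy of the config. `CONFIG_EXCERPT` and `token_id_map` are hypothetical names, and this is not how the tokenizer library itself loads the file.

```python
import json

# Trimmed copy of the config above: only the fields this sketch needs.
CONFIG_EXCERPT = """
{
  "added_tokens_decoder": {
    "0": {"content": "<unk>", "special": true},
    "1": {"content": "<s>", "special": true},
    "2": {"content": "</s>", "special": true},
    "32000": {"content": "<|ASSISTANT|>", "special": true},
    "32001": {"content": "<|USER|>", "special": true},
    "32002": {"content": "[PAD]", "special": true}
  },
  "additional_special_tokens": ["<|ASSISTANT|>", "<|USER|>"],
  "pad_token": "[PAD]"
}
"""


def token_id_map(config: dict) -> dict:
    """Map token text -> integer id, using the string keys of added_tokens_decoder."""
    return {
        entry["content"]: int(token_id)
        for token_id, entry in config["added_tokens_decoder"].items()
    }


config = json.loads(CONFIG_EXCERPT)
ids = token_id_map(config)

# Any additional special token without an id would indicate an inconsistent config.
missing = [tok for tok in config["additional_special_tokens"] if tok not in ids]
print(ids["<|USER|>"], ids[config["pad_token"]], missing)
```

Because three tokens are added beyond id 31999, the model's embedding table needs at least 32003 rows for these ids to be valid.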