Initialize project; model provided by the ModelHub XC community

Model: FreedomIntelligence/AceGPT-v1.5-13B
Source: Original Platform
ModelHub XC
2026-05-11 02:22:31 +08:00
commit 3c01a0c06f
21 changed files with 131128 additions and 0 deletions

46
.gitattributes vendored Normal file

@@ -0,0 +1,46 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
model-00001-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00002-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00003-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00004-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00005-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00006-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00007-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00008-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00009-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00010-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00011-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text

83
README.md Normal file

@@ -0,0 +1,83 @@
---
license: apache-2.0
language:
- ar
- zh
- en
---
# <b>AceGPT</b>
AceGPT is a collection of fully fine-tuned generative text models based on LLaMA2, with a particular focus on the
Arabic language domain. This repository hosts version 1.5 of the 13B pre-trained model.

---
## Model Details
We have released the AceGPT family of large language models, a collection of fully fine-tuned generative text models based on LLaMA2, ranging from 7B to 13B parameters. The models fall into two main categories: AceGPT and AceGPT-chat, where AceGPT-chat is a version optimized for dialogue applications. In multiple benchmark tests, our models outperform all currently available open-source Arabic dialogue models. In our human evaluations, they also reach satisfaction levels comparable to those of some closed-source models, such as ChatGPT, in Arabic.
## Model Developers
We are from the King Abdullah University of Science and Technology (KAUST), the Chinese University of Hong Kong, Shenzhen (CUHKSZ), the Shenzhen Research Institute of Big Data (SRIBD), and King AbdulAziz University (KAU).
## Variations
The AceGPT family comes in two parameter sizes, 7B and 13B. Each size has a base variant and a -chat variant.
## Paper
The paper can be accessed at [link](https://huggingface.co/FreedomIntelligence/AceGPT-v1.5-13B-Chat/blob/main/Second_Language_(Arabic)_Acquisition_of_LLMs_via_Progressive_Vocabulary_Expansion.pdf).
## Input
The models accept text input only.
## Output
The models generate text output only.
## Model Evaluation Results
Benchmark evaluations on [Arabic MMLU](https://github.com/FreedomIntelligence/AceGPT) use accuracy as the metric and follow the evaluation framework available at https://github.com/FreedomIntelligence/AceGPT/tree/main.
| | STEM | Humanities | Social Sciences | Others | Average |
|------------------|------|------|------|------|------|
| Bloomz-7B-base | 33.35 | 29.29 | 37.58 | 34.53 | 33.69 |
| LLaMA2-7B-base | 30.30 | 29.33 | 27.46 | 30.78 | 29.37 |
| AceGPT-7B-base | 29.73 | 30.95 | 33.45 | 34.42 | 32.14 |
| AceGPT-v1.5-7B-base | 33.03 | 32.08 | 35.39 | 35.59 | 34.03 |
| LLaMA2-13B-base | 32.94 | 32.30 | 33.42 | 37.27 | 33.76 |
| Jais-13B-base | 30.51 | 31.25 | 33.74 | 33.42 | 33.76 |
| AceGPT-13B-base | 36.60 | 38.74 | 43.76 | <u>42.72</u> | 40.45 |
| AceGPT-v1.5-13B-base | <u>36.13</u> | <u>40.07</u> | <u>45.43</u> | 42.17 | <u>40.95</u> |
| Jais-30B-v1-base | 32.67 | 30.67 | 42.13 | 39.60 | 36.27 |
| ChatGPT 3.5 Turbo | **43.38** | **44.12** | **55.57** | **53.21** | **49.07** |
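
The scores above are accuracies over multiple-choice questions. As a rough illustration only, the sketch below shows one way such a score could be computed from raw completions: take the leading answer letter of each completion and compare it with the gold letter. The helper names `extract_choice` and `accuracy` are hypothetical, and the official evaluation framework may score answers differently (for example, by comparing option log-likelihoods).

```python
import re
from typing import List, Optional

# Matches an answer letter at the very start of a completion, e.g. "B\n\n...".
# Hypothetical helper; not taken from the AceGPT evaluation code.
_LEADING_CHOICE = re.compile(r"\s*([ABCD])\b")


def extract_choice(completion: str) -> Optional[str]:
    """Return the leading answer letter of a completion, or None if there is none."""
    match = _LEADING_CHOICE.match(completion)
    return match.group(1) if match else None


def accuracy(completions: List[str], gold: List[str]) -> float:
    """Percentage of completions whose leading letter equals the gold letter."""
    if len(completions) != len(gold):
        raise ValueError("completions and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(extract_choice(c) == g for c, g in zip(completions, gold))
    return 100.0 * correct / len(gold)
```

A completion with no leading letter counts as wrong here; that is a design choice of this sketch, not a documented rule of the benchmark.
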

Benchmark evaluations on [ArabicMMLU](https://github.com/mbzuai-nlp/ArabicMMLU) follow that benchmark's original settings.
| | STEM | Social Sciences | Humanities | Arabic Language | Other | Average |
|------------------|------|------|------|------|------|------|
| Bloomz-7B-base | - | - | - | - | - | - |
| LLaMA2-7B-base | 33.7 | 32.8 | 33.5 | 28.4 | 36.7 | 33.4 |
| AceGPT-7B-base | 35.4 | 35.9 | 36.2 | 31.1 | 41.7 | 36.3 |
| AceGPT-v1.5-7B-base | 36.7 | 36.5 | 34.1 | 30.0 | 41.2 | 37.0 |
| LLaMA2-13B-base | 32.9 | 35.0 | 37.8 | 35.8 | 39.3 | 36.1 |
| Jais-13B-base | 30.3 | 31.4 | 33.6 | 28.1 | 36.3 | 32.2 |
| AceGPT-13B-base | <u>42.7</u> | 45.5 | 48.3 | 42.4 | 50.7 | 46.1 |
| AceGPT-v1.5-13B-base | 42.4 | <u>45.7</u> | 48.4 | <u>46.3</u> | <u>52.5</u> | <u>47.6</u> |
| Jais-30B-v1-base | 39.5 | 45.6 | <u>50.5</u> | 34.6 | 49.1 | 44.8 |
| ChatGPT 3.5 Turbo | **53.8** | **57.0** | **57.5** | **57.6** | **63.8** | **57.7** |
## Samples
#### Sample 1 (abstract_algebra)
* <b>input:</b>
"فيما يلي أسئلة الاختيار من متعدد (مع الإجابات) حول جبر تجريدي\n\nسؤال: العثور على جميع قيم c في Z_3 بحيث يكون Z_3 [x]/(x^2+c) حقلًا.\nA. 0\nB. 1\nC. 2\nD. 3\nإجابة: B\n\nسؤال: البيان رقم 1 | إذا كان aH عنصرًا في مجموعة العوامل ، فإن | aH | يقسم | a |. البيان رقم 2 | إذا كانت H و K مجموعات فرعية لـ G ، فإن HK مجموعة فرعية لـ G.\nA. صحيح ، صحيح\nB. خطأ ، خطأ\nC. صحيح ، خطأ\nD. خطأ ، صحيح\nإجابة: B\n\nسؤال: العبارة 1 | كل عنصر من مجموعة يولد مجموعة دورية من المجموعة. العبارة 2 | المجموعة المتناظرة S_10 لديها 10 عناصر.\nA. صحيح، صحيح\nB. خطأ، خطأ\nC. صحيح، خطأ\nD. خطأ، صحيح\nإجابة: C\n\nسؤال: البيان 1| كل وظيفة من مجموعة محدودة على نفسها يجب أن تكون واحدة لكل مجموعة. البيان 2 | كل فرع فرعي لمجموعة أبيلية هو أبيلي.\nA. صحيح, صحيح\nB. خاطئ, خاطئ\nC. صحيح, خاطئ\nD. خاطئ, صحيح\nإجابة: A\n\nسؤال: اعثر على خاصية الحلقة 2Z.\nA. 0\nB. 3\nC. 12\nD. 30\nإجابة: A\n\nسؤال: ما هو الدرجة للامتداد الميداني الناتج من Q(sqrt(2), sqrt(3), sqrt(18)) على Q؟\nA. 0\nB. 4\nC. 2\nD. 6\nإجابة:"
* <b>output:</b>
"B\n\nسؤال: ما هو عدد العناصر"
#### Sample 2 (business_ethics)
* <b>input:</b>
"فيما يلي أسئلة الاختيار من متعدد (مع الإجابات) حول أخلاقيات الأعمال\n\nسؤال: ما هي الحجج الأخلاقية المتعلقة بالمسؤولية الاجتماعية للشركات؟\nA. التكاليف الخارجية، القوة، الاستقلالية\nB. الإعلام، الموارد الضعيفة، التبادل التعاوني\nC. الإعلام، القوة، الاستقلالية\nD. التكاليف الخارجية، القوة، التبادل التعاوني\nإجابة: D\n\nسؤال: _______ هو المحاولة المباشرة لإدارة القضايا الأخلاقية أو المشاكل، سواء بشكل رسمي أو غير رسمي، من خلال سياسات وممارسات وبرامج محددة.\nA. المسؤولية الاجتماعية للشركات\nB. إدارة الأخلاقيات العملية\nC. الاستدامة\nD. إدارة البيئة\nإجابة: B\n\nسؤال: لضمان استقلال أعضاء مجلس الإدارة غير التنفيذية ، هناك عدد من الخطوات التي يمكن اتخاذها ، والتي تشمل اختيار الغير التنفيذيين من _______ الشركة ، وتعيينهم لمدة _________ ، وكذلك تعيينهم _________.\nA. خارج الشركة ، محدودة ، بشكل مستقل\nB. من الداخل ، محدودة ، بشكل متقطع\nC. خارج الشركة ، غير محدودة ، بشكل متقطع\nD. من الداخل ، غير محدودة ، بشكل مستقل\nإجابة: A\n\nسؤال: ما هي الأساليب التي يمكن للمدير الأمني الذي يسعى لتحقيق أهدافه الاختيار بينها؟\nA. العمل المباشر الغير عنيف ، العمل المباشر العنيف ، العمل غير المباشر ، الحملة الدعائية\nB. العمل غير المباشر ، العمل الأوتيل ، العمل المباشر الغير عنيف ، الحملة الإعلامية\nC. العمل غير المباشر ، العمل المباشر العنيف ، العمل المباشر غير العنيف المباشر ، الحملة الدعائية\nD. العمل المباشر الغير عنيف ، العمل الأوتيل ، العمل غير المباشر ، الحملة الإعلامية\nإجابة: C\n\nسؤال: على عكس _______ ، تهدف _______ إلى مكافأة السلوك الإيجابي للشركات. تم تعزيز نجاح مثل هذه الحملات من خلال استخدام ___________, الذي يتيح للحملات تيسير تحقيق الشركة لــ _________ .\nA. الحملات الاستهلاكية، الحملات الاستهلاكية العامة، تكنولوجيا سلسلة الكتل، التبرعات الخيرية\nB. الحملات التحفيزية، الحملات الاستهلاكية العامة، التكنولوجيا الرقمية، زيادة المبيعات\nC. الحملات الاستهلاكية، الحملات الشرائية، تكنولوجيا سلسلة الكتل، التبرعات الخيرية\nD. 
المقاطعات، الحملات التحفيزية، الحملات الرقمية، زيادة المبيعات\nإجابة: D\n\nسؤال: تُصبح _______ مثل البيتكوين أكثر انتشارًا وتحمل مجموعة كبيرة من الآثار الأخلاقية المرتبطة بها، على سبيل المثال، إنها _______ وأكثر _______. ومع ذلك، تم استخدامها أيضًا للمشاركة في _______.\nA. العملات الرقمية، مكلفة، آمنة، جرائم مالية\nB. العملات التقليدية، رخيصة، غير آمنة، العطاء الخيري\nC. العملات الرقمية، رخيصة، آمنة، جرائم مالية\nD. العملات التقليدية، مكلفة، غير آمنة، العطاء الخيري\nإجابة:"
* <b>output:</b>
"A\n\nسؤال: _______ هو"
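
The sample inputs above share a fixed few-shot layout: a header naming the subject, several solved questions with options A to D, and a final question that ends with an open answer field. A minimal sketch of assembling such a prompt follows. The names `Question`, `format_question`, and `build_prompt` are hypothetical and not part of the AceGPT codebase; only the Arabic template strings are copied from the samples.

```python
from dataclasses import dataclass
from typing import Sequence

# Header text copied from the sample inputs; {subject} is the Arabic subject name.
HEADER = "فيما يلي أسئلة الاختيار من متعدد (مع الإجابات) حول {subject}\n\n"


@dataclass
class Question:
    text: str
    choices: Sequence[str]  # four options, in A-D order
    answer: str = ""        # gold letter; unused for the question to be answered


def format_question(q: Question, reveal_answer: bool) -> str:
    """Render one question block, with or without its gold answer."""
    lines = [f"سؤال: {q.text}"]
    lines += [f"{letter}. {choice}" for letter, choice in zip("ABCD", q.choices)]
    lines.append(f"إجابة: {q.answer}" if reveal_answer else "إجابة:")
    return "\n".join(lines)


def build_prompt(subject: str, shots: Sequence[Question], target: Question) -> str:
    """Join solved examples and the open target question into one prompt."""
    blocks = [format_question(q, reveal_answer=True) for q in shots]
    blocks.append(format_question(target, reveal_answer=False))
    return HEADER.format(subject=subject) + "\n\n".join(blocks)
```

Because the prompt ends with an open answer field, a base model is expected to continue with an answer letter and may then start a new question, as the sample outputs show.
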
# Reference
```
@article{zhu2025second,
title={Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion},
author={Zhu, Jianqing and Huang, Huang and Lin, Zhihang and Liang, Juhao and Tang, Zhengyang and Almubarak, Khalid and Alharthi, Mosen and An, Bang and He, Juncai and Wu, Xiangbo and Yu, Fei and Chen, Junying and Ma, Zhuoheng and Du, Yuhao and Hu, Yan and Zhang, He and Alghamdi, Emad A. and Zhang, Lian and Sun, Ruoyu and Li, Haizhou and Wang, Benyou and Xu, Jinchao},
journal={ACL 2025},
year={2025}
}
```

30
config.json Normal file

@@ -0,0 +1,30 @@
{
"_name_or_path": "AceGPT-v1.5-13B_1",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 5120,
"initializer_range": 0.02,
"intermediate_size": 13824,
"max_length": 4096,
"max_position_embeddings": 4096,
"model_type": "llama",
"num_attention_heads": 40,
"num_hidden_layers": 40,
"num_key_value_heads": 40,
"pad_token_id": 0,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.38.1",
"use_cache": true,
"vocab_size": 44800
}
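
As a sanity check on this configuration, the parameter count can be derived from the listed dimensions. The sketch below assumes the standard Llama layout: bias-free linear layers (consistent with `attention_bias: false`), one RMSNorm weight vector per norm, untied input and output embeddings, and a single final norm. The function name `llama_param_count` is hypothetical. Under these assumptions, the float32 byte count equals the `total_size` value recorded in the safetensors index further down.

```python
# Values copied from config.json above.
CONFIG = {
    "hidden_size": 5120,
    "intermediate_size": 13824,
    "num_hidden_layers": 40,
    "num_attention_heads": 40,
    "num_key_value_heads": 40,
    "vocab_size": 44800,
    "tie_word_embeddings": False,
}


def llama_param_count(cfg: dict) -> int:
    """Count weights for a Llama-style decoder, assuming no bias terms."""
    h = cfg["hidden_size"]
    kv_dim = h // cfg["num_attention_heads"] * cfg["num_key_value_heads"]
    attention = 2 * h * h + 2 * h * kv_dim   # q_proj, o_proj + k_proj, v_proj
    mlp = 3 * h * cfg["intermediate_size"]   # gate_proj, up_proj, down_proj
    norms = 2 * h                            # input and post-attention RMSNorm
    per_layer = attention + mlp + norms
    embeddings = cfg["vocab_size"] * h       # embed_tokens
    if not cfg["tie_word_embeddings"]:
        embeddings *= 2                      # separate lm_head
    return cfg["num_hidden_layers"] * per_layer + embeddings + h  # + final norm


params = llama_param_count(CONFIG)
bytes_fp32 = params * 4  # torch_dtype is float32, i.e. 4 bytes per weight
print(params)      # 13146936320
print(bytes_fp32)  # 52587745280
```
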

1
configuration.json Normal file

@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}

8
generation_config.json Normal file

@@ -0,0 +1,8 @@
{
"_from_model_config": true,
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 4096,
"pad_token_id": 0,
"transformers_version": "4.38.1"
}


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1803844f69256dafa51bf93729d0c78796658414757385a2653d7c928a9ae078
size 4933676424
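
Each weight shard is stored in the repository as a Git LFS pointer like the one above: a small text file of `key value` lines giving the spec version, the content hash, and the byte size of the real file. A minimal sketch of reading such a pointer follows. The function name `parse_lfs_pointer` and the returned field names are hypothetical choices for illustration.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash, and size fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer text copied from the first shard entry above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:1803844f69256dafa51bf93729d0c78796658414757385a2653d7c928a9ae078
size 4933676424
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size"])
```

Summing the `size` fields of all shard pointers gives the on-disk footprint of the weights, which can be compared with the free space available before downloading.
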


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b625ace0aa504a734b66059f6f6d1d236e26f4d82ac0192bdb81c1a65ca0813f
size 4970418112


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3c11faa2351ba11da5f98e0905d732dc3fa19766024deb9c36c291131ffbe71c
size 4970418120


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fe7407ac9c53ddf9a078648e27df6d6901765e8698d80dde5239b685d66160e1
size 4792119040


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:291c8bb5475c6a04d12f6e895bab25d98efd5518cca792c8b284d20807384bba
size 4792160232


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:92bd739fb98fc30cb6cb33e4183499a94710a4abf209bb5f7a501f4f9bb3c802
size 4792160224


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:28ffd541d8a1be910b39fa37b4248b936b8f54fc1938727e7bafb255951c0b77
size 4970418144


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b2778b7fcae680c47a92ce99fac084ba79bdd0380d077baf7d14235f4594004d
size 4970418144


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7394b54397d5dbcb9abf09d038e239ee7a0e3d31463c08357430201e4c0a4159
size 4970418144


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:dbd6691715a92e25116c803b56bee871d0686d8676daa99931b7cc86ede57c90
size 4970418144


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0f9632728b0b8b5e85b7cf20c9c96f12e45ecd6c4c24806f93f954d217ed59ca
size 3455162616


@@ -0,0 +1,370 @@
{
"metadata": {
"total_size": 52587745280
},
"weight_map": {
"lm_head.weight": "model-00011-of-00011.safetensors",
"model.embed_tokens.weight": "model-00001-of-00011.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.input_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.10.input_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.11.input_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.input_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.input_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.input_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.15.input_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.input_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.input_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.18.input_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.19.input_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.2.input_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.20.input_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.input_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.22.input_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.23.input_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.input_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.input_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.26.input_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.27.input_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.input_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.input_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.3.input_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.30.input_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.input_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.input_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.32.mlp.down_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.mlp.gate_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.mlp.up_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.post_attention_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.32.self_attn.k_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.self_attn.o_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.self_attn.q_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.self_attn.v_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.input_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.33.mlp.down_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.mlp.gate_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.mlp.up_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.post_attention_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.33.self_attn.k_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.self_attn.o_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.self_attn.q_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.self_attn.v_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.34.input_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.34.mlp.down_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.34.mlp.gate_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.34.mlp.up_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.34.post_attention_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.34.self_attn.k_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.34.self_attn.o_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.34.self_attn.q_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.34.self_attn.v_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.input_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.35.mlp.down_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.mlp.gate_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.mlp.up_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.post_attention_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.35.self_attn.k_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.self_attn.o_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.self_attn.q_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.self_attn.v_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.input_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.36.mlp.down_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.mlp.gate_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.mlp.up_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.post_attention_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.36.self_attn.k_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.self_attn.o_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.self_attn.q_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.self_attn.v_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.input_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.37.mlp.down_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.mlp.gate_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.mlp.up_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.post_attention_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.37.self_attn.k_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.self_attn.o_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.self_attn.q_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.self_attn.v_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.38.input_layernorm.weight": "model-00011-of-00011.safetensors",
"model.layers.38.mlp.down_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.38.mlp.gate_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.38.mlp.up_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.38.post_attention_layernorm.weight": "model-00011-of-00011.safetensors",
"model.layers.38.self_attn.k_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.38.self_attn.o_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.38.self_attn.q_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.38.self_attn.v_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.input_layernorm.weight": "model-00011-of-00011.safetensors",
"model.layers.39.mlp.down_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.mlp.gate_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.mlp.up_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.post_attention_layernorm.weight": "model-00011-of-00011.safetensors",
"model.layers.39.self_attn.k_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.self_attn.o_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.self_attn.q_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.self_attn.v_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.4.input_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.input_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.input_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.7.input_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.input_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.input_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00003-of-00011.safetensors",
"model.norm.weight": "model-00011-of-00011.safetensors"
}
}
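
Note on the index above: a layer's tensors do not always sit in a single shard. For example, `model.layers.34.self_attn.q_proj.weight` is in shard 00009 while the rest of layer 34 is in shard 00010. The following is a minimal sketch, not part of the commit, of how the `weight_map` could be inverted to see which tensors each shard holds. The helper name `tensors_by_shard` is hypothetical, and the inline `index` is a three-entry excerpt of the map rather than the full file.

```python
from collections import defaultdict


def tensors_by_shard(index: dict) -> dict[str, list[str]]:
    """Invert a safetensors index: shard file name -> sorted tensor names."""
    shards = defaultdict(list)
    for tensor_name, shard_file in index["weight_map"].items():
        shards[shard_file].append(tensor_name)
    # Sort shards and tensor names so the result is deterministic.
    return {shard: sorted(names) for shard, names in sorted(shards.items())}


# Excerpt copied from the weight map above (assumption: the real file would
# be read with json.load from the index JSON instead of being inlined).
index = {
    "weight_map": {
        "model.layers.34.self_attn.q_proj.weight": "model-00009-of-00011.safetensors",
        "model.layers.34.self_attn.k_proj.weight": "model-00010-of-00011.safetensors",
        "model.norm.weight": "model-00011-of-00011.safetensors",
    }
}

grouped = tensors_by_shard(index)
for shard, names in grouped.items():
    print(shard, len(names))
```

Grouping by shard this way makes it easy to tell which shard files must be present to load a given layer.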

23
special_tokens_map.json Normal file

@@ -0,0 +1,23 @@
{
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

130489
tokenizer.json Normal file

File diff suppressed because it is too large

3
tokenizer.model Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f0a4a2876dcbbbeaa71fd99d9cf32fe5f353b4e8976acbd44f0d0f674ba2275b
size 760271

42
tokenizer_config.json Normal file

@@ -0,0 +1,42 @@
{
"add_bos_token": true,
"add_eos_token": false,
"add_prefix_space": true,
"added_tokens_decoder": {
"0": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"bos_token": "<s>",
"clean_up_tokenization_spaces": false,
"eos_token": "</s>",
"legacy": true,
"model_max_length": 1000000000000000019884624838656,
"pad_token": null,
"sp_model_kwargs": {},
"spaces_between_special_tokens": false,
"tokenizer_class": "LlamaTokenizer",
"unk_token": "<unk>",
"use_default_system_prompt": false
}
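
Note on the tokenizer config above: `pad_token` is `null`, and `model_max_length` equals `int(1e30)`, a "no limit set" placeholder rather than a real context length. The following is a minimal sketch, not part of the commit, of how a consumer might guard against both. The helper names and the fallback of 4096 are assumptions for illustration, and reusing the EOS token for padding is one common workaround, not something this repository specifies.

```python
import json

# int(1e30) is exactly the value stored in "model_max_length" above.
NO_LIMIT_SENTINEL = int(1e30)

# Excerpt of tokenizer_config.json, inlined so the sketch is self-contained.
config = json.loads(
    '{"model_max_length": 1000000000000000019884624838656,'
    ' "pad_token": null, "eos_token": "</s>"}'
)


def effective_max_length(cfg: dict, fallback: int) -> int:
    """Return the configured length, or `fallback` when it is the placeholder."""
    value = cfg.get("model_max_length")
    if value is None or value >= NO_LIMIT_SENTINEL:
        return fallback
    return value


def padding_token(cfg: dict) -> str:
    """Use the configured pad token, else fall back to EOS (an assumption)."""
    return cfg.get("pad_token") or cfg["eos_token"]


# 4096 is a hypothetical fallback; the real limit comes from the model config.
max_length = effective_max_length(config, fallback=4096)
pad = padding_token(config)
print(max_length, pad)
```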