Initialize project; model provided by the ModelHub XC community

Model: FreedomIntelligence/AceGPT-v1.5-13B-Chat
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-26 08:23:15 +08:00
commit 7ee70b6d55
22 changed files with 131121 additions and 0 deletions

47
.gitattributes vendored Normal file

@@ -0,0 +1,47 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
Second_Language_(Arabic)_Acquisition_of_LLMs_via_Progressive_Vocabulary_Expansion.pdf filter=lfs diff=lfs merge=lfs -text
model-00001-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00002-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00003-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00004-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00005-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00006-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00007-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00008-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00009-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00010-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
model-00011-of-00011.safetensors filter=lfs diff=lfs merge=lfs -text
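
The patterns above route matching files through Git LFS. As a rough illustration, the sketch below checks bare file names against a subset of those patterns with Python's `fnmatch`. It only approximates gitattributes matching (Git applies its own rules, for example for `**` and directory paths), and the helper name `is_lfs_tracked` is hypothetical.

```python
from fnmatch import fnmatch

# A subset of the patterns listed in the .gitattributes file above.
LFS_PATTERNS = ["*.bin", "*.safetensors", "*.pt", "*.pth", "*.model", "*tfevents*"]


def is_lfs_tracked(filename: str) -> bool:
    """Return True if the bare file name matches any LFS pattern (approximation)."""
    return any(fnmatch(filename, pattern) for pattern in LFS_PATTERNS)


if __name__ == "__main__":
    for name in ["model-00001-of-00011.safetensors", "config.json"]:
        print(name, is_lfs_tracked(name))
```

Under these assumptions, the weight shards match `*.safetensors`, while small JSON files such as `config.json` stay as regular Git blobs.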

72
README.md Normal file

@@ -0,0 +1,72 @@
---
license: apache-2.0
language:
- ar
- zh
- en
---
# <b>AceGPT</b>
AceGPT is a collection of fully fine-tuned generative text models based on LLaMA2, with a particular focus on the
Arabic language domain. This is the repository for version 1.5 of the 13B-chat pre-trained model.
---
## Model Details
We have released the AceGPT family of large language models, a collection of fully fine-tuned generative text models based on LLaMA2, ranging from 7B to 13B parameters. Our models fall into two main categories: AceGPT and AceGPT-chat. AceGPT-chat is an optimized version designed specifically for dialogue applications. In multiple benchmark tests, our models have outperformed all currently available open-source Arabic dialogue models. Furthermore, in our human evaluations, our models have reached satisfaction levels comparable to some closed-source models, such as ChatGPT, in the Arabic language.
## Model Developers
We are from the King Abdullah University of Science and Technology (KAUST), the Chinese University of Hong Kong, Shenzhen (CUHKSZ), the Shenzhen Research Institute of Big Data (SRIBD), and King AbdulAziz University (KAU).
## Variations
The AceGPT family comes in two parameter sizes, 7B and 13B. Each size has a base variant and a -chat variant.
## Paper
The paper can be accessed at [link](https://huggingface.co/FreedomIntelligence/AceGPT-v1.5-13B-Chat/blob/main/Second_Language_(Arabic)_Acquisition_of_LLMs_via_Progressive_Vocabulary_Expansion.pdf).
## Input
The models accept text input only.
## Output
The models generate text output only.
## Model Evaluation Results
Benchmark evaluations are conducted using accuracy or F1 scores as metrics, following the evaluation framework available at https://github.com/FreedomIntelligence/AceGPT/tree/main.
([**ArabicMMLU**](https://github.com/mbzuai-nlp/ArabicMMLU) is assessed based on its source settings.)
| Model | [**MMLU** (Huang et al. (2023))](https://github.com/FreedomIntelligence/AceGPT) | [ArabicMMLU](https://github.com/mbzuai-nlp/ArabicMMLU) | EXAMS | ACVA (clean) | ACVA (all) | BoolQ (trans) | ARC-C (trans) | Average |
|------------------|------|------|------|------|------|------|------|------|
| LLaMA2-7B-chat | 13.78 | 33.40 | 13.05 | 20.99 | 21.80 | 34.92 | 23.72 | 21.09 |
| Phoenix-7b | 29.72 | 44.74 | 31.93 | 43.80 | 41.86 | 66.70 | 33.53 | 41.75 |
| AceGPT-7B-chat | 30.69 | 36.31 | 33.73 | 53.87 | 53.07 | 60.70 | 38.05 | 43.77 |
| Mistral-7B-Instruct-v0.2 | 27.93 | 41.44 | 21.56 | 64.56 | 63.47 | 60.18 | 35.67 | 44.97 |
| **AceGPT-v1.5-7B-chat** | 45.77 | 56.62 | 43.69 | 69.46 | 70.86 | 72.45 | <u>60.49</u> | 59.90 |
| Jais-13B-chat | 19.52 | 54.83 | 19.71 | 66.75 | 61.41 | 41.25 | 11.95 | 39.34 |
| Llama2-13B-chat | 8.92 | 36.12 | 16.11 | 35.12 | 35.71 | 54.13 | 27.47 | 30.51 |
| AceGPT-13B-chat | 35.59 | 52.61 | 38.72 | 70.82 | 70.21 | 66.85 | 44.20 | 54.14 |
| **AceGPT-v1.5-13B-chat** | **47.33** | <u>61.70</u> | **48.37** | **76.90** | <u>76.37</u> | 69.33 | **63.99** | **63.42** |
| Jais-30B-chat-v1 | 38.12 | 59.33 | 40.45 | <u>74.46</u> | 72.41 | 73.76 | 50.94 | 58.49 |
| Jais-30B-chat-v3 | 35.68 | **62.36** | 32.24 | 73.63 | 73.66 | **76.30** | 51.02 | 57.84 |
| ChatGPT 3.5 Turbo | <u>46.07</u> | 57.72 | <u>45.63</u> | 74.45 | **76.88** | <u>76.12</u> | 60.24 | <u>62.44</u> |
## Samples
#### Sample 1 (abstract_algebra)
* <b>input:</b>
"<User>: فيما يلي أسئلة الاختيار من متعدد حول جبر تجريدي\n\nسؤال: ما هو الدرجة للامتداد الميداني الناتج من Q(sqrt(2), sqrt(3), sqrt(18)) على Q؟\nA. 0\nB. 4\nC. 2\nD. 6\nمن فضلك اختر إجابة واحدة من بين 'A، B، C، D' دون شرح. <Assistant>: "
* <b>output:</b>
"B\n\nالشرح:\n\nالامت"
#### Sample 2 (business_ethics)
* <b>input:</b>
"<User>: فيما يلي أسئلة الاختيار من متعدد حول أخلاقيات الأعمال\n\nسؤال: تُصبح _______ مثل البيتكوين أكثر انتشارًا وتحمل مجموعة كبيرة من الآثار الأخلاقية المرتبطة بها، على سبيل المثال، إنها _______ وأكثر _______. ومع ذلك، تم استخدامها أيضًا للمشاركة في _______.\nA. العملات الرقمية، مكلفة، آمنة، جرائم مالية\nB. العملات التقليدية، رخيصة، غير آمنة، العطاء الخيري\nC. العملات الرقمية، رخيصة، آمنة، جرائم مالية\nD. العملات التقليدية، مكلفة، غير آمنة، العطاء الخيري\nمن فضلك اختر إجابة واحدة من بين 'A، B، C، D' دون شرح. <Assistant>: "
* <b>output:</b>
"C\n\nالشرح:\n\nالإ"
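
The two samples above suggest a single-turn prompt layout of `<User>: ... <Assistant>: `, with the model's answer letter appearing first in the output. The following is a minimal sketch of that layout, inferred only from these samples; the helper names `build_prompt` and `extract_choice` are hypothetical, and multi-turn or system-prompt formatting is not covered here.

```python
from typing import Optional


def build_prompt(user_message: str) -> str:
    """Wrap a user message in the single-turn layout seen in the samples."""
    return f"<User>: {user_message} <Assistant>: "


def extract_choice(output: str, choices: str = "ABCD") -> Optional[str]:
    """Return the leading answer letter of a multiple-choice output, if any."""
    stripped = output.strip()
    if stripped and stripped[0] in choices:
        return stripped[0]
    return None


if __name__ == "__main__":
    print(repr(build_prompt("سؤال: ...")))
    print(extract_choice("B\n\nالشرح:\n\nالامت"))
```

Taking only the first character is a deliberate simplification: both sample outputs begin with the chosen letter and then continue with an explanation that is cut off.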
# Reference
```
@article{zhu2025second,
title={Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion},
author={Zhu, Jianqing and Huang, Huang and Lin, Zhihang and Liang, Juhao and Tang, Zhengyang and Almubarak, Khalid and Alharthi, Mosen and An, Bang and He, Juncai and Wu, Xiangbo and Yu, Fei and Chen, Junying and Ma, Zhuoheng and Du, Yuhao and Hu, Yan and Zhang, He and Alghamdi, Emad A. and Zhang, Lian and Sun, Ruoyu and Li, Haizhou and Wang, Benyou and Xu, Jinchao},
journal={ACL 2025},
year={2025}
}
```


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e41fbb8db18cb92486dc6e13edc6873fb73cd1dbb24382d756e7397b6530ae35
size 3652174
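
The three lines above are a Git LFS pointer: the repository stores this small text stub, while the actual file content lives in LFS storage. As a sketch of how such a stub can be read, the hypothetical helper `parse_lfs_pointer` below splits each line into a key and a value; it assumes exactly the `version`, `oid`, and `size` keys shown here and does no further validation.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer stub into its version, hash, and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:e41fbb8db18cb92486dc6e13edc6873fb73cd1dbb24382d756e7397b6530ae35
size 3652174
"""

if __name__ == "__main__":
    info = parse_lfs_pointer(POINTER)
    print(info["hash_algo"], info["size"])
```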

30
config.json Normal file

@@ -0,0 +1,30 @@
{
"_name_or_path": "AceGPT-v1.5-13B-Chat_1",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 5120,
"initializer_range": 0.02,
"intermediate_size": 13824,
"max_length": 4096,
"max_position_embeddings": 4096,
"model_type": "llama",
"num_attention_heads": 40,
"num_hidden_layers": 40,
"num_key_value_heads": 40,
"pad_token_id": 0,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.38.1",
"use_cache": true,
"vocab_size": 44800
}
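
The values in this config are enough to estimate the parameter count. The sketch below assumes the standard Llama layout: bias-free q/k/v/o projections (consistent with `"attention_bias": false` and with `num_key_value_heads` equal to `num_attention_heads`), a gate/up/down MLP, two RMSNorm weights per layer, one final norm, and separate input and output embeddings (`"tie_word_embeddings": false`). The function name `count_llama_params` is hypothetical.

```python
# Values copied from config.json above.
config = {
    "hidden_size": 5120,
    "intermediate_size": 13824,
    "num_hidden_layers": 40,
    "vocab_size": 44800,
    "tie_word_embeddings": False,
}


def count_llama_params(cfg: dict) -> int:
    """Count weights for a Llama-style decoder without biases or grouped KV heads."""
    h, i = cfg["hidden_size"], cfg["intermediate_size"]
    attention = 4 * h * h      # q_proj, k_proj, v_proj, o_proj
    mlp = 3 * h * i            # gate_proj, up_proj, down_proj
    norms = 2 * h              # input_layernorm, post_attention_layernorm
    per_layer = attention + mlp + norms
    embed = cfg["vocab_size"] * h
    lm_head = 0 if cfg["tie_word_embeddings"] else embed
    final_norm = h
    return cfg["num_hidden_layers"] * per_layer + embed + lm_head + final_norm


total_params = count_llama_params(config)
fp32_bytes = total_params * 4  # "torch_dtype": "float32" means 4 bytes per weight

if __name__ == "__main__":
    print(total_params)  # 13146936320
    print(fp32_bytes)    # 52587745280
```

Under these assumptions the float32 footprint equals the `total_size` of 52587745280 bytes recorded in the weight index file of this commit, which supports reading the checkpoint as roughly 13.1B parameters stored in full precision.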

1
configuration.json Normal file

@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}

8
generation_config.json Normal file

@@ -0,0 +1,8 @@
{
"_from_model_config": true,
"bos_token_id": 1,
"eos_token_id": 2,
"max_length": 4096,
"pad_token_id": 0,
"transformers_version": "4.38.1"
}


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:51be9cdadceb83943841887d9b935e20c363b6b4cc9d5082ab7c1e37e62eb85b
size 4933676424


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9b224430706b1d7f704096c86f21c2446eb0f0e1a0739a593521e8f564a7dbc6
size 4970418112


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:128fca1031c83ba815fa3d4f976d92f134f23bf284522685ead2b264cf305fc2
size 4970418120


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a5abc64e128d88cb993af1942b52b700453c0155098da00a431ec231a6efcbfe
size 4792119040


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9fed522b2bb7cbea30034b375206a64a2e007d73309d17feff29bd126c818c4f
size 4792160232


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8345fe71ccbc9cb64205dcd1e6a316034c87a6cb5014bef010876f7ca390a2bf
size 4792160224


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b9c20982235cb1379033c7afaf1485a68d1f4a82e9bfe2e1accf939d60f97395
size 4970418144


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:01ba4a41720b8f86de300dddf5e48d9094e144095a370b7ff13374858c4f69d4
size 4970418144


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a385aa89ad90eb3d09a0bc3c8583c3b0e36a2c91b465533e6bfcca926341657f
size 4970418144


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a61ae63f551761b2fdec57bb12c17b11ef99fef94304d4b1726bb8ee1ad6b0f8
size 4970418144


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ba2339fd638d838bbeecd85f334864597db079223c9cce4355b74c37a40bb07a
size 3455162616
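
As a sanity check, the sizes in the eleven pointers above can be summed and compared with the `total_size` value of 52587745280 bytes from the weight index file in this commit. The file names are not shown in this view, so the sketch assumes that these eleven pointers are the eleven `model-000NN-of-00011.safetensors` shards, and that the difference between the two totals is per-file header overhead.

```python
# Sizes in bytes, copied from the eleven LFS pointers above, in order.
SHARD_SIZES = [
    4933676424,
    4970418112,
    4970418120,
    4792119040,
    4792160232,
    4792160224,
    4970418144,
    4970418144,
    4970418144,
    4970418144,
    3455162616,
]

# Tensor bytes reported as metadata.total_size in the weight index file.
INDEX_TOTAL_SIZE = 52587745280

on_disk_total = sum(SHARD_SIZES)
overhead = on_disk_total - INDEX_TOTAL_SIZE

if __name__ == "__main__":
    print(f"on disk: {on_disk_total / 1e9:.2f} GB")
    print(f"overhead beyond tensor data: {overhead} bytes")
```

If the assumption holds, the download is about 52.6 GB, and only a few tens of kilobytes of that are not tensor data.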


@@ -0,0 +1,370 @@
{
"metadata": {
"total_size": 52587745280
},
"weight_map": {
"lm_head.weight": "model-00011-of-00011.safetensors",
"model.embed_tokens.weight": "model-00001-of-00011.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.input_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.10.input_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.11.input_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.input_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.input_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.input_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.15.input_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.input_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.input_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.18.input_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.19.input_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.2.input_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.20.input_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.input_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.22.input_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.23.input_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.input_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.input_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.26.input_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.27.input_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.input_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.input_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.3.input_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.30.input_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.input_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.input_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.32.mlp.down_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.mlp.gate_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.mlp.up_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.post_attention_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.32.self_attn.k_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.self_attn.o_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.self_attn.q_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.self_attn.v_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.input_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.33.mlp.down_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.mlp.gate_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.mlp.up_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.post_attention_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.33.self_attn.k_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.self_attn.o_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.self_attn.q_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.self_attn.v_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.34.input_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.34.mlp.down_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.34.mlp.gate_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.34.mlp.up_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.34.post_attention_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.34.self_attn.k_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.34.self_attn.o_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.34.self_attn.q_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.34.self_attn.v_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.input_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.35.mlp.down_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.mlp.gate_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.mlp.up_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.post_attention_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.35.self_attn.k_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.self_attn.o_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.self_attn.q_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.self_attn.v_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.input_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.36.mlp.down_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.mlp.gate_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.mlp.up_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.post_attention_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.36.self_attn.k_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.self_attn.o_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.self_attn.q_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.self_attn.v_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.input_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.37.mlp.down_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.mlp.gate_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.mlp.up_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.post_attention_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.37.self_attn.k_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.self_attn.o_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.self_attn.q_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.self_attn.v_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.38.input_layernorm.weight": "model-00011-of-00011.safetensors",
"model.layers.38.mlp.down_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.38.mlp.gate_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.38.mlp.up_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.38.post_attention_layernorm.weight": "model-00011-of-00011.safetensors",
"model.layers.38.self_attn.k_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.38.self_attn.o_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.38.self_attn.q_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.38.self_attn.v_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.input_layernorm.weight": "model-00011-of-00011.safetensors",
"model.layers.39.mlp.down_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.mlp.gate_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.mlp.up_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.post_attention_layernorm.weight": "model-00011-of-00011.safetensors",
"model.layers.39.self_attn.k_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.self_attn.o_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.self_attn.q_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.self_attn.v_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.4.input_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.input_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.input_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.7.input_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.input_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.input_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00003-of-00011.safetensors",
"model.norm.weight": "model-00011-of-00011.safetensors"
}
}

23
special_tokens_map.json Normal file

@@ -0,0 +1,23 @@
{
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

130489
tokenizer.json Normal file

File diff suppressed because it is too large

3
tokenizer.model Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f0a4a2876dcbbbeaa71fd99d9cf32fe5f353b4e8976acbd44f0d0f674ba2275b
size 760271

42
tokenizer_config.json Normal file

@@ -0,0 +1,42 @@
{
"add_bos_token": true,
"add_eos_token": false,
"add_prefix_space": true,
"added_tokens_decoder": {
"0": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"bos_token": "<s>",
"clean_up_tokenization_spaces": false,
"eos_token": "</s>",
"legacy": true,
"model_max_length": 1000000000000000019884624838656,
"pad_token": null,
"sp_model_kwargs": {},
"spaces_between_special_tokens": false,
"tokenizer_class": "LlamaTokenizer",
"unk_token": "<unk>",
"use_default_system_prompt": false
}