初始化项目,由ModelHub XC社区提供模型

Model: mncai/Mistral-7B-CollectiveCognition
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-20 08:48:50 +08:00
commit bd11c485a5
13 changed files with 554 additions and 0 deletions

52
.gitattributes vendored Normal file
View File

@@ -0,0 +1,52 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zstandard filter=lfs diff=lfs merge=lfs -text
*.tfevents* filter=lfs diff=lfs merge=lfs -text
*.db* filter=lfs diff=lfs merge=lfs -text
*.ark* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*data* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.meta filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.index filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.gguf* filter=lfs diff=lfs merge=lfs -text
*.ggml filter=lfs diff=lfs merge=lfs -text
*.llamafile* filter=lfs diff=lfs merge=lfs -text
*.pt2 filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
pytorch_model-00002-of-00003.bin filter=lfs diff=lfs merge=lfs -text
pytorch_model-00003-of-00003.bin filter=lfs diff=lfs merge=lfs -text
tokenizer.model filter=lfs diff=lfs merge=lfs -text
pytorch_model-00001-of-00003.bin filter=lfs diff=lfs merge=lfs -text

83
README.md Normal file
View File

@@ -0,0 +1,83 @@
---
pipeline_tag: text-generation
license: mit
language:
- en
library_name: transformers
tags:
- MindsAndCompany
datasets:
- CollectiveCognition/chats-data-2023-09-27
---
## Model Details
* **Developed by**: [Minds And Company](https://mnc.ai/)
* **Backbone Model**: [Mistral-7B-v0.1](mistralai/Mistral-7B-v0.1)
* **Library**: [HuggingFace Transformers](https://github.com/huggingface/transformers)
## Dataset Details
### Used Datasets
- CollectiveCognition/chats-data-2023-09-27
### Prompt Template
- Llama Prompt Template
## Limitations & Biases:
Llama2 and fine-tuned variants are a new technology that carries risks with use. Testing conducted to date has been in English, and has not covered, nor could it cover all scenarios. For these reasons, as with all LLMs, Llama 2 and any fine-tuned varient's potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 2 variants, developers should perform safety testing and tuning tailored to their specific applications of the model.
Please see the Responsible Use Guide available at https://ai.meta.com/llama/responsible-use-guide/
## License Disclaimer:
This model is bound by the license & usage restrictions of the original Llama-2 model. And comes with no warranty or gurantees of any kind.
## Contact Us
- [Minds And Company](https://mnc.ai/)
## Citiation:
Please kindly cite using the following BibTeX:
```bibtex
@misc{mukherjee2023orca,
title={Orca: Progressive Learning from Complex Explanation Traces of GPT-4},
author={Subhabrata Mukherjee and Arindam Mitra and Ganesh Jawahar and Sahaj Agarwal and Hamid Palangi and Ahmed Awadallah},
year={2023},
eprint={2306.02707},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
```
@misc{Orca-best,
title = {Orca-best: A filtered version of orca gpt4 dataset.},
author = {Shahul Es},
year = {2023},
publisher = {HuggingFace},
journal = {HuggingFace repository},
howpublished = {\url{https://huggingface.co/datasets/shahules786/orca-best/},
}
```
```
@software{touvron2023llama2,
title={Llama 2: Open Foundation and Fine-Tuned Chat Models},
author={Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava,
Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller,
Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez Madian Khabsa, Isabel Kloumann,
Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov,
Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith,
Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu , Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan,
Melanie Kambadur, Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom},
year={2023}
}
```
> Readme format: [Riiid/sheep-duck-llama-2-70b-v1.1](https://huggingface.co/Riiid/sheep-duck-llama-2-70b-v1.1)

6
added_tokens.json Normal file
View File

@@ -0,0 +1,6 @@
{
"</s>": 2,
"<s>": 1,
"<unk>": 0,
"[PAD]": 32000
}

28
config.json Normal file
View File

@@ -0,0 +1,28 @@
{
"_name_or_path": "/data/models/mistral/CollectiveCognition_chats-data-2023-09-27_/batch1_epochs4_lr1e-05_paged_adamw_32bit_cosine_warmup_0.05_max_grad1.0_grad_accu16",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"sliding_window": 4096,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.34.0",
"use_cache": true,
"vocab_size": 32000
}

1
configuration.json Normal file
View File

@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}

6
generation_config.json Normal file
View File

@@ -0,0 +1,6 @@
{
"_from_model_config": true,
"bos_token_id": 1,
"eos_token_id": 2,
"transformers_version": "4.34.0"
}

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9b8a0b9fac9280e6b8201649a4cd5e13e4b9f47a4bf1ce7ffc12d99649170a58
size 9886335397

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:699b56e2913a1b3088c073d28c5ab8566bb46fce24bb8b645af1c395c50853ef
size 9999650241

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:38c2e2f11aec4eedb6710cabc6eed2cd3ff9f7fb9000e3673495c71c359dae72
size 9081042299

View File

@@ -0,0 +1,298 @@
{
"metadata": {
"total_size": 28966928384
},
"weight_map": {
"lm_head.weight": "pytorch_model-00003-of-00003.bin",
"model.embed_tokens.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.10.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.10.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.10.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.10.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.10.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.10.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.10.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.10.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.10.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.11.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.11.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.11.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.11.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.11.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.11.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.11.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.11.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.11.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.2.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.2.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.2.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.2.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.2.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.2.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.2.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.2.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.2.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.20.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.20.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.20.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.20.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.20.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.20.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.20.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.20.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.20.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.22.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.22.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.22.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.22.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.22.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.22.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.22.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.22.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.22.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.23.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.23.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.23.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.23.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.23.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.23.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.23.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.23.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.23.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.3.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.3.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.3.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.3.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.3.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.3.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.3.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.3.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.3.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.30.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.30.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.30.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.30.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.30.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.30.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.30.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.30.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.30.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.4.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.4.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.4.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.4.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.4.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.4.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.4.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.4.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.4.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.norm.weight": "pytorch_model-00003-of-00003.bin"
}
}

12
special_tokens_map.json Normal file
View File

@@ -0,0 +1,12 @@
{
"additional_special_tokens": [
"<unk>",
"<s>",
"</s>",
"[PAD]"
],
"bos_token": "<s>",
"eos_token": "</s>",
"pad_token": "[PAD]",
"unk_token": "<unk>"
}

BIN
tokenizer.model (Stored with Git LFS) Normal file

Binary file not shown.

56
tokenizer_config.json Normal file
View File

@@ -0,0 +1,56 @@
{
"add_bos_token": true,
"add_eos_token": false,
"added_tokens_decoder": {
"0": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32000": {
"content": "[PAD]",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"additional_special_tokens": [
"<unk>",
"<s>",
"</s>",
"[PAD]"
],
"bos_token": "<s>",
"clean_up_tokenization_spaces": false,
"eos_token": "</s>",
"legacy": true,
"model_max_length": 1000000000000000019884624838656,
"pad_token": "[PAD]",
"sp_model_kwargs": {},
"spaces_between_special_tokens": false,
"tokenizer_class": "LlamaTokenizer",
"tokenizer_file": null,
"unk_token": "<unk>",
"use_default_system_prompt": true
}