初始化项目,由ModelHub XC社区提供模型

Model: Norquinal/llama-2-7b-claude-chat-rp
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-13 17:35:22 +08:00
commit 80cf2930ed
11 changed files with 93903 additions and 0 deletions

35
.gitattributes vendored Normal file
View File

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

38
README.md Normal file
View File

@@ -0,0 +1,38 @@
---
datasets:
- Norquinal/claude_pippa_bluemoon_2k
---
This is a [LLaMA-2-7b-hf](https://huggingface.co/NousResearch/Llama-2-7b-hf) model fine-tuned using QLoRA (4-bit precision) on my [claude_pippa_bluemoon_2k](https://huggingface.co/datasets/Norquinal/claude_pippa_bluemoon_2k) dataset, which is a randomized subset of ~1000 samples from my [claude_multiround_chat_30k](https://huggingface.co/datasets/Norquinal/claude_multiround_chat_30k) dataset combined with an additional ~600 randomized samplings from the [PIPPA](https://huggingface.co/datasets/PygmalionAI/PIPPA) and [Bluemoon RP](https://huggingface.co/datasets/Squish42/bluemoon-fandom-1-1-rp-cleaned) datasets. The intent was to improve the finetune's ability to RP and write creative fiction. To that end, it seems to have worked, if only marginally.
Do not take this model very seriously, it is probably not very good. I haven't a clue of what I'm doing. I just thought it was a fun thing to make.
## Prompt Format
The model was finetuned with the Vicuna 1.1 prompt format:
```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
USER: {instruction}
ASSISTANT:
```
## Example Usage
>USER: Write a story about a boy who wakes up in the middle of a magical forest.
>ASSISTANT: Once upon a time there was a young boy named Jack. One night he had a dream that he was walking through a beautiful enchanted forest filled with talking animals and strange creatures. When he opened his eyes it turned out to be real! He found himself standing in the center of this mysterious place surrounded by tall trees and lush greenery.
>
>The first thing Jack did was ask for directions from one of the many colorful birds flying overhead. They told him that if he followed the path straight ahead he would eventually come across a village inhabited by friendly fairies. So off he went down the well-worn trail leading deeper into the woods. After several hours of hiking he finally reached the edge of the fairy settlement where tiny houses made entirely of flowers dotted the landscape like jewels on a crown.
>
>Jack knocked at the door of the largest cottage which belonged to the king and queen of these mystical folk. Upon entering they welcomed him warmly as their newest guest. Over tea and scones they explained how magic flowed freely throughout their land but only those pure of heart could see its beauty. Jack promised not to tell anyone else about what he had witnessed here so long as he might return someday when needed most. With that assurance given, the fairies bid him goodnight and sent him back along the same path he came from - now knowing more than ever before just how special life truly is.
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Norquinal__llama-2-7b-claude-chat-rp)
| Metric | Value |
|-----------------------|---------------------------|
| Avg. | 44.78 |
| ARC (25-shot) | 54.95 |
| HellaSwag (10-shot) | 80.05 |
| MMLU (5-shot) | 47.03 |
| TruthfulQA (0-shot) | 43.47 |
| Winogrande (5-shot) | 74.74 |
| GSM8K (5-shot) | 7.28 |
| DROP (3-shot) | 5.94 |

26
config.json Normal file
View File

@@ -0,0 +1,26 @@
{
"_name_or_path": "NousResearch/Llama-2-7b-hf",
"architectures": [
"LlamaForCausalLM"
],
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 11008,
"max_position_embeddings": 4096,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 32,
"pad_token_id": 0,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"tie_word_embeddings": false,
"torch_dtype": "float16",
"transformers_version": "4.31.0",
"use_cache": true,
"vocab_size": 32000
}

9
generation_config.json Normal file
View File

@@ -0,0 +1,9 @@
{
"_from_model_config": true,
"bos_token_id": 1,
"eos_token_id": 2,
"pad_token_id": 32000,
"temperature": 0.9,
"top_p": 0.6,
"transformers_version": "4.31.0"
}

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b763e713569c78c2505b4aad7fbf167ddf97b45b7fbabd7b5093aa1c85d8c5ae
size 9976637886

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:173c98494d11053198a22f26510e133ffccb3c788a43c58bbfdce3503f1f75ed
size 3500316627

View File

@@ -0,0 +1,330 @@
{
"metadata": {
"total_size": 13476839424
},
"weight_map": {
"lm_head.weight": "pytorch_model-00002-of-00002.bin",
"model.embed_tokens.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.0.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.0.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.0.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.0.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.0.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.0.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.0.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.0.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.0.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.0.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.1.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.1.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.1.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.1.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.1.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.1.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.1.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.1.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.1.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.1.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.10.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.10.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.10.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.10.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.10.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.10.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.10.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.10.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.10.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.10.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.11.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.11.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.11.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.11.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.11.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.11.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.11.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.11.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.11.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.11.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.12.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.12.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.12.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.12.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.12.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.12.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.12.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.12.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.12.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.12.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.13.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.13.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.13.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.13.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.13.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.13.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.13.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.13.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.13.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.13.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.14.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.14.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.14.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.14.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.14.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.14.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.14.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.14.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.14.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.14.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.15.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.15.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.15.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.15.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.15.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.15.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.15.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.15.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.15.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.15.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.16.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.16.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.16.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.16.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.16.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.16.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.16.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.16.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.16.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.16.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.17.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.17.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.17.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.17.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.17.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.17.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.17.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.17.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.17.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.17.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.18.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.18.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.18.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.18.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.18.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.18.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.18.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.18.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.18.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.18.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.19.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.19.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.19.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.19.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.19.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.19.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.19.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.19.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.19.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.19.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.2.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.2.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.2.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.2.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.2.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.2.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.2.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.2.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.2.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.2.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.20.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.20.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.20.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.20.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.20.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.20.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.20.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.20.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.20.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.20.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.21.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.21.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.21.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.21.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.21.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.21.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.21.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.21.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.21.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.21.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.22.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.22.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.22.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.22.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.22.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.22.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.22.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.22.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.22.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.22.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.23.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.23.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.23.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.23.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.23.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.23.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.23.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.23.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.23.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.23.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.24.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.24.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.24.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.24.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.24.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.24.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.24.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.24.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.24.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
"model.layers.24.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.25.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.25.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.25.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.25.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.25.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.25.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.25.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.25.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.25.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
"model.layers.25.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.26.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.26.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.26.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.26.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.26.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.26.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.26.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.26.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.26.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
"model.layers.26.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.27.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.27.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.27.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.27.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.27.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.27.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.27.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.27.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.27.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
"model.layers.27.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.28.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.28.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.28.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.28.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.28.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.28.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.28.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.28.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.28.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
"model.layers.28.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.29.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.29.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.29.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.29.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.29.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.29.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.29.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.29.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.29.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
"model.layers.29.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.3.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.3.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.3.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.3.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.3.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.3.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.3.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.3.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.3.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.3.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.30.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.30.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.30.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.30.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.30.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.30.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.30.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.30.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.30.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
"model.layers.30.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.31.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.31.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.31.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.31.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.31.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.31.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.31.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.31.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.31.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
"model.layers.31.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
"model.layers.4.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.4.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.4.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.4.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.4.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.4.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.4.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.4.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.4.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.4.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.5.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.5.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.5.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.5.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.5.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.5.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.5.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.5.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.5.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.5.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.6.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.6.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.6.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.6.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.6.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.6.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.6.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.6.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.6.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.6.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.7.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.7.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.7.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.7.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.7.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.7.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.7.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.7.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.7.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.7.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.8.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.8.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.8.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.8.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.8.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.8.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.8.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.8.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.8.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.8.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.9.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.9.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.9.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.9.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.9.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.9.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.9.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.9.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.layers.9.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
"model.layers.9.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
"model.norm.weight": "pytorch_model-00002-of-00002.bin"
}
}

24
special_tokens_map.json Normal file
View File

@@ -0,0 +1,24 @@
{
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"pad_token": "</s>",
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}

93400
tokenizer.json Normal file

File diff suppressed because it is too large Load Diff

3
tokenizer.model Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
size 499723

32
tokenizer_config.json Normal file
View File

@@ -0,0 +1,32 @@
{
"bos_token": {
"__type": "AddedToken",
"content": "<s>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"clean_up_tokenization_spaces": false,
"eos_token": {
"__type": "AddedToken",
"content": "</s>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"legacy": false,
"model_max_length": 1000000000000000019884624838656,
"pad_token": null,
"sp_model_kwargs": {},
"tokenizer_class": "LlamaTokenizer",
"unk_token": {
"__type": "AddedToken",
"content": "<unk>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}