Initialize project; model provided by the ModelHub XC community

Model: nkpz/llama2-22b-daydreamer-v3
Source: Original Platform
ModelHub XC
2026-06-16 15:40:17 +08:00
commit 016f766d01
14 changed files with 640 additions and 0 deletions

35
.gitattributes vendored Normal file

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
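
The patterns above route large binary artifacts through Git LFS. As a rough illustration of which files they catch, here is a minimal sketch that matches a file's basename against a handful of the listed patterns. It is only an approximation: Git applies gitignore-style matching rules, while this uses Python's `fnmatch`, and both the `is_lfs_tracked` helper and the shortened `LFS_PATTERNS` list are hypothetical names introduced for the example.

```python
from fnmatch import fnmatchcase

# A subset of the patterns from the .gitattributes file above.
LFS_PATTERNS = ["*.bin", "*.safetensors", "*.pt", "*.tar.*", "*tfevents*"]


def is_lfs_tracked(path: str) -> bool:
    """Approximate check: does the basename match any LFS pattern?

    Directory patterns such as saved_model/**/* are not handled here.
    """
    name = path.rsplit("/", 1)[-1]
    return any(fnmatchcase(name, pattern) for pattern in LFS_PATTERNS)


for candidate in ["pytorch_model-00001-of-00005.bin", "config.json", "README.md"]:
    print(candidate, is_lfs_tracked(candidate))
```

Under this sketch the weight shards match `*.bin`, while the JSON configs and the README stay as ordinary Git blobs.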

83
README.md Normal file

@@ -0,0 +1,83 @@
---
license: other
---
**What is it?**
Llama 2 13b expanded to the size of a Llama 1 33b model in certain areas, with the empty surrounding space filled with Llama 33b data. (Base Model: https://huggingface.co/chargoddard/llama2-22b-blocktriangular) It was then finetuned on a 3090 by creating large LoRAs and merging them. When I first started with 22b models, I looked for signs of knowledge transfer but didn't see any, so that's not a goal. The goal is just to throw lots of data at it until it adapts well to its surgically implanted parts. The datasets used are a mix of instruction, roleplay, and conversational data, often curated.
Continuation of https://huggingface.co/nkpz/llama2-22b-daydreamer-v2
Gave it an epoch of https://huggingface.co/datasets/jondurbin/airoboros-gpt4-1.4.1-mpt which it had never seen before - it had only seen Airoboros 2.0.
I excluded all coding rows from this dataset - a dedicated model is better for that imo. I wanted it to focus on answering well in English.
I thought this was a pretty good improvement. Its ability to roleplay and imitate speech styles was enhanced, and I feel like it's still sharpening up its problem-solving skills.
Its output style is right where I want it: dry and accurate enough when it's being asked for information, but with a lot of character when it's answering fun prompts.
For input, it's flexible. It understands the Alpaca format well, but has been trained to handle a variety of input styles at this point.
I don't know what's next. I think I need to start benchmarking so I can more formally identify its weak points.
Examples edited with additional newlines to make it easier to read here.
**Example 1: Q&A**
```
Question: What is happening to the newspaper industry? Respond in a pirate accent.
Answer:
The newspaper industry is being plundered by scurvy dogs and buccaneers like ye!
They're pillagin' yer booty with their fancy digital devices, like smartphones and tablets,
which make it easier for them to read online content instead of buyin' printed copies from us.
And don't forget about those pesky social media sites that are spreading rumors faster than the
winds on a galleon! So it's all hands on deck as we fight back against these scoundrels, or else
we'll be walkin' the plank soon enough! Avast ye landlubbers!
```
**Example 2: Alpaca Instruction**
```
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
Angrily explain why pizza shouldn't be made with bad tomatoes.
### Response:
Listen, I know you think it's funny to make pizza with those rotten tomatoes, but let me tell you something -
bad tomatoes are not only disgusting and unhealthy, they also ruin the flavor of your pizza! You want a
delicious pie with fresh, juicy tomatoes that burst in your mouth, right?
Well, then, stop using those nasty tomatoes and stick to good quality ingredients for a truly satisfying pizza experience!
Trust me, once you try it, you won't go back! So please, don't even bother wasting your time making pizza with bad tomatoes
anymore; it's just not worth it. It makes me angry just thinking about it!
```
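A prompt in the shape of Example 2 can be assembled with a small helper. This is only a sketch: `ALPACA_TEMPLATE` and `build_alpaca_prompt` are hypothetical names, and the exact line spacing is an assumption, since the examples here were edited with extra newlines for readability.

```python
# Template mirroring the Alpaca-style prompt shown in Example 2.
# The single-newline spacing is an assumption, not a confirmed training format.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "### Instruction:\n"
    "{instruction}\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str) -> str:
    """Fill the template, leaving the response section open for the model."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


prompt = build_alpaca_prompt(
    "Angrily explain why pizza shouldn't be made with bad tomatoes."
)
print(prompt)
```

The prompt ends right after `### Response:` so that whatever the model generates next is the answer.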
**Example 3: Chat / Roleplay**
```
You are speaking to Andre The Giant.
You: Hey, you're that wrestler guy, right?
Andre The Giant: Yes, I am indeed Andre the Giant. How can I help you?
You: So, how tall are you really? Like, what's your height in feet and inches?
Andre The Giant: Well, let's see... I stand at 7 ft 4 in (229 cm) tall.
You: Wow! That's so tall! And how much do you weigh?
Andre The Giant: I weigh around 500 lbs (227 kg). It's quite heavy, isn't it?
You: Yeah, I guess so. You must be strong too, considering your size.
Andre The Giant: Thank you for saying that. Strength is definitely important when dealing with my profession.
You: What made you decide to become a professional wrestler?
Andre The Giant: Well, I grew up watching wrestling on television, and I always admired the athleticism
and showmanship involved. When I got older, I decided to pursue it myself.
You: That's cool. Do you have any advice for aspiring wrestlers out there?
Andre The Giant: Absolutely! Always stay dedicated and focused, train hard, and never give up on yourself
or your dreams. Remember, anything is possible if you believe in yourself.
```
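The chat layout in Example 3 is a scene-setting line followed by `Speaker: text` turns. A minimal sketch of building such a transcript, ending with the character's name so the model continues in that voice, might look like this. `build_chat_prompt` is a hypothetical helper, and ending on an open speaker label is an assumption about how to cue the next reply.

```python
from typing import List, Tuple


def build_chat_prompt(character: str, turns: List[Tuple[str, str]]) -> str:
    """Render a roleplay transcript and leave the character's next turn open."""
    lines = [f"You are speaking to {character}."]
    for speaker, text in turns:
        lines.append(f"{speaker}: {text}")
    # Assumption: an open speaker label cues the model to answer in character.
    lines.append(f"{character}:")
    return "\n".join(lines)


chat = build_chat_prompt(
    "Andre The Giant",
    [("You", "Hey, you're that wrestler guy, right?")],
)
print(chat)
```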
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_nkpz__llama2-22b-daydreamer-v3)
| Metric | Value |
|-----------------------|---------------------------|
| Avg. | 45.31 |
| ARC (25-shot) | 56.06 |
| HellaSwag (10-shot) | 80.07 |
| MMLU (5-shot) | 52.49 |
| TruthfulQA (0-shot) | 42.43 |
| Winogrande (5-shot) | 73.48 |
| GSM8K (5-shot) | 3.79 |
| DROP (3-shot) | 8.85 |
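
The `Avg.` row appears to be the plain mean of the seven benchmark scores. A quick check, using only the values from the table above (the `scores` dictionary is just a container introduced for the example):

```python
# Scores copied from the leaderboard table above.
scores = {
    "ARC (25-shot)": 56.06,
    "HellaSwag (10-shot)": 80.07,
    "MMLU (5-shot)": 52.49,
    "TruthfulQA (0-shot)": 42.43,
    "Winogrande (5-shot)": 73.48,
    "GSM8K (5-shot)": 3.79,
    "DROP (3-shot)": 8.85,
}

# Unweighted mean across all seven benchmarks, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)
print(average)
```

The result agrees with the reported 45.31, which shows how strongly the low GSM8K and DROP scores pull the average down.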

3
added_tokens.json Normal file

@@ -0,0 +1,3 @@
{
"<pad>": 32000
}

26
config.json Normal file

@@ -0,0 +1,26 @@
{
"_name_or_path": "./llama22b_twax",
"architectures": [
"LlamaForCausalLM"
],
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 6656,
"initializer_range": 0.02,
"intermediate_size": 17920,
"max_position_embeddings": 4098,
"model_type": "llama",
"num_attention_heads": 52,
"num_hidden_layers": 40,
"num_key_value_heads": 52,
"pad_token_id": 0,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"tie_word_embeddings": false,
"torch_dtype": "float16",
"transformers_version": "4.31.0",
"use_cache": false,
"vocab_size": 32000
}
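
The "22b" in the model name can be checked against this config. The sketch below counts parameters for a standard Llama layout with no biases, full-size key/value projections (since `num_key_value_heads` equals `num_attention_heads`), and untied input and output embeddings. It then compares the fp16 footprint with the `total_size` of 43655928832 reported in the weight index metadata in this commit. The treatment of the `rotary_emb.inv_freq` buffers (one per layer, half the head dimension, stored as float32) is an assumption made to reconcile the numbers, not something the config states.

```python
# Values copied from config.json above.
config = {
    "hidden_size": 6656,
    "intermediate_size": 17920,
    "num_attention_heads": 52,
    "num_hidden_layers": 40,
    "vocab_size": 32000,
}


def count_parameters(cfg: dict) -> int:
    """Count trainable weights for a bias-free Llama-style decoder."""
    h = cfg["hidden_size"]
    attention = 4 * h * h                      # q_proj, k_proj, v_proj, o_proj
    mlp = 3 * h * cfg["intermediate_size"]     # gate_proj, up_proj, down_proj
    layer_norms = 2 * h                        # input and post-attention RMSNorm
    per_layer = attention + mlp + layer_norms
    embeddings = 2 * cfg["vocab_size"] * h     # embed_tokens and lm_head (untied)
    final_norm = h
    return cfg["num_hidden_layers"] * per_layer + embeddings + final_norm


head_dim = config["hidden_size"] // config["num_attention_heads"]
params = count_parameters(config)

# fp16 weights take 2 bytes each. Assumption: each layer also stores an
# inv_freq buffer of head_dim // 2 float32 values (4 bytes each).
weight_bytes = 2 * params
rotary_bytes = config["num_hidden_layers"] * (head_dim // 2) * 4
estimated_total = weight_bytes + rotary_bytes

print(head_dim, params, estimated_total)
```

Under these assumptions the count comes to roughly 21.8 billion parameters with a head dimension of 128, and the estimated byte total lines up with the index metadata.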

7
generation_config.json Normal file

@@ -0,0 +1,7 @@
{
"_from_model_config": true,
"bos_token_id": 1,
"eos_token_id": 2,
"pad_token_id": 0,
"transformers_version": "4.31.0"
}


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:90ba2e7b459efaff76ad03eace9fd14df833687bc810e29104f9fefc90d76d82
size 9818324627


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:527acca1e3a70433a913cd4aa24dd54a483d0cba9e9b92122051a78ea07561fd
size 9958102743


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6ef709cc66d2daa867b85c324b8b4ada9bcc0cd5bbba36781903a640f1161eeb
size 9896734715


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a1e38f674a0ce7d61fe24aa92c8c74aa6a04c29a4d7a2b1e745607f4b9689d46
size 9869470481


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9d86ec83dc89cf8e0890d4633ceac29c5c62ba99867dde92727a962b9f78cead
size 4113434071
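
Each of the five files above is a Git LFS pointer: three `key value` lines that give the spec version, the SHA-256 object ID, and the size in bytes of the real blob. A minimal sketch of reading one pointer and totalling the five sizes follows. `parse_lfs_pointer` is a hypothetical helper, and the file names are not shown in this view, so treating these pointers as the five weight shards is an assumption.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


# First pointer from this commit, copied verbatim.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:90ba2e7b459efaff76ad03eace9fd14df833687bc810e29104f9fefc90d76d82\n"
    "size 9818324627\n"
)

# Sizes of all five pointers listed above, in order.
sizes = [9818324627, 9958102743, 9896734715, 9869470481, 4113434071]
total_bytes = sum(sizes)
print(pointer["algorithm"], pointer["size"], total_bytes)
```

The files add up to about 43.66 GB on disk, slightly more than the tensor bytes alone, which would be consistent with some per-file serialization overhead.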


@@ -0,0 +1,410 @@
{
"metadata": {
"total_size": 43655928832
},
"weight_map": {
"lm_head.weight": "pytorch_model-00005-of-00005.bin",
"model.embed_tokens.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.0.input_layernorm.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.0.mlp.down_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.0.mlp.gate_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.0.mlp.up_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.0.post_attention_layernorm.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.0.self_attn.k_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.0.self_attn.o_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.0.self_attn.q_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.0.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00005.bin",
"model.layers.0.self_attn.v_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.1.input_layernorm.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.1.mlp.down_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.1.mlp.gate_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.1.mlp.up_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.1.post_attention_layernorm.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.1.self_attn.k_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.1.self_attn.o_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.1.self_attn.q_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.1.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00005.bin",
"model.layers.1.self_attn.v_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.10.input_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.10.mlp.down_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.10.mlp.gate_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.10.mlp.up_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.10.post_attention_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.10.self_attn.k_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.10.self_attn.o_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.10.self_attn.q_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.10.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00005.bin",
"model.layers.10.self_attn.v_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.11.input_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.11.mlp.down_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.11.mlp.gate_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.11.mlp.up_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.11.post_attention_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.11.self_attn.k_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.11.self_attn.o_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.11.self_attn.q_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.11.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00005.bin",
"model.layers.11.self_attn.v_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.12.input_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.12.mlp.down_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.12.mlp.gate_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.12.mlp.up_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.12.post_attention_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.12.self_attn.k_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.12.self_attn.o_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.12.self_attn.q_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.12.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00005.bin",
"model.layers.12.self_attn.v_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.13.input_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.13.mlp.down_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.13.mlp.gate_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.13.mlp.up_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.13.post_attention_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.13.self_attn.k_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.13.self_attn.o_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.13.self_attn.q_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.13.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00005.bin",
"model.layers.13.self_attn.v_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.14.input_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.14.mlp.down_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.14.mlp.gate_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.14.mlp.up_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.14.post_attention_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.14.self_attn.k_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.14.self_attn.o_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.14.self_attn.q_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.14.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00005.bin",
"model.layers.14.self_attn.v_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.15.input_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.15.mlp.down_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.15.mlp.gate_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.15.mlp.up_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.15.post_attention_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.15.self_attn.k_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.15.self_attn.o_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.15.self_attn.q_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.15.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00005.bin",
"model.layers.15.self_attn.v_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.16.input_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.16.mlp.down_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.16.mlp.gate_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.16.mlp.up_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.16.post_attention_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.16.self_attn.k_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.16.self_attn.o_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.16.self_attn.q_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.16.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00005.bin",
"model.layers.16.self_attn.v_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.17.input_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.17.mlp.down_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.17.mlp.gate_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.17.mlp.up_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.17.post_attention_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.17.self_attn.k_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.17.self_attn.o_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.17.self_attn.q_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.17.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00005.bin",
"model.layers.17.self_attn.v_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.18.input_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.18.mlp.down_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.18.mlp.gate_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.18.mlp.up_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.18.post_attention_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.18.self_attn.k_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.18.self_attn.o_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.18.self_attn.q_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.18.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00005.bin",
"model.layers.18.self_attn.v_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.19.input_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.19.mlp.down_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.19.mlp.gate_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.19.mlp.up_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.19.post_attention_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.19.self_attn.k_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.19.self_attn.o_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.19.self_attn.q_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.19.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00005.bin",
"model.layers.19.self_attn.v_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.2.input_layernorm.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.2.mlp.down_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.2.mlp.gate_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.2.mlp.up_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.2.post_attention_layernorm.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.2.self_attn.k_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.2.self_attn.o_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.2.self_attn.q_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.2.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00005.bin",
"model.layers.2.self_attn.v_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.20.input_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.20.mlp.down_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.20.mlp.gate_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.20.mlp.up_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.20.post_attention_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.20.self_attn.k_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.20.self_attn.o_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.20.self_attn.q_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.20.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00005.bin",
"model.layers.20.self_attn.v_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.21.input_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.21.mlp.down_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.21.mlp.gate_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.21.mlp.up_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.21.post_attention_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.21.self_attn.k_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.21.self_attn.o_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.21.self_attn.q_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.21.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00005.bin",
"model.layers.21.self_attn.v_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.22.input_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.22.mlp.down_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.22.mlp.gate_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.22.mlp.up_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.22.post_attention_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.22.self_attn.k_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.22.self_attn.o_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.22.self_attn.q_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.22.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00005.bin",
"model.layers.22.self_attn.v_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.23.input_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.23.mlp.down_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.23.mlp.gate_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.23.mlp.up_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.23.post_attention_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.23.self_attn.k_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.23.self_attn.o_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.23.self_attn.q_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.23.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00005.bin",
"model.layers.23.self_attn.v_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.24.input_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.24.mlp.down_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.24.mlp.gate_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.24.mlp.up_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.24.post_attention_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.24.self_attn.k_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.24.self_attn.o_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.24.self_attn.q_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.24.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00005.bin",
"model.layers.24.self_attn.v_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.25.input_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.25.mlp.down_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.25.mlp.gate_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.25.mlp.up_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.25.post_attention_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.25.self_attn.k_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.25.self_attn.o_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.25.self_attn.q_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.25.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00005.bin",
"model.layers.25.self_attn.v_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.26.input_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.26.mlp.down_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.26.mlp.gate_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.26.mlp.up_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.26.post_attention_layernorm.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.26.self_attn.k_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.26.self_attn.o_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.26.self_attn.q_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.26.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00005.bin",
"model.layers.26.self_attn.v_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.27.input_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.27.mlp.down_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.27.mlp.gate_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.27.mlp.up_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.27.post_attention_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.27.self_attn.k_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.27.self_attn.o_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.27.self_attn.q_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.27.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00005.bin",
"model.layers.27.self_attn.v_proj.weight": "pytorch_model-00003-of-00005.bin",
"model.layers.28.input_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.28.mlp.down_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.28.mlp.gate_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.28.mlp.up_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.28.post_attention_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.28.self_attn.k_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.28.self_attn.o_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.28.self_attn.q_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.28.self_attn.rotary_emb.inv_freq": "pytorch_model-00004-of-00005.bin",
"model.layers.28.self_attn.v_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.29.input_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.29.mlp.down_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.29.mlp.gate_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.29.mlp.up_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.29.post_attention_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.29.self_attn.k_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.29.self_attn.o_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.29.self_attn.q_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.29.self_attn.rotary_emb.inv_freq": "pytorch_model-00004-of-00005.bin",
"model.layers.29.self_attn.v_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.3.input_layernorm.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.3.mlp.down_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.3.mlp.gate_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.3.mlp.up_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.3.post_attention_layernorm.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.3.self_attn.k_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.3.self_attn.o_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.3.self_attn.q_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.3.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00005.bin",
"model.layers.3.self_attn.v_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.30.input_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.30.mlp.down_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.30.mlp.gate_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.30.mlp.up_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.30.post_attention_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.30.self_attn.k_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.30.self_attn.o_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.30.self_attn.q_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.30.self_attn.rotary_emb.inv_freq": "pytorch_model-00004-of-00005.bin",
"model.layers.30.self_attn.v_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.31.input_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.31.mlp.down_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.31.mlp.gate_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.31.mlp.up_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.31.post_attention_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.31.self_attn.k_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.31.self_attn.o_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.31.self_attn.q_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.31.self_attn.rotary_emb.inv_freq": "pytorch_model-00004-of-00005.bin",
"model.layers.31.self_attn.v_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.32.input_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.32.mlp.down_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.32.mlp.gate_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.32.mlp.up_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.32.post_attention_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.32.self_attn.k_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.32.self_attn.o_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.32.self_attn.q_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.32.self_attn.rotary_emb.inv_freq": "pytorch_model-00004-of-00005.bin",
"model.layers.32.self_attn.v_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.33.input_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.33.mlp.down_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.33.mlp.gate_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.33.mlp.up_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.33.post_attention_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.33.self_attn.k_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.33.self_attn.o_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.33.self_attn.q_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.33.self_attn.rotary_emb.inv_freq": "pytorch_model-00004-of-00005.bin",
"model.layers.33.self_attn.v_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.34.input_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.34.mlp.down_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.34.mlp.gate_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.34.mlp.up_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.34.post_attention_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.34.self_attn.k_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.34.self_attn.o_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.34.self_attn.q_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.34.self_attn.rotary_emb.inv_freq": "pytorch_model-00004-of-00005.bin",
"model.layers.34.self_attn.v_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.35.input_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.35.mlp.down_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.35.mlp.gate_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.35.mlp.up_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.35.post_attention_layernorm.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.35.self_attn.k_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.35.self_attn.o_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.35.self_attn.q_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.35.self_attn.rotary_emb.inv_freq": "pytorch_model-00004-of-00005.bin",
"model.layers.35.self_attn.v_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.36.input_layernorm.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.36.mlp.down_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.36.mlp.gate_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.36.mlp.up_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.36.post_attention_layernorm.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.36.self_attn.k_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.36.self_attn.o_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.36.self_attn.q_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.36.self_attn.rotary_emb.inv_freq": "pytorch_model-00004-of-00005.bin",
"model.layers.36.self_attn.v_proj.weight": "pytorch_model-00004-of-00005.bin",
"model.layers.37.input_layernorm.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.37.mlp.down_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.37.mlp.gate_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.37.mlp.up_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.37.post_attention_layernorm.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.37.self_attn.k_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.37.self_attn.o_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.37.self_attn.q_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.37.self_attn.rotary_emb.inv_freq": "pytorch_model-00005-of-00005.bin",
"model.layers.37.self_attn.v_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.38.input_layernorm.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.38.mlp.down_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.38.mlp.gate_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.38.mlp.up_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.38.post_attention_layernorm.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.38.self_attn.k_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.38.self_attn.o_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.38.self_attn.q_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.38.self_attn.rotary_emb.inv_freq": "pytorch_model-00005-of-00005.bin",
"model.layers.38.self_attn.v_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.39.input_layernorm.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.39.mlp.down_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.39.mlp.gate_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.39.mlp.up_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.39.post_attention_layernorm.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.39.self_attn.k_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.39.self_attn.o_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.39.self_attn.q_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.39.self_attn.rotary_emb.inv_freq": "pytorch_model-00005-of-00005.bin",
"model.layers.39.self_attn.v_proj.weight": "pytorch_model-00005-of-00005.bin",
"model.layers.4.input_layernorm.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.4.mlp.down_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.4.mlp.gate_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.4.mlp.up_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.4.post_attention_layernorm.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.4.self_attn.k_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.4.self_attn.o_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.4.self_attn.q_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.4.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00005.bin",
"model.layers.4.self_attn.v_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.5.input_layernorm.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.5.mlp.down_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.5.mlp.gate_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.5.mlp.up_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.5.post_attention_layernorm.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.5.self_attn.k_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.5.self_attn.o_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.5.self_attn.q_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.5.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00005.bin",
"model.layers.5.self_attn.v_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.6.input_layernorm.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.6.mlp.down_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.6.mlp.gate_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.6.mlp.up_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.6.post_attention_layernorm.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.6.self_attn.k_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.6.self_attn.o_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.6.self_attn.q_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.6.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00005.bin",
"model.layers.6.self_attn.v_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.7.input_layernorm.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.7.mlp.down_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.7.mlp.gate_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.7.mlp.up_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.7.post_attention_layernorm.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.7.self_attn.k_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.7.self_attn.o_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.7.self_attn.q_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.7.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00005.bin",
"model.layers.7.self_attn.v_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.8.input_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.8.mlp.down_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.8.mlp.gate_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.8.mlp.up_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.8.post_attention_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.8.self_attn.k_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.8.self_attn.o_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.8.self_attn.q_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.8.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00005.bin",
"model.layers.8.self_attn.v_proj.weight": "pytorch_model-00001-of-00005.bin",
"model.layers.9.input_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.9.mlp.down_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.9.mlp.gate_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.9.mlp.up_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.9.post_attention_layernorm.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.9.self_attn.k_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.9.self_attn.o_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.9.self_attn.q_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.layers.9.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00005.bin",
"model.layers.9.self_attn.v_proj.weight": "pytorch_model-00002-of-00005.bin",
"model.norm.weight": "pytorch_model-00005-of-00005.bin"
}
}
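
The weight map above assigns each tensor name to one of five shard files, and a layer's tensors can straddle two shards (layer 36, for example, has entries in both `00004` and `00005`). The following is a minimal sketch of how such an index might be inspected; it assumes the index JSON has a top-level `weight_map` key, uses a hand-copied subset of the entries above rather than the real file, and the helper name `shards_by_layer` is hypothetical.

```python
import json
from collections import defaultdict

# A small excerpt of the entries listed above, inlined so the sketch runs
# without the real index file. The "weight_map" wrapper key is an assumption.
sample_index = json.loads("""
{
  "weight_map": {
    "model.layers.36.input_layernorm.weight": "pytorch_model-00005-of-00005.bin",
    "model.layers.36.mlp.gate_proj.weight": "pytorch_model-00004-of-00005.bin",
    "model.layers.37.self_attn.q_proj.weight": "pytorch_model-00005-of-00005.bin",
    "model.norm.weight": "pytorch_model-00005-of-00005.bin"
  }
}
""")


def shards_by_layer(weight_map):
    """Group shard file names by layer prefix (hypothetical helper)."""
    layers = defaultdict(set)
    for tensor_name, shard in weight_map.items():
        parts = tensor_name.split(".")
        if len(parts) > 2 and parts[1] == "layers":
            # "model.layers.<n>.<...>" -> "model.layers.<n>"
            prefix = ".".join(parts[:3])
        else:
            # Non-layer tensors such as "model.norm.weight" -> "model.norm"
            prefix = ".".join(parts[:2])
        layers[prefix].add(shard)
    return dict(layers)


grouped = shards_by_layer(sample_index["weight_map"])

# Layers whose tensors are spread over more than one shard file.
split_layers = sorted(name for name, shards in grouped.items() if len(shards) > 1)
print(split_layers)
```

Grouping by prefix makes it easy to see which shard files must be present before a given layer can be fully materialized.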

24
special_tokens_map.json Normal file

@@ -0,0 +1,24 @@
{
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"pad_token": "</s>",
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}

BIN
tokenizer.model (Stored with Git LFS) Normal file

Binary file not shown.

34
tokenizer_config.json Normal file

@@ -0,0 +1,34 @@
{
"add_bos_token": true,
"add_eos_token": false,
"bos_token": {
"__type": "AddedToken",
"content": "<s>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"clean_up_tokenization_spaces": false,
"eos_token": {
"__type": "AddedToken",
"content": "</s>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"legacy": false,
"model_max_length": 1000000000000000019884624838656,
"pad_token": null,
"sp_model_kwargs": {},
"tokenizer_class": "LlamaTokenizer",
"unk_token": {
"__type": "AddedToken",
"content": "<unk>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}
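
Two details in the tokenizer files above are easy to miss: `model_max_length` is the integer value of `1e30` rather than a real context limit, and `pad_token` is `null` here while `special_tokens_map.json` sets it to `</s>`. The sketch below shows one way a caller might handle both when reading the files by hand. It uses only the standard library, inlines a subset of the two files, and the helper names and the fallback length of 4096 are assumptions for illustration, not values taken from this repository.

```python
import json

# Threshold above which model_max_length is treated as "not configured".
# Treating int(1e30) this way is an assumption of this sketch.
UNSET_MAX_LENGTH = int(1e30)

# Subsets of the two files shown above, inlined so the sketch is runnable.
tokenizer_config = json.loads("""
{
  "add_bos_token": true,
  "add_eos_token": false,
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": null
}
""")
special_tokens_map = json.loads('{"pad_token": "</s>"}')


def effective_max_length(config, fallback):
    """Return the configured length, or `fallback` if it looks unset (hypothetical helper)."""
    value = config.get("model_max_length")
    if value is None or value >= UNSET_MAX_LENGTH:
        return fallback
    return value


def resolve_pad_token(config, special_tokens):
    """Prefer the tokenizer config, else fall back to the special tokens map (hypothetical helper)."""
    return config.get("pad_token") or special_tokens.get("pad_token")


# 4096 is a placeholder chosen for this example only.
max_length = effective_max_length(tokenizer_config, fallback=4096)
pad_token = resolve_pad_token(tokenizer_config, special_tokens_map)
print(max_length, pad_token)
```

How a given loading library actually reconciles the two files may differ, so the fallback order shown here should be checked against that library's behavior.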