Initialize repository; model provided by the ModelHub XC community

Model: ibivibiv/bubo-bubo-13b
Source: Original Platform
Commit 399acdfd21 by ModelHub XC, 2026-04-11 12:20:59 +08:00
21 changed files with 94166 additions and 0 deletions

.gitattributes (vendored, new file, 36 lines)

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
bubo-bubo.png filter=lfs diff=lfs merge=lfs -text

README.md (new file, 230 lines)

@@ -0,0 +1,230 @@
---
license: llama2
language:
- en
tags:
- summary
---
# Bubo Bubo 13B
![img](./bubo-bubo.png)
# Prompting
## Prompt Template (Alpaca style)
```
### Instruction:
<prompt> (without the <>)
### Response:
```
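For programmatic use, the template above can be wrapped in a small helper (the function name is illustrative, not part of the model's API):

```python
def make_alpaca_prompt(instruction: str) -> str:
    """Format an instruction using the Alpaca-style template this model expects."""
    return f"### Instruction:\n{instruction}\n### Response:\n"
```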
## Sample Code
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
torch.set_default_device("cuda")
model = AutoModelForCausalLM.from_pretrained("ibivibiv/bubo-bubo-13b", torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("ibivibiv/bubo-bubo-13b")
inputs = tokenizer("### Instruction:\nSummarize this email chain: <email chain stuff here>\n### Response:\n", return_tensors="pt", return_attention_mask=False)
outputs = model.generate(**inputs, max_length=200)
text = tokenizer.batch_decode(outputs)[0]
print(text)
```
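`generate` returns the prompt together with the completion, so the summary itself still has to be split out of `text`. A minimal sketch, assuming the completion follows the final `### Response:` marker (the helper name is illustrative):

```python
def extract_response(generated_text: str, marker: str = "### Response:") -> str:
    """Return only the text after the last response marker, stripped of
    surrounding whitespace and any trailing end-of-sequence token.
    If the marker is absent, the full text is returned unchanged (stripped)."""
    _, _, response = generated_text.rpartition(marker)
    return response.replace("</s>", "").strip()
```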
# Model Details
* **Trained by**: [ibivibiv](https://huggingface.co/ibivibiv)
* **Library**: [HuggingFace Transformers](https://github.com/huggingface/transformers)
* **Model type:** **bubo-bubo-13b** is an auto-regressive language model fine-tuned on the Llama 2 transformer architecture.
* **Language(s)**: English
* **Purpose**: Fine-tuned for summarization tasks, targeted specifically at summarizing communication chains such as email threads.
# Benchmark Scores
I ran the benchmark harness out of curiosity, but this model is geared entirely toward summarization.
| Test Name | Accuracy |
|------------------------------------------------------|----------------------|
| all | 0.579149139810157 |
| arc:challenge | 0.5631399317406144 |
| hellaswag | 0.6317466640111532 |
| hendrycksTest-abstract_algebra | 0.32 |
| hendrycksTest-anatomy | 0.5481481481481482 |
| hendrycksTest-astronomy | 0.5657894736842105 |
| hendrycksTest-business_ethics | 0.55 |
| hendrycksTest-clinical_knowledge | 0.6 |
| hendrycksTest-college_biology | 0.6388888888888888 |
| hendrycksTest-college_chemistry | 0.38 |
| hendrycksTest-college_computer_science | 0.43 |
| hendrycksTest-college_mathematics | 0.34 |
| hendrycksTest-college_medicine | 0.5260115606936416 |
| hendrycksTest-college_physics | 0.3431372549019608 |
| hendrycksTest-computer_security | 0.71 |
| hendrycksTest-conceptual_physics | 0.49361702127659574 |
| hendrycksTest-econometrics | 0.35964912280701755 |
| hendrycksTest-electrical_engineering | 0.5586206896551724 |
| hendrycksTest-elementary_mathematics | 0.3439153439153439 |
| hendrycksTest-formal_logic | 0.3333333333333333 |
| hendrycksTest-global_facts | 0.42 |
| hendrycksTest-high_school_biology | 0.6903225806451613 |
| hendrycksTest-high_school_chemistry | 0.45320197044334976 |
| hendrycksTest-high_school_computer_science | 0.58 |
| hendrycksTest-high_school_european_history | 0.6787878787878788 |
| hendrycksTest-high_school_geography | 0.7424242424242424 |
| hendrycksTest-high_school_government_and_politics | 0.8341968911917098 |
| hendrycksTest-high_school_macroeconomics | 0.558974358974359 |
| hendrycksTest-high_school_mathematics | 0.3 |
| hendrycksTest-high_school_microeconomics | 0.5672268907563025 |
| hendrycksTest-high_school_physics | 0.33112582781456956 |
| hendrycksTest-high_school_psychology | 0.7577981651376147 |
| hendrycksTest-high_school_statistics | 0.4212962962962963 |
| hendrycksTest-high_school_us_history | 0.8186274509803921 |
| hendrycksTest-high_school_world_history | 0.759493670886076 |
| hendrycksTest-human_aging | 0.6547085201793722 |
| hendrycksTest-human_sexuality | 0.6412213740458015 |
| hendrycksTest-international_law | 0.6776859504132231 |
| hendrycksTest-jurisprudence | 0.75 |
| hendrycksTest-logical_fallacies | 0.6993865030674846 |
| hendrycksTest-machine_learning | 0.41964285714285715 |
| hendrycksTest-management | 0.7281553398058253 |
| hendrycksTest-marketing | 0.8504273504273504 |
| hendrycksTest-medical_genetics | 0.6 |
| hendrycksTest-miscellaneous | 0.7624521072796935 |
| hendrycksTest-moral_disputes | 0.6560693641618497 |
| hendrycksTest-moral_scenarios | 0.4346368715083799 |
| hendrycksTest-nutrition | 0.673202614379085 |
| hendrycksTest-philosophy | 0.7009646302250804 |
| hendrycksTest-prehistory | 0.7067901234567902 |
| hendrycksTest-professional_accounting | 0.4645390070921986 |
| hendrycksTest-professional_law | 0.45697522816166886 |
| hendrycksTest-professional_medicine | 0.5514705882352942 |
| hendrycksTest-professional_psychology | 0.6013071895424836 |
| hendrycksTest-public_relations | 0.6636363636363637 |
| hendrycksTest-security_studies | 0.6448979591836734 |
| hendrycksTest-sociology | 0.7611940298507462 |
| hendrycksTest-us_foreign_policy | 0.84 |
| hendrycksTest-virology | 0.4819277108433735 |
| hendrycksTest-world_religions | 0.7894736842105263 |
| truthfulqa:mc | 0.4762440289139372 |
| winogrande | 0.7616416732438832 |
| gsm8k | 0.20621683093252463 |
## Citations
```
@misc{open-llm-leaderboard,
author = {Edward Beeching and Clémentine Fourrier and Nathan Habib and Sheon Han and Nathan Lambert and Nazneen Rajani and Omar Sanseviero and Lewis Tunstall and Thomas Wolf},
title = {Open LLM Leaderboard},
year = {2023},
publisher = {Hugging Face},
howpublished = "\url{https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard}"
}
```
```
@software{eval-harness,
author = {Gao, Leo and
Tow, Jonathan and
Biderman, Stella and
Black, Sid and
DiPofi, Anthony and
Foster, Charles and
Golding, Laurence and
Hsu, Jeffrey and
McDonell, Kyle and
Muennighoff, Niklas and
Phang, Jason and
Reynolds, Laria and
Tang, Eric and
Thite, Anish and
Wang, Ben and
Wang, Kevin and
Zou, Andy},
title = {A framework for few-shot language model evaluation},
month = sep,
year = 2021,
publisher = {Zenodo},
version = {v0.0.1},
doi = {10.5281/zenodo.5371628},
url = {https://doi.org/10.5281/zenodo.5371628}
}
```
```
@misc{clark2018think,
title={Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge},
author={Peter Clark and Isaac Cowhey and Oren Etzioni and Tushar Khot and Ashish Sabharwal and Carissa Schoenick and Oyvind Tafjord},
year={2018},
eprint={1803.05457},
archivePrefix={arXiv},
primaryClass={cs.AI}
}
```
```
@misc{zellers2019hellaswag,
title={HellaSwag: Can a Machine Really Finish Your Sentence?},
author={Rowan Zellers and Ari Holtzman and Yonatan Bisk and Ali Farhadi and Yejin Choi},
year={2019},
eprint={1905.07830},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
```
@misc{hendrycks2021measuring,
title={Measuring Massive Multitask Language Understanding},
author={Dan Hendrycks and Collin Burns and Steven Basart and Andy Zou and Mantas Mazeika and Dawn Song and Jacob Steinhardt},
year={2021},
eprint={2009.03300},
archivePrefix={arXiv},
primaryClass={cs.CY}
}
```
```
@misc{lin2022truthfulqa,
title={TruthfulQA: Measuring How Models Mimic Human Falsehoods},
author={Stephanie Lin and Jacob Hilton and Owain Evans},
year={2022},
eprint={2109.07958},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
```
@misc{DBLP:journals/corr/abs-1907-10641,
title={{WINOGRANDE:} An Adversarial Winograd Schema Challenge at Scale},
author={Keisuke Sakaguchi and Ronan Le Bras and Chandra Bhagavatula and Yejin Choi},
year={2019},
eprint={1907.10641},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
```
@misc{DBLP:journals/corr/abs-2110-14168,
title={Training Verifiers to Solve Math Word Problems},
author={Karl Cobbe and
Vineet Kosaraju and
Mohammad Bavarian and
Mark Chen and
Heewoo Jun and
Lukasz Kaiser and
Matthias Plappert and
Jerry Tworek and
Jacob Hilton and
Reiichiro Nakano and
Christopher Hesse and
John Schulman},
year={2021},
eprint={2110.14168},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```

bubo-bubo.png (new file, LFS pointer, 3 lines)

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:dac081d5e0a29fbacce29aa6074dab48b2872af34b23225fd6efbb5fcc0d8ba9
size 274974
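Large binaries like `bubo-bubo.png` are stored as Git LFS pointer files in the three-line format above. A small sketch that parses a pointer and checks a downloaded blob against it (the field names follow the LFS v1 pointer spec; the helper functions themselves are illustrative):

```python
import hashlib

def parse_lfs_pointer(text: str) -> dict:
    """Parse a git-lfs v1 pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

def blob_matches_pointer(blob: bytes, pointer: dict) -> bool:
    """Check a raw blob's size and sha256 digest against an LFS pointer."""
    algo, _, expected = pointer["oid"].partition(":")
    assert algo == "sha256", "LFS v1 pointers use sha256 oids"
    return (len(blob) == int(pointer["size"])
            and hashlib.sha256(blob).hexdigest() == expected)
```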

config.json (new file, 29 lines)

@@ -0,0 +1,29 @@
{
"_name_or_path": "./summary_owl",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 5120,
"initializer_range": 0.02,
"intermediate_size": 13824,
"max_position_embeddings": 4096,
"model_type": "llama",
"num_attention_heads": 40,
"num_hidden_layers": 40,
"num_key_value_heads": 40,
"pad_token_id": 0,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.36.2",
"use_cache": true,
"vocab_size": 32000
}
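The config above fully determines the parameter count. A quick sanity check, assuming only the standard Llama layer layout (four full-size attention projections, a three-matrix gated MLP, two RMSNorms per layer, and an untied output head):

```python
# Dimensions taken from config.json above.
vocab_size, hidden, intermediate, layers = 32000, 5120, 13824, 40

embed = vocab_size * hidden      # input embeddings
lm_head = vocab_size * hidden    # output head (tie_word_embeddings is false)
attn = 4 * hidden * hidden       # q, k, v, o projections (40 kv heads = full MHA)
mlp = 3 * hidden * intermediate  # gate, up, down projections
norms = 2 * hidden               # two RMSNorms per layer
per_layer = attn + mlp + norms

total = embed + lm_head + layers * per_layer + hidden  # + final RMSNorm
print(total)  # 13015864320 parameters, i.e. ~13B
# In float32 (4 bytes each) this is exactly the 52,063,457,280 bytes
# reported as metadata.total_size in the safetensors shard index below.
```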

generation_config.json (new file, 7 lines)

@@ -0,0 +1,7 @@
{
"_from_model_config": true,
"bos_token_id": 1,
"eos_token_id": 2,
"pad_token_id": 0,
"transformers_version": "4.36.2"
}


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:24f246dd5bc53d7a3aa28b8d78cd2eae75fed2f77bae1bd921cc2b85fdc6adb0
size 4881247856


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f2341766671cabce9f9a22c626a34fd3ef1b1b62a0aa472b5024e40af3da7dcd
size 4970418112


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d7a36ead2c0f07cd9194c73b81abade184a7074cdf896868841a2ab1d22805ad
size 4970418120


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:36684acc8e6c38e1d9f9fc9ad0ba74a8caa83b9a4a375697f3c97c4d38ea7502
size 4970418144


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0d124acdbe0dcea7d23563d1ae5bed5201e4b386393fcdc95119c2044da42a80
size 4970418144


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b1f46734a10286510d264c225ff78e068f0fcb493e5ce1aef3621aa79415b0a4
size 4792119040


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c087e73d91034664d4ee46c7dd3a42e2f61e0e447669c60fd028ed34c3fd040d
size 4792160232


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c6a4b847e525c2a60f28acf308237c3be3e0f56fb4753c0635b33ad09c8691ee
size 4792160224


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:dcc4ba64124b74bc867d66c0882f07b3b17eb1dc94c9840e2e2d2e1d1bd14844
size 4970418144


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6ee3ede55f0a21dab300024df0fe73a1d8ded056e0b7f1b42f3979db0d70cbd8
size 4970418144


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:27a838d5ea58ffb9ae573fc954223c12946038184991cc6d1a04d822442cce52
size 2983303184


@@ -0,0 +1,370 @@
{
"metadata": {
"total_size": 52063457280
},
"weight_map": {
"lm_head.weight": "model-00011-of-00011.safetensors",
"model.embed_tokens.weight": "model-00001-of-00011.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.input_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.10.input_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.11.input_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.input_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.input_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.input_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00004-of-00011.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.15.input_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00004-of-00011.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.input_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.input_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.18.input_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00005-of-00011.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00005-of-00011.safetensors",
"model.layers.19.input_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.2.input_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00001-of-00011.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.20.input_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.input_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00006-of-00011.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.22.input_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00006-of-00011.safetensors",
"model.layers.23.input_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.input_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.input_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00007-of-00011.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.26.input_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00007-of-00011.safetensors",
"model.layers.27.input_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.input_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.input_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00008-of-00011.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.3.input_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00011.safetensors",
"model.layers.30.input_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00008-of-00011.safetensors",
"model.layers.31.input_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.input_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.32.mlp.down_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.mlp.gate_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.mlp.up_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.post_attention_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.32.self_attn.k_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.self_attn.o_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.self_attn.q_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.32.self_attn.v_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.input_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.33.mlp.down_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.mlp.gate_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.mlp.up_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.post_attention_layernorm.weight": "model-00009-of-00011.safetensors",
"model.layers.33.self_attn.k_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.self_attn.o_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.self_attn.q_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.33.self_attn.v_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.34.input_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.34.mlp.down_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.34.mlp.gate_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.34.mlp.up_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.34.post_attention_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.34.self_attn.k_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.34.self_attn.o_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.34.self_attn.q_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.34.self_attn.v_proj.weight": "model-00009-of-00011.safetensors",
"model.layers.35.input_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.35.mlp.down_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.mlp.gate_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.mlp.up_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.post_attention_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.35.self_attn.k_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.self_attn.o_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.self_attn.q_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.35.self_attn.v_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.input_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.36.mlp.down_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.mlp.gate_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.mlp.up_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.post_attention_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.36.self_attn.k_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.self_attn.o_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.self_attn.q_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.36.self_attn.v_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.input_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.37.mlp.down_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.mlp.gate_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.mlp.up_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.post_attention_layernorm.weight": "model-00010-of-00011.safetensors",
"model.layers.37.self_attn.k_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.self_attn.o_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.self_attn.q_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.37.self_attn.v_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.38.input_layernorm.weight": "model-00011-of-00011.safetensors",
"model.layers.38.mlp.down_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.38.mlp.gate_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.38.mlp.up_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.38.post_attention_layernorm.weight": "model-00011-of-00011.safetensors",
"model.layers.38.self_attn.k_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.38.self_attn.o_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.38.self_attn.q_proj.weight": "model-00010-of-00011.safetensors",
"model.layers.38.self_attn.v_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.input_layernorm.weight": "model-00011-of-00011.safetensors",
"model.layers.39.mlp.down_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.mlp.gate_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.mlp.up_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.post_attention_layernorm.weight": "model-00011-of-00011.safetensors",
"model.layers.39.self_attn.k_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.self_attn.o_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.self_attn.q_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.39.self_attn.v_proj.weight": "model-00011-of-00011.safetensors",
"model.layers.4.input_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.input_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.input_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00002-of-00011.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.7.input_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00002-of-00011.safetensors",
"model.layers.8.input_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.input_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00003-of-00011.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00003-of-00011.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00003-of-00011.safetensors",
"model.norm.weight": "model-00011-of-00011.safetensors"
}
}
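The `weight_map` in the index above tells a loader which of the 11 shard files holds each tensor; note that a single layer's tensors can straddle a shard boundary (e.g. layer 34's q/k/v projections sit in shard 9 while the rest of the layer is in shard 10). A minimal sketch, in plain Python over a small excerpt of the map, of how a loader groups tensors by shard so each file only needs to be opened once:

```python
import json
from collections import defaultdict

# Excerpt of the "weight_map" from model.safetensors.index.json above;
# a real loader would read the full index file from disk.
index_json = """
{
  "weight_map": {
    "model.layers.39.self_attn.q_proj.weight": "model-00011-of-00011.safetensors",
    "model.layers.39.self_attn.v_proj.weight": "model-00011-of-00011.safetensors",
    "model.layers.4.input_layernorm.weight": "model-00002-of-00011.safetensors",
    "model.norm.weight": "model-00011-of-00011.safetensors"
  }
}
"""

def tensors_per_shard(index):
    """Group tensor names by the shard file that stores them."""
    shards = defaultdict(list)
    for name, shard in index["weight_map"].items():
        shards[shard].append(name)
    return dict(shards)

shards = tensors_per_shard(json.loads(index_json))
for shard, names in sorted(shards.items()):
    print(shard, len(names))
```

This is the same grouping `transformers` performs internally when loading a sharded checkpoint: iterate shard files, and for each one load only the tensors the index assigns to it.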

23
special_tokens_map.json Normal file

@@ -0,0 +1,23 @@
{
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}
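The map above declares the standard Llama-2 special tokens (`<s>`, `</s>`, `<unk>`) in AddedToken form, where the `lstrip`/`rstrip`/`normalized` flags control whitespace stripping and normalization around each token. A small sketch of how a consumer might extract the literal token strings, allowing for the fact that entries in this file may be either a bare string or a dict:

```python
import json

# The special_tokens_map.json content shown above (JSON uses lowercase
# false, so it is parsed rather than written as a Python literal).
special_tokens_map = json.loads("""
{
  "bos_token": {"content": "<s>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "eos_token": {"content": "</s>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "unk_token": {"content": "<unk>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false}
}
""")

def token_string(entry):
    """Entries may be a bare string or an AddedToken-style dict."""
    return entry if isinstance(entry, str) else entry["content"]

bos = token_string(special_tokens_map["bos_token"])
eos = token_string(special_tokens_map["eos_token"])
```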

93391
tokenizer.json Normal file

File diff suppressed because it is too large

3
tokenizer.model Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
size 499723

41
tokenizer_config.json Normal file

@@ -0,0 +1,41 @@
{
"add_bos_token": true,
"add_eos_token": false,
"added_tokens_decoder": {
"0": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"bos_token": "<s>",
"clean_up_tokenization_spaces": false,
"eos_token": "</s>",
"legacy": false,
"model_max_length": 1000000000000000019884624838656,
"pad_token": null,
"padding_side": "right",
"sp_model_kwargs": {},
"tokenizer_class": "LlamaTokenizer",
"unk_token": "<unk>",
"use_default_system_prompt": true
}
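One value in this config deserves a note: `model_max_length` is `1000000000000000019884624838656`, which is simply `int(1e30)` (the nearest double to 10^30, converted to int). This is the sentinel `transformers` writes when no maximum length was configured, not a real context size, so callers should truncate explicitly. A hedged sketch; the 4096 fallback is an assumption based on the Llama-2 base architecture, not something this config states:

```python
# transformers' "no limit" sentinel, as written into tokenizer_config.json.
VERY_LARGE_INTEGER = int(1e30)  # == 1000000000000000019884624838656

def effective_max_length(config_value, fallback=4096):
    """Treat the sentinel as 'unset' and fall back to an explicit context
    size (4096 here is an assumed Llama-2 default, not from this repo)."""
    return fallback if config_value >= VERY_LARGE_INTEGER else config_value

print(effective_max_length(1000000000000000019884624838656))
print(effective_max_length(2048))
```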