Initialize project; model provided by the ModelHub XC community
Model: rombodawg/LosslessMegaCoder-llama2-7b-mini Source: Original Platform
35
.gitattributes
vendored
Normal file
@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
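As a rough illustration of what the `.gitattributes` rules above select, the sketch below approximates the pattern matching with Python's `fnmatch`. It is a simplification under stated assumptions: real gitattributes matching follows gitignore-style rules that `fnmatch` does not fully reproduce, and `is_lfs_tracked` and the shortened `LFS_PATTERNS` list are hypothetical names used only for this example.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# A handful of the patterns from the .gitattributes file (not the full list).
LFS_PATTERNS = ["*.bin", "*.safetensors", "*.model", "saved_model/**/*", "*tfevents*"]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: would `path` be routed through the LFS filter?

    Patterns without a slash are compared against the file name only;
    patterns with a slash are compared against the whole relative path.
    This mimics, but does not fully reproduce, Git's own matching.
    """
    name = PurePosixPath(path).name
    for pattern in patterns:
        target = path if "/" in pattern else name
        if fnmatch(target, pattern):
            return True
    return False


for candidate in ["pytorch_model-00001-of-00008.bin", "tokenizer.model", "config.json"]:
    print(candidate, is_lfs_tracked(candidate))
```

Under this approximation the weight shards and `tokenizer.model` are stored through Git LFS, while small text files such as `config.json` are committed directly.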
107
README.md
Normal file
@@ -0,0 +1,107 @@
---
license: llama2
datasets:
- rombodawg/LosslessMegaCodeTrainingV2_1m_Evol_Uncensored
---
___________________________
- Please note this model was not trained on the rombodawg/LosslessMegaCodeTrainingV3_MINI dataset, despite the name similarity. You can find the training data at the bottom of the model card, labeled (megacode2-min100).
___________________________

This is one of the first models trained on the LosslessMegaCodeTrainingV2_1m_Evol_Uncensored dataset. The version of the dataset used for this model was filtered by removing any data with fewer than 100 tokens, but plans for much more refined filtering are in the works.

- This model was made as a collaboration between me and andreaskoepf, who is an affiliate of Open Assistant.

This model is extremely good at coding. It might be one of the best coding models for its size, and much better than other 7B-parameter models. Plans for bigger models are coming in the future.

### Prompt template

[ChatML](https://github.com/openai/openai-python/blob/main/chatml.md) format is used:
"<|im_start|>system\n{system message}<|im_end|>\n<|im_start|>user\n{user prompt}<|im_end|>\n<|im_start|>assistant\n{Assistant answer}<|im_end|>\n"

Multi-line:

```
<|im_start|>system
{system message}<|im_end|>
<|im_start|>user
{user prompt}<|im_end|>
<|im_start|>assistant
{Assistant answer}<|im_end|>
```

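The ChatML layout above can be assembled programmatically. The following is a minimal sketch, not code from the model authors: `build_chatml_prompt` is a hypothetical helper name, and it assumes the common inference-time convention of leaving the assistant turn open so the model writes the answer and its closing `<|im_end|>` itself.

```python
def build_chatml_prompt(system_message: str, user_prompt: str) -> str:
    """Render one system turn and one user turn in ChatML.

    The assistant turn is opened but not closed, so generation continues
    from there (an assumption about how the prompt is used at inference).
    """
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt(
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.",
    "Write a Python function that reverses a string.",
)
print(prompt)
```
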
Gpt4all template:

- System prompt
```
<|im_start|>system
"Below is an instruction that describes a task. Write a response that appropriately completes the request."
```
- Prompt template
```
<|im_end|>
<|im_start|>user
"%1"<|im_end|>
<|im_start|>assistant
```

Oobabooga Text-Generation-Webui template:
- user:
```
<|im_start|>user
{User string}<|im_end|>
```
- bot:
```
<|im_start|>assistant
{Bot string}<|im_end|>
```
- turn_template:
```
<|user|>\n<|user-message|>\n\n<|bot|>\n<|bot-message|>\n\n
```
- context:
```
<|im_start|>system
Below is an instruction that describes a task. Write a response that appropriately completes the request.<|im_end|>
```

Current quantizations available:

- https://huggingface.co/TheBloke/LosslessMegaCoder-Llama2-7B-Mini-GPTQ

Benchmarks for the model can be found at the link below, where the model is listed as (andreaskoepf/llama2-7b-megacode2_min100):

- https://tju01.github.io/FastEval-OpenAssistant/

Sampling report:

https://open-assistant.github.io/oasst-model-eval/?f=https%3A%2F%2Fraw.githubusercontent.com%2FOpen-Assistant%2Foasst-model-eval%2Fmain%2Fsampling_reports%2Foasst-pretrained%2F2023-08-12_andreaskoepf_llama2-7b-megacode2_min100_sampling_noprefix2.json

Training information:

- https://wandb.ai/open-assistant/public-sft/runs/run17_megacode_min100

The link for the full dataset is below:

- https://huggingface.co/datasets/rombodawg/LosslessMegaCodeTrainingV2_1m_Evol_Uncensored

The link for the filtered dataset used to make this model is below:

- https://huggingface.co/datasets/andreaskoepf/megacode2-min100

The original posting for this model was uploaded at the link below:

- https://huggingface.co/andreaskoepf/llama2-7b-megacode2_min100
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_rombodawg__LosslessMegaCoder-llama2-7b-mini)

| Metric              | Value |
|---------------------|-------|
| Avg.                | 45.33 |
| ARC (25-shot)       | 53.5  |
| HellaSwag (10-shot) | 77.38 |
| MMLU (5-shot)       | 49.72 |
| TruthfulQA (0-shot) | 45.77 |
| Winogrande (5-shot) | 74.03 |
| GSM8K (5-shot)      | 9.55  |
| DROP (3-shot)       | 7.34  |
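
The `Avg.` row appears to be the unweighted mean of the seven benchmark scores in the leaderboard table. A quick check, assuming that is how the leaderboard computes it (the `scores` dictionary is only an illustration built from the table values):

```python
# Scores copied from the leaderboard table; "Avg." is deliberately left out.
scores = {
    "ARC (25-shot)": 53.5,
    "HellaSwag (10-shot)": 77.38,
    "MMLU (5-shot)": 49.72,
    "TruthfulQA (0-shot)": 45.77,
    "Winogrande (5-shot)": 74.03,
    "GSM8K (5-shot)": 9.55,
    "DROP (3-shot)": 7.34,
}

# Unweighted mean over all benchmarks, rounded as in the table.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 45.33
```
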
25
config.json
Normal file
@@ -0,0 +1,25 @@
{
  "architectures": [
    "LlamaForCausalLM"
  ],
  "bos_token_id": 1,
  "eos_token_id": 32006,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 11008,
  "max_position_embeddings": 4096,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 32,
  "pad_token_id": 0,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.31.0",
  "use_cache": true,
  "vocab_size": 32007
}
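The values in `config.json` are enough to estimate the model's size. The sketch below is a back-of-the-envelope calculation, not something taken from the repository: it assumes the standard Llama layout (four attention projections, three MLP projections and two norm vectors per layer, plus a final norm), 2 bytes per bfloat16 weight, and one float32 rotary `inv_freq` buffer per layer. Under those assumptions the byte count equals the `total_size` of 13476954112 recorded in `pytorch_model.bin.index.json` in this commit.

```python
# Values copied from config.json.
hidden_size = 4096
intermediate_size = 11008
num_hidden_layers = 32
num_attention_heads = 32
vocab_size = 32007

# Per decoder layer: q/k/v/o projections, gate/up/down MLP projections,
# and two RMSNorm weight vectors (assumed standard Llama layout).
attention = 4 * hidden_size * hidden_size
mlp = 3 * hidden_size * intermediate_size
norms = 2 * hidden_size
per_layer = attention + mlp + norms

# Input embeddings and lm_head are counted separately because
# tie_word_embeddings is false; the final norm adds one more vector.
embeddings = 2 * vocab_size * hidden_size
final_norm = hidden_size

total_params = num_hidden_layers * per_layer + embeddings + final_norm

# Assumption: 2 bytes per weight (bfloat16), plus one inv_freq buffer per
# layer holding head_dim / 2 float32 values (4 bytes each).
head_dim = hidden_size // num_attention_heads
inv_freq_bytes = num_hidden_layers * (head_dim // 2) * 4
total_bytes = total_params * 2 + inv_freq_bytes

print(total_params)  # 6738472960
print(total_bytes)   # 13476954112
```
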
6
generation_config.json
Normal file
@@ -0,0 +1,6 @@
{
  "bos_token_id": 1,
  "eos_token_id": 32006,
  "pad_token_id": 0,
  "transformers_version": "4.31.0"
}
13
huggingface-metadata.txt
Normal file
@@ -0,0 +1,13 @@
url: https://huggingface.co/andreaskoepf/llama2-7b-megacode2_min100
branch: main
download date: 2023-08-12 22:23:38
sha256sum:
fe688706d5cdbd6379124c171c7c6f3c84c3763e505182d4d5cb1efe3ed7a1b8 pytorch_model-00001-of-00008.bin
4d2011ed5d037437d553c4e1560a0fc684d4414286a52eebbf5e4807e43016ca pytorch_model-00002-of-00008.bin
f5799d91309b7392105e1f7afbdc84e079448fb4ec0dc2c07f60e6fed6ca5e54 pytorch_model-00003-of-00008.bin
223d9ddf59b6974a22bbcf08a948a9b8a639604a141c220a1ace47eabb25df28 pytorch_model-00004-of-00008.bin
5dddfdc359df7a3bdf6f5c8cbd2bd5e81b203ae62da2837a3283645a9c39f230 pytorch_model-00005-of-00008.bin
57bff4043363eb7af7f5d4bd2b07d0951c4bedf1ade47290e23d35c27d15cd52 pytorch_model-00006-of-00008.bin
a25d72a4d6068ff81c8d3bda8f8dbf5362aa76da38d803bfee7b7d85fb692aeb pytorch_model-00007-of-00008.bin
721a078f52a1b5680372ef91f75c95490cad5596c3365607de69b2b91dd8f95b pytorch_model-00008-of-00008.bin
9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347 tokenizer.model
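The checksums in `huggingface-metadata.txt` can be used to confirm that downloaded files are intact. The following is a minimal sketch using only the Python standard library; `sha256_of_file`, `verify_checksums` and the two-entry `EXPECTED` dictionary are hypothetical names for illustration, and the files are assumed to sit in the current directory.

```python
import hashlib
from pathlib import Path

# Two of the entries from huggingface-metadata.txt; extend as needed.
EXPECTED = {
    "pytorch_model-00001-of-00008.bin":
        "fe688706d5cdbd6379124c171c7c6f3c84c3763e505182d4d5cb1efe3ed7a1b8",
    "tokenizer.model":
        "9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347",
}


def sha256_of_file(path, chunk_size=1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte shards never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(expected: dict, directory=".") -> dict:
    """Map each file name to True if its hash matches; missing files map to False."""
    results = {}
    for name, wanted in expected.items():
        path = Path(directory) / name
        results[name] = path.is_file() and sha256_of_file(path) == wanted
    return results


if __name__ == "__main__":
    for name, ok in verify_checksums(EXPECTED).items():
        print(f"{name}: {'OK' if ok else 'MISMATCH or missing'}")
```
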
3
pytorch_model-00001-of-00008.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fe688706d5cdbd6379124c171c7c6f3c84c3763e505182d4d5cb1efe3ed7a1b8
size 1914838174
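Each `.bin` entry in this commit is a Git LFS pointer rather than the weights themselves: three `key value` lines giving the spec version, the object hash and the size in bytes. The `oid` of this first shard is the same digest listed for it in `huggingface-metadata.txt`. A minimal parsing sketch follows; `parse_lfs_pointer` is a hypothetical helper, and it assumes exactly the three keys shown here.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    # The oid has the form '<algorithm>:<hex digest>'.
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:fe688706d5cdbd6379124c171c7c6f3c84c3763e505182d4d5cb1efe3ed7a1b8
size 1914838174
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algorithm"], info["size"])  # sha256 1914838174
```
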
3
pytorch_model-00002-of-00008.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4d2011ed5d037437d553c4e1560a0fc684d4414286a52eebbf5e4807e43016ca
size 1900102374
3
pytorch_model-00003-of-00008.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f5799d91309b7392105e1f7afbdc84e079448fb4ec0dc2c07f60e6fed6ca5e54
size 1843496394
3
pytorch_model-00004-of-00008.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:223d9ddf59b6974a22bbcf08a948a9b8a639604a141c220a1ace47eabb25df28
size 1923187238
3
pytorch_model-00005-of-00008.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5dddfdc359df7a3bdf6f5c8cbd2bd5e81b203ae62da2837a3283645a9c39f230
size 1900102438
3
pytorch_model-00006-of-00008.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:57bff4043363eb7af7f5d4bd2b07d0951c4bedf1ade47290e23d35c27d15cd52
size 1843496394
3
pytorch_model-00007-of-00008.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a25d72a4d6068ff81c8d3bda8f8dbf5362aa76da38d803bfee7b7d85fb692aeb
size 1889640934
3
pytorch_model-00008-of-00008.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:721a078f52a1b5680372ef91f75c95490cad5596c3365607de69b2b91dd8f95b
size 262202757
330
pytorch_model.bin.index.json
Normal file
@@ -0,0 +1,330 @@
{
  "metadata": {
    "total_size": 13476954112
  },
  "weight_map": {
    "lm_head.weight": "pytorch_model-00008-of-00008.bin",
    "model.embed_tokens.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.0.input_layernorm.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.0.mlp.down_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.0.mlp.gate_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.0.mlp.up_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.0.post_attention_layernorm.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.0.self_attn.k_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.0.self_attn.o_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.0.self_attn.q_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.0.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00008.bin",
    "model.layers.0.self_attn.v_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.1.input_layernorm.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.1.mlp.down_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.1.mlp.gate_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.1.mlp.up_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.1.post_attention_layernorm.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.1.self_attn.k_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.1.self_attn.o_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.1.self_attn.q_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.1.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00008.bin",
    "model.layers.1.self_attn.v_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.10.input_layernorm.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.10.mlp.down_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.10.mlp.gate_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.10.mlp.up_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.10.post_attention_layernorm.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.10.self_attn.k_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.10.self_attn.o_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.10.self_attn.q_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.10.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00008.bin",
    "model.layers.10.self_attn.v_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.11.input_layernorm.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.11.mlp.down_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.11.mlp.gate_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.11.mlp.up_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.11.post_attention_layernorm.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.11.self_attn.k_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.11.self_attn.o_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.11.self_attn.q_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.11.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00008.bin",
    "model.layers.11.self_attn.v_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.12.input_layernorm.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.12.mlp.down_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.12.mlp.gate_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.12.mlp.up_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.12.post_attention_layernorm.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.12.self_attn.k_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.12.self_attn.o_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.12.self_attn.q_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.12.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00008.bin",
    "model.layers.12.self_attn.v_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.13.input_layernorm.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.13.mlp.down_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.13.mlp.gate_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.13.mlp.up_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.13.post_attention_layernorm.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.13.self_attn.k_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.13.self_attn.o_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.13.self_attn.q_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.13.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00008.bin",
    "model.layers.13.self_attn.v_proj.weight": "pytorch_model-00003-of-00008.bin",
    "model.layers.14.input_layernorm.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.14.mlp.down_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.14.mlp.gate_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.14.mlp.up_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.14.post_attention_layernorm.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.14.self_attn.k_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.14.self_attn.o_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.14.self_attn.q_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.14.self_attn.rotary_emb.inv_freq": "pytorch_model-00004-of-00008.bin",
    "model.layers.14.self_attn.v_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.15.input_layernorm.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.15.mlp.down_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.15.mlp.gate_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.15.mlp.up_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.15.post_attention_layernorm.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.15.self_attn.k_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.15.self_attn.o_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.15.self_attn.q_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.15.self_attn.rotary_emb.inv_freq": "pytorch_model-00004-of-00008.bin",
    "model.layers.15.self_attn.v_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.16.input_layernorm.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.16.mlp.down_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.16.mlp.gate_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.16.mlp.up_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.16.post_attention_layernorm.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.16.self_attn.k_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.16.self_attn.o_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.16.self_attn.q_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.16.self_attn.rotary_emb.inv_freq": "pytorch_model-00004-of-00008.bin",
    "model.layers.16.self_attn.v_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.17.input_layernorm.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.17.mlp.down_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.17.mlp.gate_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.17.mlp.up_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.17.post_attention_layernorm.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.17.self_attn.k_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.17.self_attn.o_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.17.self_attn.q_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.17.self_attn.rotary_emb.inv_freq": "pytorch_model-00004-of-00008.bin",
    "model.layers.17.self_attn.v_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.18.input_layernorm.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.18.mlp.down_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.18.mlp.gate_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.18.mlp.up_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.18.post_attention_layernorm.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.18.self_attn.k_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.18.self_attn.o_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.18.self_attn.q_proj.weight": "pytorch_model-00004-of-00008.bin",
    "model.layers.18.self_attn.rotary_emb.inv_freq": "pytorch_model-00005-of-00008.bin",
    "model.layers.18.self_attn.v_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.19.input_layernorm.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.19.mlp.down_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.19.mlp.gate_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.19.mlp.up_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.19.post_attention_layernorm.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.19.self_attn.k_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.19.self_attn.o_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.19.self_attn.q_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.19.self_attn.rotary_emb.inv_freq": "pytorch_model-00005-of-00008.bin",
    "model.layers.19.self_attn.v_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.2.input_layernorm.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.2.mlp.down_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.2.mlp.gate_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.2.mlp.up_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.2.post_attention_layernorm.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.2.self_attn.k_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.2.self_attn.o_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.2.self_attn.q_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.2.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00008.bin",
    "model.layers.2.self_attn.v_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.20.input_layernorm.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.20.mlp.down_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.20.mlp.gate_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.20.mlp.up_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.20.post_attention_layernorm.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.20.self_attn.k_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.20.self_attn.o_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.20.self_attn.q_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.20.self_attn.rotary_emb.inv_freq": "pytorch_model-00005-of-00008.bin",
    "model.layers.20.self_attn.v_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.21.input_layernorm.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.21.mlp.down_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.21.mlp.gate_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.21.mlp.up_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.21.post_attention_layernorm.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.21.self_attn.k_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.21.self_attn.o_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.21.self_attn.q_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.21.self_attn.rotary_emb.inv_freq": "pytorch_model-00005-of-00008.bin",
    "model.layers.21.self_attn.v_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.22.input_layernorm.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.22.mlp.down_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.22.mlp.gate_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.22.mlp.up_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.22.post_attention_layernorm.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.22.self_attn.k_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.22.self_attn.o_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.22.self_attn.q_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.22.self_attn.rotary_emb.inv_freq": "pytorch_model-00005-of-00008.bin",
    "model.layers.22.self_attn.v_proj.weight": "pytorch_model-00005-of-00008.bin",
    "model.layers.23.input_layernorm.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.23.mlp.down_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.23.mlp.gate_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.23.mlp.up_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.23.post_attention_layernorm.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.23.self_attn.k_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.23.self_attn.o_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.23.self_attn.q_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.23.self_attn.rotary_emb.inv_freq": "pytorch_model-00006-of-00008.bin",
    "model.layers.23.self_attn.v_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.24.input_layernorm.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.24.mlp.down_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.24.mlp.gate_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.24.mlp.up_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.24.post_attention_layernorm.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.24.self_attn.k_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.24.self_attn.o_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.24.self_attn.q_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.24.self_attn.rotary_emb.inv_freq": "pytorch_model-00006-of-00008.bin",
    "model.layers.24.self_attn.v_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.25.input_layernorm.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.25.mlp.down_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.25.mlp.gate_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.25.mlp.up_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.25.post_attention_layernorm.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.25.self_attn.k_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.25.self_attn.o_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.25.self_attn.q_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.25.self_attn.rotary_emb.inv_freq": "pytorch_model-00006-of-00008.bin",
    "model.layers.25.self_attn.v_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.26.input_layernorm.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.26.mlp.down_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.26.mlp.gate_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.26.mlp.up_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.26.post_attention_layernorm.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.26.self_attn.k_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.26.self_attn.o_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.26.self_attn.q_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.26.self_attn.rotary_emb.inv_freq": "pytorch_model-00006-of-00008.bin",
    "model.layers.26.self_attn.v_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.27.input_layernorm.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.27.mlp.down_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.27.mlp.gate_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.27.mlp.up_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.27.post_attention_layernorm.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.27.self_attn.k_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.27.self_attn.o_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.27.self_attn.q_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.27.self_attn.rotary_emb.inv_freq": "pytorch_model-00006-of-00008.bin",
    "model.layers.27.self_attn.v_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.layers.28.input_layernorm.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.28.mlp.down_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.28.mlp.gate_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.28.mlp.up_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.28.post_attention_layernorm.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.28.self_attn.k_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.28.self_attn.o_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.28.self_attn.q_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.28.self_attn.rotary_emb.inv_freq": "pytorch_model-00007-of-00008.bin",
    "model.layers.28.self_attn.v_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.29.input_layernorm.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.29.mlp.down_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.29.mlp.gate_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.29.mlp.up_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.29.post_attention_layernorm.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.29.self_attn.k_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.29.self_attn.o_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.29.self_attn.q_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.29.self_attn.rotary_emb.inv_freq": "pytorch_model-00007-of-00008.bin",
    "model.layers.29.self_attn.v_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.3.input_layernorm.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.3.mlp.down_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.3.mlp.gate_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.3.mlp.up_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.3.post_attention_layernorm.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.3.self_attn.k_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.3.self_attn.o_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.3.self_attn.q_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.3.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00008.bin",
    "model.layers.3.self_attn.v_proj.weight": "pytorch_model-00001-of-00008.bin",
    "model.layers.30.input_layernorm.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.30.mlp.down_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.30.mlp.gate_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.30.mlp.up_proj.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.30.post_attention_layernorm.weight": "pytorch_model-00007-of-00008.bin",
|
||||||
|
"model.layers.30.self_attn.k_proj.weight": "pytorch_model-00007-of-00008.bin",
|
||||||
|
"model.layers.30.self_attn.o_proj.weight": "pytorch_model-00007-of-00008.bin",
|
||||||
|
"model.layers.30.self_attn.q_proj.weight": "pytorch_model-00007-of-00008.bin",
|
||||||
|
"model.layers.30.self_attn.rotary_emb.inv_freq": "pytorch_model-00007-of-00008.bin",
|
||||||
|
"model.layers.30.self_attn.v_proj.weight": "pytorch_model-00007-of-00008.bin",
|
||||||
|
"model.layers.31.input_layernorm.weight": "pytorch_model-00007-of-00008.bin",
|
||||||
|
"model.layers.31.mlp.down_proj.weight": "pytorch_model-00007-of-00008.bin",
|
||||||
|
"model.layers.31.mlp.gate_proj.weight": "pytorch_model-00007-of-00008.bin",
|
||||||
|
"model.layers.31.mlp.up_proj.weight": "pytorch_model-00007-of-00008.bin",
|
||||||
|
"model.layers.31.post_attention_layernorm.weight": "pytorch_model-00007-of-00008.bin",
|
||||||
|
"model.layers.31.self_attn.k_proj.weight": "pytorch_model-00007-of-00008.bin",
|
||||||
|
"model.layers.31.self_attn.o_proj.weight": "pytorch_model-00007-of-00008.bin",
|
||||||
|
"model.layers.31.self_attn.q_proj.weight": "pytorch_model-00007-of-00008.bin",
|
||||||
|
"model.layers.31.self_attn.rotary_emb.inv_freq": "pytorch_model-00007-of-00008.bin",
|
||||||
|
"model.layers.31.self_attn.v_proj.weight": "pytorch_model-00007-of-00008.bin",
|
||||||
|
"model.layers.4.input_layernorm.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.4.mlp.down_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.4.mlp.gate_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.4.mlp.up_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.4.post_attention_layernorm.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.4.self_attn.k_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.4.self_attn.o_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.4.self_attn.q_proj.weight": "pytorch_model-00001-of-00008.bin",
|
||||||
|
"model.layers.4.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.4.self_attn.v_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.5.input_layernorm.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.5.mlp.down_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.5.mlp.gate_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.5.mlp.up_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.5.post_attention_layernorm.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.5.self_attn.k_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.5.self_attn.o_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.5.self_attn.q_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.5.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.5.self_attn.v_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.6.input_layernorm.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.6.mlp.down_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.6.mlp.gate_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.6.mlp.up_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.6.post_attention_layernorm.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.6.self_attn.k_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.6.self_attn.o_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.6.self_attn.q_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.6.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.6.self_attn.v_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.7.input_layernorm.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.7.mlp.down_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.7.mlp.gate_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.7.mlp.up_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.7.post_attention_layernorm.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.7.self_attn.k_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.7.self_attn.o_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.7.self_attn.q_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.7.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.7.self_attn.v_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.8.input_layernorm.weight": "pytorch_model-00003-of-00008.bin",
|
||||||
|
"model.layers.8.mlp.down_proj.weight": "pytorch_model-00003-of-00008.bin",
|
||||||
|
"model.layers.8.mlp.gate_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.8.mlp.up_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.8.post_attention_layernorm.weight": "pytorch_model-00003-of-00008.bin",
|
||||||
|
"model.layers.8.self_attn.k_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.8.self_attn.o_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.8.self_attn.q_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.8.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.8.self_attn.v_proj.weight": "pytorch_model-00002-of-00008.bin",
|
||||||
|
"model.layers.9.input_layernorm.weight": "pytorch_model-00003-of-00008.bin",
|
||||||
|
"model.layers.9.mlp.down_proj.weight": "pytorch_model-00003-of-00008.bin",
|
||||||
|
"model.layers.9.mlp.gate_proj.weight": "pytorch_model-00003-of-00008.bin",
|
||||||
|
"model.layers.9.mlp.up_proj.weight": "pytorch_model-00003-of-00008.bin",
|
||||||
|
"model.layers.9.post_attention_layernorm.weight": "pytorch_model-00003-of-00008.bin",
|
||||||
|
"model.layers.9.self_attn.k_proj.weight": "pytorch_model-00003-of-00008.bin",
|
||||||
|
"model.layers.9.self_attn.o_proj.weight": "pytorch_model-00003-of-00008.bin",
|
||||||
|
"model.layers.9.self_attn.q_proj.weight": "pytorch_model-00003-of-00008.bin",
|
||||||
|
"model.layers.9.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00008.bin",
|
||||||
|
"model.layers.9.self_attn.v_proj.weight": "pytorch_model-00003-of-00008.bin",
|
||||||
|
"model.norm.weight": "pytorch_model-00007-of-00008.bin"
|
||||||
|
}
|
||||||
|
}
|
||||||
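The index above maps each tensor name to the shard file that stores it, and a single layer can span two shards (layer 27 has tensors in both shard 6 and shard 7). As a minimal sketch, assuming the index is available as a JSON object with a `weight_map` key as shown, the mapping can be inverted to see which tensors each shard holds. The helper name `tensors_by_shard` and the three-entry sample are illustrative only and are not part of this commit.

```python
import json
from collections import defaultdict


def tensors_by_shard(index: dict) -> dict:
    """Invert weight_map (tensor name -> shard file) into shard file -> sorted tensor names."""
    shards = defaultdict(list)
    for name, shard in index["weight_map"].items():
        shards[shard].append(name)
    return {shard: sorted(names) for shard, names in sorted(shards.items())}


# Three entries copied from the index above; a real run would json.load the full index file.
sample_index = json.loads("""
{
  "weight_map": {
    "model.layers.27.input_layernorm.weight": "pytorch_model-00007-of-00008.bin",
    "model.layers.27.self_attn.k_proj.weight": "pytorch_model-00006-of-00008.bin",
    "model.norm.weight": "pytorch_model-00007-of-00008.bin"
  }
}
""")

grouped = tensors_by_shard(sample_index)
for shard, names in grouped.items():
    print(shard, len(names))
```

Because tensors of one layer are not guaranteed to share a shard, a loader should look up every tensor in the map rather than infer the shard from the layer number.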
93454
tokenizer.json
Normal file
File diff suppressed because it is too large
3
tokenizer.model
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
size 499723
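The three lines above are a Git LFS pointer, not the tokenizer model itself: the repository stores only the spec version, the SHA-256 object ID, and the byte size, while the 499723-byte file lives in LFS storage. A minimal sketch of reading such a pointer follows; the function name `parse_lfs_pointer` and the returned field names are assumptions made for illustration.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid line has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347\n"
    "size 499723\n"
)
print(pointer["hash_algo"], pointer["size"])
```

After fetching the real file, its size and SHA-256 digest can be compared against these values to confirm the download is complete.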
33
tokenizer_config.json
Normal file
@@ -0,0 +1,33 @@
{
  "bos_token": {
    "__type": "AddedToken",
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "clean_up_tokenization_spaces": false,
  "eos_token": {
    "__type": "AddedToken",
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "legacy": false,
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": null,
  "padding_side": "right",
  "sp_model_kwargs": {},
  "tokenizer_class": "LlamaTokenizer",
  "unk_token": {
    "__type": "AddedToken",
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
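Two values in this config deserve care when batching inputs: `pad_token` is `null`, and `model_max_length` equals `int(1e30)`, which in practice means no usable length limit is recorded here. The sketch below checks for both using only the standard library. Falling back to the EOS token for padding is a common workaround and an assumption of this example, not something the commit specifies; the names `NO_LIMIT_SENTINEL` and `fallback_pad` are likewise illustrative.

```python
import json

# Subset of the tokenizer_config.json shown above.
config = json.loads("""
{
  "eos_token": {"content": "</s>"},
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": null,
  "padding_side": "right"
}
""")

# int(1e30) is treated here as "no limit set"; the real context length must come from elsewhere.
NO_LIMIT_SENTINEL = int(1e30)
has_real_limit = config["model_max_length"] < NO_LIMIT_SENTINEL

# With no pad token defined, padded batches need one chosen explicitly.
needs_pad_token = config["pad_token"] is None
# Assumption: reuse the EOS token as the pad token when none is configured.
fallback_pad = config["eos_token"]["content"] if needs_pad_token else config["pad_token"]

print(has_real_limit, needs_pad_token, fallback_pad)
```

If padding is never used (single-sequence generation), the missing pad token has no effect and no fallback is needed.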