Initialize project; model provided by the ModelHub XC community

Model: rombodawg/LosslessMegaCoder-llama2-7b-mini
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-12 08:19:28 +08:00
commit 794a851a5a
17 changed files with 94030 additions and 0 deletions

35
.gitattributes vendored Normal file

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
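
The patterns above decide which files Git LFS stores as pointers instead of regular blobs. A minimal sketch of that decision, assuming simple base-name globbing is close enough for the patterns shown (real gitattributes matching has more rules, for example for `saved_model/**/*`); `LFS_PATTERNS` and `is_lfs_tracked` are hypothetical names and only a subset of the patterns is included:

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# A few of the glob patterns from the .gitattributes above (not the full list).
LFS_PATTERNS = ["*.bin", "*.model", "*.safetensors", "*.pt", "*tfevents*"]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: match the file's base name against simple globs."""
    name = PurePosixPath(path).name
    return any(fnmatch(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    for candidate in ("pytorch_model-00001-of-00008.bin", "tokenizer.model", "config.json"):
        print(candidate, is_lfs_tracked(candidate))
```

This is consistent with the rest of the commit: `tokenizer.model` and the `.bin` shards appear as LFS pointers, while `config.json` is stored as plain text.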

107
README.md Normal file

@@ -0,0 +1,107 @@
---
license: llama2
datasets:
- rombodawg/LosslessMegaCodeTrainingV2_1m_Evol_Uncensored
---
___________________________
- Please note that this model was not trained on the rombodawg/LosslessMegaCodeTrainingV3_MINI dataset, despite the similar name. The training data is linked at the bottom of the model card, labeled megacode2-min100.
___________________________
This is one of the first models trained on the LosslessMegaCodeTrainingV2_1m_Evol_Uncensored dataset. The version of the dataset used for this model was filtered by removing any data with fewer than 100 tokens; much more refined filtering is planned.
- This model was made as a collaboration between me and andreaskoepf, who is an affiliate of Open Assistant.
This model is extremely good at coding, and might be one of the best coding models for its size and much better than other 7B-parameter models. Bigger models are planned for the future.
### Prompt template
The [ChatML](https://github.com/openai/openai-python/blob/main/chatml.md) format is used:
"<|im_start|>system\n{system message}<|im_end|>\n<|im_start|>user\n{user prompt}<|im_end|>\n<|im_start|>assistant\n{Assistant answer}<|im_end|>\n"
multi-line:
```
<|im_start|>system
{system message}<|im_end|>
<|im_start|>user
{user prompt}<|im_end|>
<|im_start|>assistant
{Assistant answer}<|im_end|>
```
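
The template above can be assembled programmatically. The following is a minimal sketch, not an official helper: `build_chatml_prompt` is a hypothetical name, and the prompt is left open after the assistant header so the model writes the answer and the closing `<|im_end|>` itself.

```python
def build_chatml_prompt(system_message: str, user_prompt: str) -> str:
    """Build a single-turn ChatML prompt ending at the assistant header."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    prompt = build_chatml_prompt(
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.",
        "Write a Python function that reverses a string.",
    )
    print(prompt)
```

When generating, stopping on `<|im_end|>` keeps the model from continuing into a new turn.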
GPT4All template:
- System prompt
```
<|im_start|>system
"Below is an instruction that describes a task. Write a response that appropriately completes the request."
```
- Prompt template
```
<|im_end|>
<|im_start|>user
"%1"<|im_end|>
<|im_start|>assistant
```
Oobabooga text-generation-webui template:
- user:
```
<|im_start|>user
{User string}<|im_end|>
```
- bot:
```
<|im_start|>assistant
{Bot string}<|im_end|>
```
- turn_template:
```
<|user|>\n<|user-message|>\n\n<|bot|>\n<|bot-message|>\n\n
```
- context:
```
<|im_start|>system
Below is an instruction that describes a task. Write a response that appropriately completes the request.<|im_end|>
```
Currently available quantizations:
- https://huggingface.co/TheBloke/LosslessMegaCoder-Llama2-7B-Mini-GPTQ
Benchmarks for the model can be found at the link below, where the model is listed as andreaskoepf/llama2-7b-megacode2_min100:
- https://tju01.github.io/FastEval-OpenAssistant/
Sampling report:
https://open-assistant.github.io/oasst-model-eval/?f=https%3A%2F%2Fraw.githubusercontent.com%2FOpen-Assistant%2Foasst-model-eval%2Fmain%2Fsampling_reports%2Foasst-pretrained%2F2023-08-12_andreaskoepf_llama2-7b-megacode2_min100_sampling_noprefix2.json
Training information:
- https://wandb.ai/open-assistant/public-sft/runs/run17_megacode_min100
The link for the full dataset is below:
- https://huggingface.co/datasets/rombodawg/LosslessMegaCodeTrainingV2_1m_Evol_Uncensored
The link for the filtered dataset used to make this model is below:
- https://huggingface.co/datasets/andreaskoepf/megacode2-min100
The original posting for this model was uploaded at the link below:
- https://huggingface.co/andreaskoepf/llama2-7b-megacode2_min100
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_rombodawg__LosslessMegaCoder-llama2-7b-mini)
| Metric | Value |
|-----------------------|---------------------------|
| Avg. | 45.33 |
| ARC (25-shot) | 53.5 |
| HellaSwag (10-shot) | 77.38 |
| MMLU (5-shot) | 49.72 |
| TruthfulQA (0-shot) | 45.77 |
| Winogrande (5-shot) | 74.03 |
| GSM8K (5-shot) | 9.55 |
| DROP (3-shot) | 7.34 |
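
The "Avg." row appears to be the unweighted mean of the seven benchmark scores. A quick check, assuming that is how the leaderboard computed it (`scores` is just the table above retyped):

```python
# Scores copied from the table above; "Avg." is assumed to be their plain mean.
scores = {
    "ARC (25-shot)": 53.5,
    "HellaSwag (10-shot)": 77.38,
    "MMLU (5-shot)": 49.72,
    "TruthfulQA (0-shot)": 45.77,
    "Winogrande (5-shot)": 74.03,
    "GSM8K (5-shot)": 9.55,
    "DROP (3-shot)": 7.34,
}

average = sum(scores.values()) / len(scores)
print(round(average, 2))
```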

25
config.json Normal file

@@ -0,0 +1,25 @@
{
"architectures": [
"LlamaForCausalLM"
],
"bos_token_id": 1,
"eos_token_id": 32006,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 11008,
"max_position_embeddings": 4096,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 32,
"pad_token_id": 0,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.31.0",
"use_cache": true,
"vocab_size": 32007
}

6
generation_config.json Normal file

@@ -0,0 +1,6 @@
{
"bos_token_id": 1,
"eos_token_id": 32006,
"pad_token_id": 0,
"transformers_version": "4.31.0"
}

13
huggingface-metadata.txt Normal file

@@ -0,0 +1,13 @@
url: https://huggingface.co/andreaskoepf/llama2-7b-megacode2_min100
branch: main
download date: 2023-08-12 22:23:38
sha256sum:
fe688706d5cdbd6379124c171c7c6f3c84c3763e505182d4d5cb1efe3ed7a1b8 pytorch_model-00001-of-00008.bin
4d2011ed5d037437d553c4e1560a0fc684d4414286a52eebbf5e4807e43016ca pytorch_model-00002-of-00008.bin
f5799d91309b7392105e1f7afbdc84e079448fb4ec0dc2c07f60e6fed6ca5e54 pytorch_model-00003-of-00008.bin
223d9ddf59b6974a22bbcf08a948a9b8a639604a141c220a1ace47eabb25df28 pytorch_model-00004-of-00008.bin
5dddfdc359df7a3bdf6f5c8cbd2bd5e81b203ae62da2837a3283645a9c39f230 pytorch_model-00005-of-00008.bin
57bff4043363eb7af7f5d4bd2b07d0951c4bedf1ade47290e23d35c27d15cd52 pytorch_model-00006-of-00008.bin
a25d72a4d6068ff81c8d3bda8f8dbf5362aa76da38d803bfee7b7d85fb692aeb pytorch_model-00007-of-00008.bin
721a078f52a1b5680372ef91f75c95490cad5596c3365607de69b2b91dd8f95b pytorch_model-00008-of-00008.bin
9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347 tokenizer.model
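
The checksums above can be used to verify downloaded files. A minimal sketch using only the standard library; the function names are hypothetical, and the listing is assumed to contain one `<sha256 hex> <filename>` pair per line as shown:

```python
import hashlib
from pathlib import Path


def parse_sha256_listing(text: str) -> dict:
    """Map file name -> expected hex digest, skipping non-checksum lines."""
    expected = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and len(parts[0]) == 64:
            digest, name = parts
            expected[name] = digest.lower()
    return expected


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte shards do not fill memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_directory(directory, expected: dict) -> dict:
    """Return name -> True/False for every listed file present in directory."""
    results = {}
    for name, digest in expected.items():
        path = Path(directory) / name
        if path.exists():
            results[name] = sha256_of_file(path) == digest
    return results
```

Calling `verify_directory(".", parse_sha256_listing(Path("huggingface-metadata.txt").read_text()))` in the download directory would then report which shards match.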


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fe688706d5cdbd6379124c171c7c6f3c84c3763e505182d4d5cb1efe3ed7a1b8
size 1914838174


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4d2011ed5d037437d553c4e1560a0fc684d4414286a52eebbf5e4807e43016ca
size 1900102374


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f5799d91309b7392105e1f7afbdc84e079448fb4ec0dc2c07f60e6fed6ca5e54
size 1843496394


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:223d9ddf59b6974a22bbcf08a948a9b8a639604a141c220a1ace47eabb25df28
size 1923187238


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5dddfdc359df7a3bdf6f5c8cbd2bd5e81b203ae62da2837a3283645a9c39f230
size 1900102438


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:57bff4043363eb7af7f5d4bd2b07d0951c4bedf1ade47290e23d35c27d15cd52
size 1843496394


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a25d72a4d6068ff81c8d3bda8f8dbf5362aa76da38d803bfee7b7d85fb692aeb
size 1889640934


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:721a078f52a1b5680372ef91f75c95490cad5596c3365607de69b2b91dd8f95b
size 262202757
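
Each of the eight entries above is a Git LFS pointer: a three-line text stub recording the spec version, the SHA-256 object id, and the byte size of the real file. A minimal parsing sketch (`parse_lfs_pointer` is a hypothetical helper, and only the three keys shown above are handled):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash, and size fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algorithm,
        "oid": digest,
        "size": int(fields["size"]),
    }


# Shard sizes in bytes, copied from the eight pointers above.
SHARD_SIZES = [
    1914838174, 1900102374, 1843496394, 1923187238,
    1900102438, 1843496394, 1889640934, 262202757,
]

if __name__ == "__main__":
    pointer = (
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:fe688706d5cdbd6379124c171c7c6f3c84c3763e505182d4d5cb1efe3ed7a1b8\n"
        "size 1914838174\n"
    )
    print(parse_lfs_pointer(pointer))
    print(sum(SHARD_SIZES) / 1e9, "GB on disk across all shards")
```

The object ids match the checksums in `huggingface-metadata.txt`, so the same digest can be used to verify a shard after `git lfs pull`.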


@@ -0,0 +1,330 @@
{
"metadata": {
"total_size": 13476954112
},
"weight_map": {
"lm_head.weight": "pytorch_model-00008-of-00008.bin",
"model.embed_tokens.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.0.input_layernorm.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.0.mlp.down_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.0.mlp.gate_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.0.mlp.up_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.0.post_attention_layernorm.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.0.self_attn.k_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.0.self_attn.o_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.0.self_attn.q_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.0.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00008.bin",
"model.layers.0.self_attn.v_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.1.input_layernorm.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.1.mlp.down_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.1.mlp.gate_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.1.mlp.up_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.1.post_attention_layernorm.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.1.self_attn.k_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.1.self_attn.o_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.1.self_attn.q_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.1.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00008.bin",
"model.layers.1.self_attn.v_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.10.input_layernorm.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.10.mlp.down_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.10.mlp.gate_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.10.mlp.up_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.10.post_attention_layernorm.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.10.self_attn.k_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.10.self_attn.o_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.10.self_attn.q_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.10.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00008.bin",
"model.layers.10.self_attn.v_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.11.input_layernorm.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.11.mlp.down_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.11.mlp.gate_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.11.mlp.up_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.11.post_attention_layernorm.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.11.self_attn.k_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.11.self_attn.o_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.11.self_attn.q_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.11.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00008.bin",
"model.layers.11.self_attn.v_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.12.input_layernorm.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.12.mlp.down_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.12.mlp.gate_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.12.mlp.up_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.12.post_attention_layernorm.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.12.self_attn.k_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.12.self_attn.o_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.12.self_attn.q_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.12.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00008.bin",
"model.layers.12.self_attn.v_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.13.input_layernorm.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.13.mlp.down_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.13.mlp.gate_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.13.mlp.up_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.13.post_attention_layernorm.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.13.self_attn.k_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.13.self_attn.o_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.13.self_attn.q_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.13.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00008.bin",
"model.layers.13.self_attn.v_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.14.input_layernorm.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.14.mlp.down_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.14.mlp.gate_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.14.mlp.up_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.14.post_attention_layernorm.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.14.self_attn.k_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.14.self_attn.o_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.14.self_attn.q_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.14.self_attn.rotary_emb.inv_freq": "pytorch_model-00004-of-00008.bin",
"model.layers.14.self_attn.v_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.15.input_layernorm.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.15.mlp.down_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.15.mlp.gate_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.15.mlp.up_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.15.post_attention_layernorm.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.15.self_attn.k_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.15.self_attn.o_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.15.self_attn.q_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.15.self_attn.rotary_emb.inv_freq": "pytorch_model-00004-of-00008.bin",
"model.layers.15.self_attn.v_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.16.input_layernorm.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.16.mlp.down_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.16.mlp.gate_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.16.mlp.up_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.16.post_attention_layernorm.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.16.self_attn.k_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.16.self_attn.o_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.16.self_attn.q_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.16.self_attn.rotary_emb.inv_freq": "pytorch_model-00004-of-00008.bin",
"model.layers.16.self_attn.v_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.17.input_layernorm.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.17.mlp.down_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.17.mlp.gate_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.17.mlp.up_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.17.post_attention_layernorm.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.17.self_attn.k_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.17.self_attn.o_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.17.self_attn.q_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.17.self_attn.rotary_emb.inv_freq": "pytorch_model-00004-of-00008.bin",
"model.layers.17.self_attn.v_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.18.input_layernorm.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.18.mlp.down_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.18.mlp.gate_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.18.mlp.up_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.18.post_attention_layernorm.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.18.self_attn.k_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.18.self_attn.o_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.18.self_attn.q_proj.weight": "pytorch_model-00004-of-00008.bin",
"model.layers.18.self_attn.rotary_emb.inv_freq": "pytorch_model-00005-of-00008.bin",
"model.layers.18.self_attn.v_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.19.input_layernorm.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.19.mlp.down_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.19.mlp.gate_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.19.mlp.up_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.19.post_attention_layernorm.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.19.self_attn.k_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.19.self_attn.o_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.19.self_attn.q_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.19.self_attn.rotary_emb.inv_freq": "pytorch_model-00005-of-00008.bin",
"model.layers.19.self_attn.v_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.2.input_layernorm.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.2.mlp.down_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.2.mlp.gate_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.2.mlp.up_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.2.post_attention_layernorm.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.2.self_attn.k_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.2.self_attn.o_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.2.self_attn.q_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.2.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00008.bin",
"model.layers.2.self_attn.v_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.20.input_layernorm.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.20.mlp.down_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.20.mlp.gate_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.20.mlp.up_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.20.post_attention_layernorm.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.20.self_attn.k_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.20.self_attn.o_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.20.self_attn.q_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.20.self_attn.rotary_emb.inv_freq": "pytorch_model-00005-of-00008.bin",
"model.layers.20.self_attn.v_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.21.input_layernorm.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.21.mlp.down_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.21.mlp.gate_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.21.mlp.up_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.21.post_attention_layernorm.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.21.self_attn.k_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.21.self_attn.o_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.21.self_attn.q_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.21.self_attn.rotary_emb.inv_freq": "pytorch_model-00005-of-00008.bin",
"model.layers.21.self_attn.v_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.22.input_layernorm.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.22.mlp.down_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.22.mlp.gate_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.22.mlp.up_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.22.post_attention_layernorm.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.22.self_attn.k_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.22.self_attn.o_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.22.self_attn.q_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.22.self_attn.rotary_emb.inv_freq": "pytorch_model-00005-of-00008.bin",
"model.layers.22.self_attn.v_proj.weight": "pytorch_model-00005-of-00008.bin",
"model.layers.23.input_layernorm.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.23.mlp.down_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.23.mlp.gate_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.23.mlp.up_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.23.post_attention_layernorm.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.23.self_attn.k_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.23.self_attn.o_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.23.self_attn.q_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.23.self_attn.rotary_emb.inv_freq": "pytorch_model-00006-of-00008.bin",
"model.layers.23.self_attn.v_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.24.input_layernorm.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.24.mlp.down_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.24.mlp.gate_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.24.mlp.up_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.24.post_attention_layernorm.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.24.self_attn.k_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.24.self_attn.o_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.24.self_attn.q_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.24.self_attn.rotary_emb.inv_freq": "pytorch_model-00006-of-00008.bin",
"model.layers.24.self_attn.v_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.25.input_layernorm.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.25.mlp.down_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.25.mlp.gate_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.25.mlp.up_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.25.post_attention_layernorm.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.25.self_attn.k_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.25.self_attn.o_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.25.self_attn.q_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.25.self_attn.rotary_emb.inv_freq": "pytorch_model-00006-of-00008.bin",
"model.layers.25.self_attn.v_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.26.input_layernorm.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.26.mlp.down_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.26.mlp.gate_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.26.mlp.up_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.26.post_attention_layernorm.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.26.self_attn.k_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.26.self_attn.o_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.26.self_attn.q_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.26.self_attn.rotary_emb.inv_freq": "pytorch_model-00006-of-00008.bin",
"model.layers.26.self_attn.v_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.27.input_layernorm.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.27.mlp.down_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.27.mlp.gate_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.27.mlp.up_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.27.post_attention_layernorm.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.27.self_attn.k_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.27.self_attn.o_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.27.self_attn.q_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.27.self_attn.rotary_emb.inv_freq": "pytorch_model-00006-of-00008.bin",
"model.layers.27.self_attn.v_proj.weight": "pytorch_model-00006-of-00008.bin",
"model.layers.28.input_layernorm.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.28.mlp.down_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.28.mlp.gate_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.28.mlp.up_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.28.post_attention_layernorm.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.28.self_attn.k_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.28.self_attn.o_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.28.self_attn.q_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.28.self_attn.rotary_emb.inv_freq": "pytorch_model-00007-of-00008.bin",
"model.layers.28.self_attn.v_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.29.input_layernorm.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.29.mlp.down_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.29.mlp.gate_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.29.mlp.up_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.29.post_attention_layernorm.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.29.self_attn.k_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.29.self_attn.o_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.29.self_attn.q_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.29.self_attn.rotary_emb.inv_freq": "pytorch_model-00007-of-00008.bin",
"model.layers.29.self_attn.v_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.3.input_layernorm.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.3.mlp.down_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.3.mlp.gate_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.3.mlp.up_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.3.post_attention_layernorm.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.3.self_attn.k_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.3.self_attn.o_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.3.self_attn.q_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.3.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00008.bin",
"model.layers.3.self_attn.v_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.30.input_layernorm.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.30.mlp.down_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.30.mlp.gate_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.30.mlp.up_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.30.post_attention_layernorm.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.30.self_attn.k_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.30.self_attn.o_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.30.self_attn.q_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.30.self_attn.rotary_emb.inv_freq": "pytorch_model-00007-of-00008.bin",
"model.layers.30.self_attn.v_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.31.input_layernorm.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.31.mlp.down_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.31.mlp.gate_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.31.mlp.up_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.31.post_attention_layernorm.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.31.self_attn.k_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.31.self_attn.o_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.31.self_attn.q_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.31.self_attn.rotary_emb.inv_freq": "pytorch_model-00007-of-00008.bin",
"model.layers.31.self_attn.v_proj.weight": "pytorch_model-00007-of-00008.bin",
"model.layers.4.input_layernorm.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.4.mlp.down_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.4.mlp.gate_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.4.mlp.up_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.4.post_attention_layernorm.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.4.self_attn.k_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.4.self_attn.o_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.4.self_attn.q_proj.weight": "pytorch_model-00001-of-00008.bin",
"model.layers.4.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00008.bin",
"model.layers.4.self_attn.v_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.5.input_layernorm.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.5.mlp.down_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.5.mlp.gate_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.5.mlp.up_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.5.post_attention_layernorm.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.5.self_attn.k_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.5.self_attn.o_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.5.self_attn.q_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.5.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00008.bin",
"model.layers.5.self_attn.v_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.6.input_layernorm.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.6.mlp.down_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.6.mlp.gate_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.6.mlp.up_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.6.post_attention_layernorm.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.6.self_attn.k_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.6.self_attn.o_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.6.self_attn.q_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.6.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00008.bin",
"model.layers.6.self_attn.v_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.7.input_layernorm.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.7.mlp.down_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.7.mlp.gate_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.7.mlp.up_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.7.post_attention_layernorm.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.7.self_attn.k_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.7.self_attn.o_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.7.self_attn.q_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.7.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00008.bin",
"model.layers.7.self_attn.v_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.8.input_layernorm.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.8.mlp.down_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.8.mlp.gate_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.8.mlp.up_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.8.post_attention_layernorm.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.8.self_attn.k_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.8.self_attn.o_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.8.self_attn.q_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.8.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00008.bin",
"model.layers.8.self_attn.v_proj.weight": "pytorch_model-00002-of-00008.bin",
"model.layers.9.input_layernorm.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.9.mlp.down_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.9.mlp.gate_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.9.mlp.up_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.9.post_attention_layernorm.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.9.self_attn.k_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.9.self_attn.o_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.9.self_attn.q_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.layers.9.self_attn.rotary_emb.inv_freq": "pytorch_model-00003-of-00008.bin",
"model.layers.9.self_attn.v_proj.weight": "pytorch_model-00003-of-00008.bin",
"model.norm.weight": "pytorch_model-00007-of-00008.bin"
}
}
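
The `total_size` in the index can be cross-checked against the dimensions in `config.json`. The sketch below rests on several assumptions that the commit does not state outright: weights are stored in bfloat16 (2 bytes each), the Llama layers have no bias terms, and each layer's `rotary_emb.inv_freq` buffer holds `head_dim / 2` float32 values.

```python
# Dimensions copied from config.json in this commit.
VOCAB_SIZE = 32007
HIDDEN_SIZE = 4096
INTERMEDIATE_SIZE = 11008
NUM_LAYERS = 32
NUM_HEADS = 32

# Per-layer weights: q/k/v/o projections, gate/up/down MLP, two RMSNorm vectors.
attention = 4 * HIDDEN_SIZE * HIDDEN_SIZE
mlp = 3 * HIDDEN_SIZE * INTERMEDIATE_SIZE
norms = 2 * HIDDEN_SIZE
per_layer = attention + mlp + norms

# Input embeddings and lm_head are separate because tie_word_embeddings is false.
embeddings = 2 * VOCAB_SIZE * HIDDEN_SIZE
final_norm = HIDDEN_SIZE

param_count = NUM_LAYERS * per_layer + embeddings + final_norm

# Assumption: bfloat16 weights (2 bytes) plus float32 inv_freq buffers (4 bytes).
head_dim = HIDDEN_SIZE // NUM_HEADS
inv_freq_bytes = NUM_LAYERS * (head_dim // 2) * 4
estimated_bytes = param_count * 2 + inv_freq_bytes

print(f"parameters: {param_count:,}")
print(f"estimated total_size: {estimated_bytes:,}")
```

Under these assumptions the estimate comes to about 6.74 billion parameters and can be compared directly with the `total_size` value recorded in the index metadata.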

93454
tokenizer.json Normal file

File diff suppressed because it is too large

3
tokenizer.model Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
size 499723

33
tokenizer_config.json Normal file

@@ -0,0 +1,33 @@
{
"bos_token": {
"__type": "AddedToken",
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"clean_up_tokenization_spaces": false,
"eos_token": {
"__type": "AddedToken",
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"legacy": false,
"model_max_length": 1000000000000000019884624838656,
"pad_token": null,
"padding_side": "right",
"sp_model_kwargs": {},
"tokenizer_class": "LlamaTokenizer",
"unk_token": {
"__type": "AddedToken",
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}
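
The huge `model_max_length` above is not a real context limit. It equals `int(1e30)`, which, as far as I know, is the placeholder the `transformers` library writes when no maximum length was configured. A minimal sketch of picking a usable limit instead, assuming `max_position_embeddings` from `config.json` is the practical bound; `effective_max_length` and its threshold are hypothetical:

```python
# Values copied from tokenizer_config.json and config.json in this commit.
TOKENIZER_MODEL_MAX_LENGTH = 1000000000000000019884624838656
MAX_POSITION_EMBEDDINGS = 4096


def effective_max_length(tokenizer_max: int, model_positions: int,
                         sentinel_threshold: int = 10**12) -> int:
    """Treat absurdly large tokenizer limits as 'unset' and fall back to the model's."""
    if tokenizer_max >= sentinel_threshold:
        return model_positions
    return min(tokenizer_max, model_positions)


if __name__ == "__main__":
    # The float 1e30 is not exactly representable, hence the odd trailing digits.
    print(int(1e30) == TOKENIZER_MODEL_MAX_LENGTH)
    print(effective_max_length(TOKENIZER_MODEL_MAX_LENGTH, MAX_POSITION_EMBEDDINGS))
```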