Initialize project; model provided by the ModelHub XC community

Model: capleaf/T-Llama
Source: Original Platform
ModelHub XC
2026-05-02 23:26:40 +08:00
commit fdb8478915
43 changed files with 133801 additions and 0 deletions

35
.gitattributes vendored Normal file

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

488
README.md Normal file

@@ -0,0 +1,488 @@
---
language:
- vi
- en
license: llama2
model-index:
- name: T-Llama
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 54.18
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=1TuanPham/T-Llama
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 76.48
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=1TuanPham/T-Llama
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 47.98
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=1TuanPham/T-Llama
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 46.47
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=1TuanPham/T-Llama
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 71.27
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=1TuanPham/T-Llama
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 29.64
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=1TuanPham/T-Llama
      name: Open LLM Leaderboard
---
<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/63905e87df447b438817b2cd/QFhLKQlWeyO9XumtyghVo.jpeg" alt="Image" style="width: 400px; height: auto; border-radius: 10px;" />
</p>
## Model Details
- **Developed by:** Tuan Pham (FPTU HCM Student)
  - Contact me at: weekend.2810@gmail.com or tuanpmse160561@fpt.edu.vn
  - Looking for an internship opportunity :D
- **Model type:** Llama2-7B decoder-only
- **Finetuned from model:**
  * meta-llama/Llama-2-7b
  * bkai-foundation-models/vietnamese-llama2-7b-120GB
  * yeen214/llama2_7b_merge_orcafamily
- **Bilingual support:** English and Vietnamese
### Model Description
<!-- Provide a longer summary of what this model is. -->
This model is a proof of effort: it shows that a single person can fine-tune their own model to reach SOTA.
### Model Sources
<!-- Provide the basic links for the model. -->
- **Repository:**
  * Training: https://github.com/vTuanpham/Vietnamese_QA_System
  * Data: https://github.com/vTuanpham/Large_dataset_translator
- **Paper:** ...
- **Demo:** ...
- Here are Kaggle and Colab scripts to quickly test the model:
  * https://www.kaggle.com/code/tuanphamm/t-llama-test
  * https://colab.research.google.com/drive/1Y-f0E6C_gN_Iy72UN3-Y_c5RdPgrPai-?usp=sharing
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
### Prompt template
```
[SYSTEM_PROMPT]
####### Instruction:
[INPUT]
%%%%%%% Response:
[RESPONSE]
```
I recommend keeping the system prompt in English.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
import torch
from torch.cuda.amp import autocast
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline


def prompt_format(system_prompt, instruction):
    # Assemble the prompt using the model's instruction/response template.
    prompt = f"""{system_prompt}
####### Instruction:
{instruction}
%%%%%%% Response:
"""
    return prompt


system_prompt = """
You're an AI Large Language Model developed(created) by an AI developer named Tuấn, the architecture of you is decoder-based LM, your task are to think loudly step by step before give a good and relevant response
to the user request, answer in the language the user preferred.
The AI has been trained to answer questions, provide recommendations, and help with decision making. The AI thinks outside the box and follows the user requests
"""
instruction = "Xin chào"
formatted_prompt = prompt_format(system_prompt, instruction)
print(formatted_prompt)

model_name = "1TuanPham/T-Llama"
# Load the weights in bfloat16 and let device_map place them on the available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    use_cache=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
# Stream tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer, skip_special_tokens=True)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, streamer=streamer)

with autocast():
    # Pad with the id from the model config (0 in config.json), which lies inside the 45452-token vocabulary.
    output_default = pipe(formatted_prompt, pad_token_id=model.config.pad_token_id, max_new_tokens=128)
```
Example outputs:
```
> Xin chào
Xin chào! Tôi là một AI được phát triển bởi một AI nhà phát triển tên là Tuấn. Tôi được thiết kế để giúp đỡ người dùng bằng cách trả lời các câu hỏi, đưa ra đề xuất và hỗ trợ trong quá trình ra quyết định.
Tôi có thể hỗ trợ bạn bằng cách nghĩ ra các câu trả lời hay và phù hợp cho các câu hỏi của bạn.
> Việt Nam có cà phê gì nổi tiếng thế giới ?
Cà phê robusta của Việt Nam nổi tiếng trên toàn cầu. Loại cà phê này được trồng chủ yếu ở các vùng đất cao như Lâm Đồng, Gia Lai, KonTum và có hương vị đậm đà, đắng hơn so với cà phê Arabica.
Cà phê robusta thường được sử dụng để pha chế các loại cà phê espresso và phin truyền thống.
> Tôi có 500$ hãy build cho tôi một bộ PC
Để xây dựng một bộ PC với 500 USD, bạn có thể bắt đầu bằng cách chọn các thành phần chính như CPU, GPU, bộ nhớ và bộ lưu trữ. Sau đó, bạn có thể thêm các thành phần bổ sung như loa, bàn phím và chuột để hoàn thiện hệ thống.
Dưới đây là một số gợi ý về cách bạn có thể cấu thành một bộ PC với 500 USD:
1. CPU: Intel Core i5-12400 hoặc AMD Ryzen 5 5600X là những lựa chọn tốt để bắt đầu.
2. GPU: Nvidia GeForce GTX 1650 hoặc AMD Radeon RX 570 là những lựa chọn tốt để bắt đầu.
3. Bộ nhớ: 8GB RAM là một khởi đầu tốt.
4. Bộ lưu trữ: 500 GB ổ cứng SSD là một khởi đầu tốt.
5. Bàn phím và chuột: Bàn phím và chuột có thể được mua riêng hoặc cùng một bộ.
6. Loa: Một loa có thể được mua riêng hoặc cùng một bộ với bàn phím và chuột.
> Cho tôi 3 thức uống để giúp ngủ ngon hơn
Chắc chắn! Dưới đây là ba thức uống có thể giúp bạn ngủ ngon hơn:
1. Mật ong và chanh: Trộn một thìa mật ong với một thìa nước cốt chanh và uống trước khi đi ngủ. Mật ong có tác dụng làm dịu, trong khi chanh có chứa chất chống oxy hóa có thể giúp cải thiện chất lượng giấc ngủ.
2. Trà thảo dược: Một số loại trà thảo dược như hoa cúc, hoa oải hương và hoa sen có tác dụng thư giãn và giúp bạn chìm vào giấc ngủ. Hãy thử pha một ấm trà thảo dược ấm và uống trước khi đi ngủ.
3. Sữa ấm: Sữa ấm có chứa tryptophan, một loại hormone giúp bạn cảm thấy dễ chịu và thư giãn. Nó cũng có thể giúp thúc đẩy sản xuất melatonin, hormone giúp bạn chìm vào giấc ngủ. Hãy thử uống một cốc sữa ấm trước khi đi ngủ để giúp bạn ngủ ngon hơn.
```
Coding tasks were thoroughly filtered out of the translated data, because translation would also translate code definitions and variable names. In theory, the model should therefore still be able to code normally by drawing on the English coding examples.
That said, I would not recommend running any code generated by this model.
````
> Viết cho tôi một đoạn code bằng python xây dựng một ứng dụng bằng Flask hiển thị 'Xin chào thế giới' trên màn hình
Để viết một đoạn mã Python xây dựng một ứng dụng bằng Flask hiển thị 'Xin chào thế giới' trên màn hình, bạn có thể làm theo các bước sau:
1. Cài đặt Flask và các thư viện cần thiết:
```
pip install flask
```
2. Tạo một ứng dụng Flask mới:
```
app = Flask(__name__)
```
3. Thêm một route để hiển thị 'Xin chào thế giới':
```
@app.route('/')
def hello():
    return 'Xin chào thế giới!'
```
4. Khởi tạo ứng dụng Flask:
```
app.run()
```
5. Chạy ứng dụng Flask:
```
app.run(debug=True)
```
6. Mở trình duyệt và truy cập vào địa chỉ IP hoặc tên máy chủ của máy tính hoặc máy chủ nơi bạn đã khởi động ứng dụng Flask.
````
These examples are sub-optimal as they were generated with greedy decoding to ensure reproducibility.
Here's the recommended configuration:
* top_k=50
* top_p=0.5
* temperature=0.5
* penalty_alpha=0.7
* repetition_penalty=1.15
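The settings above can be collected into keyword arguments for a `transformers` generation call. This is a minimal sketch rather than the author's exact setup: the helper name `recommended_generation_kwargs` is hypothetical, and `do_sample=True` and `max_new_tokens=128` are assumptions, since the list does not say whether sampling was enabled. Note that `temperature` and `top_p` only take effect when sampling is on, and to my knowledge `penalty_alpha` is only used when it is off.

```python
def recommended_generation_kwargs(max_new_tokens=128, do_sample=True):
    """Collect the recommended decoding settings as keyword arguments."""
    return {
        "top_k": 50,
        "top_p": 0.5,
        "temperature": 0.5,
        "penalty_alpha": 0.7,
        "repetition_penalty": 1.15,
        # Assumption: sampling is enabled; the list above does not state this.
        "do_sample": do_sample,
        # Assumption: same token budget as the getting-started example.
        "max_new_tokens": max_new_tokens,
    }


gen_kwargs = recommended_generation_kwargs()
print(gen_kwargs)

# With the `pipe` and `formatted_prompt` objects from the getting-started code:
# output = pipe(formatted_prompt, **gen_kwargs)
```

Keeping the settings in one dictionary makes it easy to override a single value, for example `temperature`, without retyping the rest.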
Here is an example using the above generation config:
````
> Tôi buồn ngủ quá, cơ mà thứ 7 phải thi rồi, nay là rạng sáng thứ 6. Tôi có nên đi ngủ không trong khi tôi mới xong chưa được 1/3 bài để ôn ?
Nếu bạn cảm thấy mệt mỏi và cần nghỉ ngơi thì việc đi ngủ sớm sẽ rất quan trọng đối với sức khỏe của bạn. Tuy nhiên, nếu bạn vẫn muốn hoàn thành một phần bài tập để chuẩn bị cho kỳ thi sắp tới,
hãy cân nhắc thời gian còn lại của mình và ưu tiên những nhiệm vụ quan trọng nhất trước khi đi ngủ. Nếu bạn cảm thấy rằng mình đã làm đủ, hãy dành chút thời gian thư giãn hoặc thực hiện các
hoạt động khác để giúp giảm bớt căng thẳng và tăng cường sự tỉnh táo của bạn vào buổi sáng hôm sau. Cuối cùng, quyết định cuối cùng phụ thuộc vào mức độ nghiêm trọng của kỳ thi và khả năng quản
lý thời gian của bạn.
# temperature=0.7
> Em ăn cơm chưa ?
Không, tôi không ăn cơm. Tôi là một mô hình ngôn ngữ máy được phát triển bởi một nhà phát triển AI tên là Tuấn và được đào tạo để trả lời các câu hỏi và cung cấp phản hồi hữu ích cho người dùng.
Mục tiêu của tôi là hỗ trợ người dùng bằng cách suy nghĩ trước khi đưa ra những phản hồi phù hợp và chính xác nhất có thể.
> Viết cho tôi một ứng dụng flask in ra màn hình 'Xin chào Việt Nam!'
Để viết ứng dụng Flask hiển thị "Xin chào Việt Nam!" trên màn hình, chúng ta có thể làm theo các bước sau:
1. Cài đặt Flask và các thư viện cần thiết như BeautifulSoup4 và jsonify. Bạn có thể sử dụng pip để cài đặt chúng:
```
pip install Flask
pip install beautifulsoup4
pip install jsonify
```
2. Tạo một tệp mới trong thư mục của bạn với tên `app.py`. Đây là mã mẫu cơ bản của ứng dụng:
```python
from bs4 import BeautifulSoup
from jsonify import jsonify
import requests
app = Flask(__name__)
@app.route('/')
def home():
    html_string = """<h1>Chào mừng đến với Việt Nam!</h1>
<p>Đây là một trang web dành cho những ai muốn tìm hiểu về đất nước Việt Nam xinh đẹp.</p>"""
    soup = BeautifulSoup(html_string, 'html.parser')
    print(soup)
    return soup
if __name__ == '__main__':
    app.run()
```
3. Khởi tạo ứng dụng bằng cách chạy `app.py`:
```
python app.py
```
4. Mở trình duyệt web và truy cập vào địa chỉ IP hoặc tên máy chủ của ứng dụng (ví dụ: http://localhost:5000). Trang đầu sẽ hiển thị thông báo "Xin chào Việt Nam!".
Lưu ý rằng đây chỉ là ví dụ đơn giản và bạn có thể tùy chỉnh ứng dụng để đáp ứng nhu cầu cụ thể của mình.
> Liệt kê cho tôi 5 thứ cần thiết để đem đi cắm trại
Chắc chắn! Dưới đây là năm vật dụng cần thiết khi đi cắm trại:
1. Lều - Một chiếc lều chắc chắn và thoải mái sẽ cung cấp nơi trú ẩn khỏi thời tiết và côn trùng.
2. Túi ngủ - Một túi ngủ ấm áp và thoải mái sẽ giúp bạn có giấc ngủ ngon trong khi ở ngoài trời.
3. Bếp di động - Một bếp di động như bình đun hoặc bếp cồn sẽ giúp bạn nấu ăn và giữ ấm thức ăn của mình.
4. Đèn pin - Một đèn pin mạnh mẽ sẽ giúp chiếu sáng khu vực cắm trại của bạn vào ban đêm.
5. Bản đồ và la bàn - Bản đồ và la bàn sẽ giúp bạn định hướng chính xác đến địa điểm cắm trại của bạn và quay lại nhà nếu cần.
> Viết cho tôi một bài thơ về thành phố Đà Lạt
Thành phố sương mù, nơi thiên nhiên thanh bình,
Nơi những bông hoa khoe sắc và cây xanh tươi tốt,
Một ốc đảo giữa lòng đất, một xứ sở thần tiên,
Nơi mọi lo lắng đều tan biến trong không khí.
Những con đường nhỏ đan xen qua những ngọn đồi thoai thoải,
Và những ngôi nhà cổ kính, với những bức tường màu nâu ấm áp,
Nơi những người dân thân thiện chào đón bạn bằng nụ cười,
Và tiếng chim hót líu lo như lời chào của thiên nhiên.
Những dòng suối chảy êm đềm, những thác nước đổ xuống,
Và những hồ nước lấp lánh, những cảnh tượng ngoạn mục,
Nơi sự tĩnh lặng ngự trị, nơi bình yên ngự trị,
Nơi tâm hồn được chữa lành và cảm xúc được giải phóng.
Ở đây, thời gian dường như đứng yên,
Nơi những kỷ niệm vẫn còn nguyên vẹn,
Nơi những mối quan hệ bền chặt được nuôi dưỡng,
Nơi tình yêu và hòa bình chiếm vị trí hàng đầu.
Thế giới bên ngoài có vẻ xa xôi,
Nhưng ở đây, chỉ cần bước chân đi,
Nơi bầu trời rộng lớn, những đám mây cao chót vót,
Và những tia nắng vàng rực rỡ, sưởi ấm trái tim chúng ta.
Ôi, thành phố sương mù, nơi thiên nhiên thanh bình,
Nơi những giấc mơ trở thành hiện thực, nơi trái tim tràn ngập niềm vui,
Nơi tình yêu và hòa bình ngự trị, nơi tình yêu và hòa bình ngự trị.
````
Note: the 120GB of Vietnamese pre-training data might not be enough for the model to answer general questions about Vietnamese events.
## Training Details
**Hardware Type:**
* GPU: VGA NVIDIA Tesla P100 16GB
* SYSTEM RAM: 32GB
**Hours used:** ~1,140 hours (about 47.5 days)
### Training Data
* BactrianX
* OpenOrca_translated
* WizardLM_70k_translated
* TigerLabMathInstruct_translated_vi
* GradeSchoolMathInstruct_translated
* vilm_lima-vi
* MTEngVietnamese
* databricks_dolly15k_translated
* AlpacaCleaned_translated
* databricks_dolly15k
* OpenOrca
* GradeSchoolMathInstruct
* AlpacaCleaned
* WebglmQA
### Training Procedure
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
* Learning rate: 2e-5 with a cosine schedule
* Optimizer: PagedLion8bit
* QLoRA: rank 64, 4-bit quantization
* Batch size: 2
* Gradient accumulation: 128
* Effective batch size: 256
* 250k examples (70% Vietnamese, 30% English) for 3.37 epochs
* 350k examples (60% Vietnamese, 40% English) for 1.4 epochs
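The batch-size figures above fit together as simple arithmetic, sketched below. The variable and function names are hypothetical. The step counts are rough estimates that assume a single GPU, exactly 250,000 and 350,000 examples, and one optimizer update per effective batch.

```python
per_device_batch_size = 2
gradient_accumulation_steps = 128

# One optimizer update happens after this many examples have been processed.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps


def optimizer_steps(num_examples, epochs, batch_size=effective_batch_size):
    """Approximate number of optimizer updates for one training phase."""
    return round(num_examples * epochs / batch_size)


phase_1_steps = optimizer_steps(250_000, 3.37)  # 70% Vietnamese / 30% English phase
phase_2_steps = optimizer_steps(350_000, 1.4)   # 60% Vietnamese / 40% English phase
print(effective_batch_size, phase_1_steps, phase_2_steps)
```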
### Training loss
Last 10 runs:
![image/png](https://cdn-uploads.huggingface.co/production/uploads/63905e87df447b438817b2cd/rV8Go_YFZv7QcR_FhFxp-.png)
Each line is 12 hours
All run plots:
![image/png](https://cdn-uploads.huggingface.co/production/uploads/63905e87df447b438817b2cd/bxdq_0Mu8seBJCJNmpoEb.png)
A bad way to visualize, I know...
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
![image/png](https://cdn-uploads.huggingface.co/production/uploads/63905e87df447b438817b2cd/z1ZTm7Tab4tQbVPgQW1hU.png)
The model currently sits in the top 5 on the VMLU benchmark.
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_1TuanPham__T-Llama)
| Metric |Value|
|---------------------------------|----:|
|Avg. |54.34|
|AI2 Reasoning Challenge (25-Shot)|54.18|
|HellaSwag (10-Shot) |76.48|
|MMLU (5-Shot) |47.98|
|TruthfulQA (0-shot) |46.47|
|Winogrande (5-shot) |71.27|
|GSM8k (5-shot) |29.64|
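
The `Avg.` row appears to be the unweighted mean of the six benchmark scores. A small check, using only the values from the table (the `scores` dictionary is just an illustrative container):

```python
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 54.18,
    "HellaSwag (10-Shot)": 76.48,
    "MMLU (5-Shot)": 47.98,
    "TruthfulQA (0-shot)": 46.47,
    "Winogrande (5-shot)": 71.27,
    "GSM8k (5-shot)": 29.64,
}

# Unweighted mean across all six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")
```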
## Citation
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
```bibtex
@online{t-llama,
author = {Pham Minh Tuan},
title = {T-Llama: A New Language Model for Vietnamese}, % Placeholder
year = 2024,
url = {https://github.com/vTuanpham/Vietnamese_QA_System}
}
```

5
added_tokens.json Normal file

@@ -0,0 +1,5 @@
{
"####### Instruction:": 45450,
"%%%%%%% Response:": 45451,
"[PAD]": 45449
}

29
config.json Normal file

@@ -0,0 +1,29 @@
{
"_name_or_path": "1TuanPham/InstructEnVi_llama2-bkai-120GB-Orcafamily_250kx3.37",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 11008,
"max_position_embeddings": 2048,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 32,
"pad_token_id": 0,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.33.1",
"use_cache": true,
"vocab_size": 45452
}
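
The values in this config determine the parameter count, which can be cross-checked against the `total_size` reported in the safetensors index. The sketch below assumes the standard Llama layer layout (bias-free q/k/v/o projections, gate/up/down MLP projections, two RMSNorm weights per layer, and one final norm), which matches the weight names in the index's `weight_map`. The function name is hypothetical.

```python
config = {
    "hidden_size": 4096,
    "intermediate_size": 11008,
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "num_key_value_heads": 32,
    "vocab_size": 45452,
    "tie_word_embeddings": False,
}


def llama_parameter_count(cfg):
    """Count weights for a bias-free Llama-style decoder."""
    h = cfg["hidden_size"]
    head_dim = h // cfg["num_attention_heads"]
    kv_dim = head_dim * cfg["num_key_value_heads"]
    attention = 2 * h * h + 2 * h * kv_dim   # q_proj and o_proj, then k_proj and v_proj
    mlp = 3 * h * cfg["intermediate_size"]   # gate_proj, up_proj, down_proj
    norms = 2 * h                            # input and post-attention layernorm
    per_layer = attention + mlp + norms
    embeddings = cfg["vocab_size"] * h
    lm_head = 0 if cfg["tie_word_embeddings"] else cfg["vocab_size"] * h
    final_norm = h
    return cfg["num_hidden_layers"] * per_layer + embeddings + lm_head + final_norm


params = llama_parameter_count(config)
bytes_bf16 = params * 2  # bfloat16 stores each weight in 2 bytes
print(params, bytes_bf16)
```

Under these assumptions the byte count equals the index's `total_size` of 13697228800.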

7
generation_config.json Normal file

@@ -0,0 +1,7 @@
{
"_from_model_config": true,
"bos_token_id": 1,
"eos_token_id": 2,
"pad_token_id": 0,
"transformers_version": "4.33.1"
}


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:77cf61dfe8d7cc5ba7e45e2735158e66eb27e288441eb872a9c4165feef908cc
size 877774304


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:83e183571fa0d59c89d37380e615b26bef0580c9860433e6764261e580be0e45
size 843090056


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3dee539e982c1fe23878a9efa26cc4dc04730476991508128276be86e4d58e31
size 899713160


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2b3cf005f17798ccf911ee9a4470b9c17a8fd5e9872592a282c1f0338f0b3ef6
size 899713160


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:285356391aa683fe3d432927b79a5b665850c224c0e8e0e41e2b5251531607ad
size 899729768


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5d26de9bc26fc907f9f59763885fce1ff27b3a0b698f047727061782b593e7d5
size 876644624


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c0e2297de222273d845cd4bd80f5fc8b48bb3ad3b1cb07001a48e2acc4a5311d
size 876644624


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f8c314c40383cf307a0cb32c8083111d32899f654d7d379c72c9f21d994b9a0f
size 899713176


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8ac2533e4f261297f0f4a843174e6d7163021b076921f1897726da73939b077f
size 899713176


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8b8f7cc46f4a0f5d6182c8fd7f9006c718b7623a7904d6103fb5cbd2904b6809
size 899729792


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fb2c0435e6c06a660b561708086f2abc0d282d44dbf0b090b0effca92d091ee0
size 876644624


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b4c9f59a5806f6b3841abb0adce8708ab5c8fdd8a68bc1f2a11346da268123a1
size 876644624


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e57abc99cd940dd911748fd5f6f1ab5a944e93682a9641824e6a1afd553bae3f
size 899713176


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bde07f5d9ab19c769c3ef22831373557798f9914d08b7f4c1f5f292948558263
size 899713176


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:dab70189a88988c5876653e5241c653a1294fc9ce16349c5748e6e68779ee003
size 899738072


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9cddadf45d718b2f24f865ced4dc2bf91fc857dadb07db74b672a909b9571d34
size 372342912


@@ -0,0 +1,298 @@
{
"metadata": {
"total_size": 13697228800
},
"weight_map": {
"lm_head.weight": "model-00016-of-00016.safetensors",
"model.embed_tokens.weight": "model-00001-of-00016.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00016.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00016.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00016.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00016.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00016.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00016.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00016.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00016.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00016.safetensors",
"model.layers.1.input_layernorm.weight": "model-00002-of-00016.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00002-of-00016.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00002-of-00016.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00002-of-00016.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00002-of-00016.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00016.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00002-of-00016.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00016.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00016.safetensors",
"model.layers.10.input_layernorm.weight": "model-00006-of-00016.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00006-of-00016.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00006-of-00016.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00006-of-00016.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00006-of-00016.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00006-of-00016.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00006-of-00016.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00006-of-00016.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00006-of-00016.safetensors",
"model.layers.11.input_layernorm.weight": "model-00006-of-00016.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00006-of-00016.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00006-of-00016.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00006-of-00016.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00006-of-00016.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00006-of-00016.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00006-of-00016.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00006-of-00016.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00006-of-00016.safetensors",
"model.layers.12.input_layernorm.weight": "model-00007-of-00016.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00007-of-00016.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00007-of-00016.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00007-of-00016.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00007-of-00016.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00006-of-00016.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00007-of-00016.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00006-of-00016.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00007-of-00016.safetensors",
"model.layers.13.input_layernorm.weight": "model-00007-of-00016.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00007-of-00016.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00007-of-00016.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00007-of-00016.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00007-of-00016.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00007-of-00016.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00007-of-00016.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00007-of-00016.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00007-of-00016.safetensors",
"model.layers.14.input_layernorm.weight": "model-00008-of-00016.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00008-of-00016.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00008-of-00016.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00008-of-00016.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00008-of-00016.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00007-of-00016.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00007-of-00016.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00007-of-00016.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00007-of-00016.safetensors",
"model.layers.15.input_layernorm.weight": "model-00008-of-00016.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00008-of-00016.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00008-of-00016.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00008-of-00016.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00008-of-00016.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00008-of-00016.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00008-of-00016.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00008-of-00016.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00008-of-00016.safetensors",
"model.layers.16.input_layernorm.weight": "model-00009-of-00016.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00009-of-00016.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00008-of-00016.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00009-of-00016.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00009-of-00016.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00008-of-00016.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00008-of-00016.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00008-of-00016.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00008-of-00016.safetensors",
"model.layers.17.input_layernorm.weight": "model-00009-of-00016.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00009-of-00016.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00009-of-00016.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00009-of-00016.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00009-of-00016.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00009-of-00016.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00009-of-00016.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00009-of-00016.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00009-of-00016.safetensors",
"model.layers.18.input_layernorm.weight": "model-00010-of-00016.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00010-of-00016.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00009-of-00016.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00009-of-00016.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00010-of-00016.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00009-of-00016.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00009-of-00016.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00009-of-00016.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00009-of-00016.safetensors",
"model.layers.19.input_layernorm.weight": "model-00010-of-00016.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00010-of-00016.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00010-of-00016.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00010-of-00016.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00010-of-00016.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00010-of-00016.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00010-of-00016.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00010-of-00016.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00010-of-00016.safetensors",
"model.layers.2.input_layernorm.weight": "model-00002-of-00016.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00002-of-00016.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00002-of-00016.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00002-of-00016.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00002-of-00016.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00002-of-00016.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00002-of-00016.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00002-of-00016.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00002-of-00016.safetensors",
"model.layers.20.input_layernorm.weight": "model-00010-of-00016.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00010-of-00016.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00010-of-00016.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00010-of-00016.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00010-of-00016.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00010-of-00016.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00010-of-00016.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00010-of-00016.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00010-of-00016.safetensors",
"model.layers.21.input_layernorm.weight": "model-00011-of-00016.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00011-of-00016.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00011-of-00016.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00011-of-00016.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00011-of-00016.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00011-of-00016.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00011-of-00016.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00011-of-00016.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00011-of-00016.safetensors",
"model.layers.22.input_layernorm.weight": "model-00011-of-00016.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00011-of-00016.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00011-of-00016.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00011-of-00016.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00011-of-00016.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00011-of-00016.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00011-of-00016.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00011-of-00016.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00011-of-00016.safetensors",
"model.layers.23.input_layernorm.weight": "model-00012-of-00016.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00012-of-00016.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00012-of-00016.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00012-of-00016.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00012-of-00016.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00011-of-00016.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00012-of-00016.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00011-of-00016.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00012-of-00016.safetensors",
"model.layers.24.input_layernorm.weight": "model-00012-of-00016.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00012-of-00016.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00012-of-00016.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00012-of-00016.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00012-of-00016.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00012-of-00016.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00012-of-00016.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00012-of-00016.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00012-of-00016.safetensors",
"model.layers.25.input_layernorm.weight": "model-00013-of-00016.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00013-of-00016.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00013-of-00016.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00013-of-00016.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00013-of-00016.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00012-of-00016.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00012-of-00016.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00012-of-00016.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00012-of-00016.safetensors",
"model.layers.26.input_layernorm.weight": "model-00013-of-00016.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00013-of-00016.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00013-of-00016.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00013-of-00016.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00013-of-00016.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00013-of-00016.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00013-of-00016.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00013-of-00016.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00013-of-00016.safetensors",
"model.layers.27.input_layernorm.weight": "model-00014-of-00016.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00014-of-00016.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00013-of-00016.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00014-of-00016.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00014-of-00016.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00013-of-00016.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00013-of-00016.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00013-of-00016.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00013-of-00016.safetensors",
"model.layers.28.input_layernorm.weight": "model-00014-of-00016.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00014-of-00016.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00014-of-00016.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00014-of-00016.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00014-of-00016.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00014-of-00016.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00014-of-00016.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00014-of-00016.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00014-of-00016.safetensors",
"model.layers.29.input_layernorm.weight": "model-00015-of-00016.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00015-of-00016.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00014-of-00016.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00014-of-00016.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00015-of-00016.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00014-of-00016.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00014-of-00016.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00014-of-00016.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00014-of-00016.safetensors",
"model.layers.3.input_layernorm.weight": "model-00003-of-00016.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00003-of-00016.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00003-of-00016.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00003-of-00016.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00003-of-00016.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00002-of-00016.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00002-of-00016.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00002-of-00016.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00002-of-00016.safetensors",
"model.layers.30.input_layernorm.weight": "model-00015-of-00016.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00015-of-00016.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00015-of-00016.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00015-of-00016.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00015-of-00016.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00015-of-00016.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00015-of-00016.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00015-of-00016.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00015-of-00016.safetensors",
"model.layers.31.input_layernorm.weight": "model-00015-of-00016.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00015-of-00016.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00015-of-00016.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00015-of-00016.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00015-of-00016.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00015-of-00016.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00015-of-00016.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00015-of-00016.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00015-of-00016.safetensors",
"model.layers.4.input_layernorm.weight": "model-00003-of-00016.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00003-of-00016.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00003-of-00016.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00003-of-00016.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00003-of-00016.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00003-of-00016.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00003-of-00016.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00003-of-00016.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00003-of-00016.safetensors",
"model.layers.5.input_layernorm.weight": "model-00004-of-00016.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00004-of-00016.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00003-of-00016.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00004-of-00016.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00004-of-00016.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00003-of-00016.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00003-of-00016.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00003-of-00016.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00003-of-00016.safetensors",
"model.layers.6.input_layernorm.weight": "model-00004-of-00016.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00004-of-00016.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00004-of-00016.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00004-of-00016.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00004-of-00016.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00004-of-00016.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00004-of-00016.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00004-of-00016.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00004-of-00016.safetensors",
"model.layers.7.input_layernorm.weight": "model-00005-of-00016.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00005-of-00016.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00004-of-00016.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00004-of-00016.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00005-of-00016.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00004-of-00016.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00004-of-00016.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00004-of-00016.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00004-of-00016.safetensors",
"model.layers.8.input_layernorm.weight": "model-00005-of-00016.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00005-of-00016.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00005-of-00016.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00005-of-00016.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00005-of-00016.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00005-of-00016.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00005-of-00016.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00005-of-00016.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00005-of-00016.safetensors",
"model.layers.9.input_layernorm.weight": "model-00005-of-00016.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00005-of-00016.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00005-of-00016.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00005-of-00016.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00005-of-00016.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00005-of-00016.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00005-of-00016.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00005-of-00016.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00005-of-00016.safetensors",
"model.norm.weight": "model-00015-of-00016.safetensors"
}
}

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:47b2b0d5c9ee53a16043c2a4e27c4b9d545015f9f710f681381a5896e420d1af
size 877778068

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:db75a2caa9a5e32c045c1a23eadaa82858ea20f56e55069ca4ab1fea98b8c2b9
size 843095104

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:04958125cb2ef16e6970c0fe7c028759872154dc76cd815ea78ac964dd90cefc
size 899718208

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:995294dda453a9eb082e6a37a16d7bc2960f949b850dbc9f9c4dd3cf02895740
size 899718208

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:787ce03670b285f5eee5501826bad324fdbb072da02931e906ae5f7d04a830c8
size 899735268

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:127984e253e012bea6a577f1251312831cc025e3f3f8e316d6fa7e3b0936ca2d
size 876649874

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a0c2d2fa2eea936c1085df25d2975005a4377e2b8502141859a1fba89f0a4ca2
size 876649874

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:047b3e3beeb9f2b96bd80d1e31fa5cba0436793859a611488b2b5548f1efa923
size 899718208

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:18f77dfba72578bf01652c2038272be6fc433b2194bcd786dbd63d9381c99eb3
size 899718208

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8750d022aa8d6f359c215024d24fdea48fae7fbcf82e016e6ceb56702072c09d
size 899735268

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e635fcb39c0d31bf9ee4c945dd53405b5ba2b0c01c5e7c271b5e5b08ecdebf39
size 876649874

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:14f348259d44a4d11f9312c981d8c15d199cfb8f2bb671c8a636b5810c767833
size 876649874

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:03d10b7191c97672813edbaa03b69a167fc809deec7f868bb42b02127af1ed27
size 899718208

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:06988204bc074045ff1281cf4430554080e1c210a929c747d9c50be0e68bc1d6
size 899718208

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c9ff351a1077eb9b0d45383143ef13ab1ef922d5d6905d4fdef25e1e73f3c08a
size 899743734

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:be1989782cd2ff4eb8ce4ae6ac240e8115a33e2306c7f79d3f8085e6f75ee64b
size 372344197

@@ -0,0 +1,298 @@
{
"metadata": {
"total_size": 13697228800
},
"weight_map": {
"lm_head.weight": "pytorch_model-00016-of-00016.bin",
"model.embed_tokens.weight": "pytorch_model-00001-of-00016.bin",
"model.layers.0.input_layernorm.weight": "pytorch_model-00001-of-00016.bin",
"model.layers.0.mlp.down_proj.weight": "pytorch_model-00001-of-00016.bin",
"model.layers.0.mlp.gate_proj.weight": "pytorch_model-00001-of-00016.bin",
"model.layers.0.mlp.up_proj.weight": "pytorch_model-00001-of-00016.bin",
"model.layers.0.post_attention_layernorm.weight": "pytorch_model-00001-of-00016.bin",
"model.layers.0.self_attn.k_proj.weight": "pytorch_model-00001-of-00016.bin",
"model.layers.0.self_attn.o_proj.weight": "pytorch_model-00001-of-00016.bin",
"model.layers.0.self_attn.q_proj.weight": "pytorch_model-00001-of-00016.bin",
"model.layers.0.self_attn.v_proj.weight": "pytorch_model-00001-of-00016.bin",
"model.layers.1.input_layernorm.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.1.mlp.down_proj.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.1.mlp.gate_proj.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.1.mlp.up_proj.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.1.post_attention_layernorm.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.1.self_attn.k_proj.weight": "pytorch_model-00001-of-00016.bin",
"model.layers.1.self_attn.o_proj.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.1.self_attn.q_proj.weight": "pytorch_model-00001-of-00016.bin",
"model.layers.1.self_attn.v_proj.weight": "pytorch_model-00001-of-00016.bin",
"model.layers.10.input_layernorm.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.10.mlp.down_proj.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.10.mlp.gate_proj.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.10.mlp.up_proj.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.10.post_attention_layernorm.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.10.self_attn.k_proj.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.10.self_attn.o_proj.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.10.self_attn.q_proj.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.10.self_attn.v_proj.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.11.input_layernorm.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.11.mlp.down_proj.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.11.mlp.gate_proj.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.11.mlp.up_proj.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.11.post_attention_layernorm.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.11.self_attn.k_proj.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.11.self_attn.o_proj.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.11.self_attn.q_proj.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.11.self_attn.v_proj.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.12.input_layernorm.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.12.mlp.down_proj.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.12.mlp.gate_proj.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.12.mlp.up_proj.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.12.post_attention_layernorm.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.12.self_attn.k_proj.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.12.self_attn.o_proj.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.12.self_attn.q_proj.weight": "pytorch_model-00006-of-00016.bin",
"model.layers.12.self_attn.v_proj.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.13.input_layernorm.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.13.mlp.down_proj.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.13.mlp.gate_proj.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.13.mlp.up_proj.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.13.post_attention_layernorm.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.13.self_attn.k_proj.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.13.self_attn.o_proj.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.13.self_attn.q_proj.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.13.self_attn.v_proj.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.14.input_layernorm.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.14.mlp.down_proj.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.14.mlp.gate_proj.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.14.mlp.up_proj.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.14.post_attention_layernorm.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.14.self_attn.k_proj.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.14.self_attn.o_proj.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.14.self_attn.q_proj.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.14.self_attn.v_proj.weight": "pytorch_model-00007-of-00016.bin",
"model.layers.15.input_layernorm.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.15.mlp.down_proj.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.15.mlp.gate_proj.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.15.mlp.up_proj.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.15.post_attention_layernorm.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.15.self_attn.k_proj.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.15.self_attn.o_proj.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.15.self_attn.q_proj.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.15.self_attn.v_proj.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.16.input_layernorm.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.16.mlp.down_proj.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.16.mlp.gate_proj.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.16.mlp.up_proj.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.16.post_attention_layernorm.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.16.self_attn.k_proj.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.16.self_attn.o_proj.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.16.self_attn.q_proj.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.16.self_attn.v_proj.weight": "pytorch_model-00008-of-00016.bin",
"model.layers.17.input_layernorm.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.17.mlp.down_proj.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.17.mlp.gate_proj.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.17.mlp.up_proj.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.17.post_attention_layernorm.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.17.self_attn.k_proj.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.17.self_attn.o_proj.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.17.self_attn.q_proj.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.17.self_attn.v_proj.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.18.input_layernorm.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.18.mlp.down_proj.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.18.mlp.gate_proj.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.18.mlp.up_proj.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.18.post_attention_layernorm.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.18.self_attn.k_proj.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.18.self_attn.o_proj.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.18.self_attn.q_proj.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.18.self_attn.v_proj.weight": "pytorch_model-00009-of-00016.bin",
"model.layers.19.input_layernorm.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.19.mlp.down_proj.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.19.mlp.gate_proj.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.19.mlp.up_proj.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.19.post_attention_layernorm.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.19.self_attn.k_proj.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.19.self_attn.o_proj.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.19.self_attn.q_proj.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.19.self_attn.v_proj.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.2.input_layernorm.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.2.mlp.down_proj.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.2.mlp.gate_proj.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.2.mlp.up_proj.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.2.post_attention_layernorm.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.2.self_attn.k_proj.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.2.self_attn.o_proj.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.2.self_attn.q_proj.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.2.self_attn.v_proj.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.20.input_layernorm.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.20.mlp.down_proj.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.20.mlp.gate_proj.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.20.mlp.up_proj.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.20.post_attention_layernorm.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.20.self_attn.k_proj.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.20.self_attn.o_proj.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.20.self_attn.q_proj.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.20.self_attn.v_proj.weight": "pytorch_model-00010-of-00016.bin",
"model.layers.21.input_layernorm.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.21.mlp.down_proj.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.21.mlp.gate_proj.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.21.mlp.up_proj.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.21.post_attention_layernorm.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.21.self_attn.k_proj.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.21.self_attn.o_proj.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.21.self_attn.q_proj.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.21.self_attn.v_proj.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.22.input_layernorm.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.22.mlp.down_proj.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.22.mlp.gate_proj.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.22.mlp.up_proj.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.22.post_attention_layernorm.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.22.self_attn.k_proj.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.22.self_attn.o_proj.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.22.self_attn.q_proj.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.22.self_attn.v_proj.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.23.input_layernorm.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.23.mlp.down_proj.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.23.mlp.gate_proj.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.23.mlp.up_proj.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.23.post_attention_layernorm.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.23.self_attn.k_proj.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.23.self_attn.o_proj.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.23.self_attn.q_proj.weight": "pytorch_model-00011-of-00016.bin",
"model.layers.23.self_attn.v_proj.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.24.input_layernorm.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.24.mlp.down_proj.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.24.mlp.gate_proj.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.24.mlp.up_proj.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.24.post_attention_layernorm.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.24.self_attn.k_proj.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.24.self_attn.o_proj.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.24.self_attn.q_proj.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.24.self_attn.v_proj.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.25.input_layernorm.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.25.mlp.down_proj.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.25.mlp.gate_proj.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.25.mlp.up_proj.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.25.post_attention_layernorm.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.25.self_attn.k_proj.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.25.self_attn.o_proj.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.25.self_attn.q_proj.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.25.self_attn.v_proj.weight": "pytorch_model-00012-of-00016.bin",
"model.layers.26.input_layernorm.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.26.mlp.down_proj.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.26.mlp.gate_proj.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.26.mlp.up_proj.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.26.post_attention_layernorm.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.26.self_attn.k_proj.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.26.self_attn.o_proj.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.26.self_attn.q_proj.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.26.self_attn.v_proj.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.27.input_layernorm.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.27.mlp.down_proj.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.27.mlp.gate_proj.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.27.mlp.up_proj.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.27.post_attention_layernorm.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.27.self_attn.k_proj.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.27.self_attn.o_proj.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.27.self_attn.q_proj.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.27.self_attn.v_proj.weight": "pytorch_model-00013-of-00016.bin",
"model.layers.28.input_layernorm.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.28.mlp.down_proj.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.28.mlp.gate_proj.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.28.mlp.up_proj.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.28.post_attention_layernorm.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.28.self_attn.k_proj.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.28.self_attn.o_proj.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.28.self_attn.q_proj.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.28.self_attn.v_proj.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.29.input_layernorm.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.29.mlp.down_proj.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.29.mlp.gate_proj.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.29.mlp.up_proj.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.29.post_attention_layernorm.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.29.self_attn.k_proj.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.29.self_attn.o_proj.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.29.self_attn.q_proj.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.29.self_attn.v_proj.weight": "pytorch_model-00014-of-00016.bin",
"model.layers.3.input_layernorm.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.3.mlp.down_proj.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.3.mlp.gate_proj.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.3.mlp.up_proj.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.3.post_attention_layernorm.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.3.self_attn.k_proj.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.3.self_attn.o_proj.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.3.self_attn.q_proj.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.3.self_attn.v_proj.weight": "pytorch_model-00002-of-00016.bin",
"model.layers.30.input_layernorm.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.30.mlp.down_proj.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.30.mlp.gate_proj.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.30.mlp.up_proj.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.30.post_attention_layernorm.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.30.self_attn.k_proj.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.30.self_attn.o_proj.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.30.self_attn.q_proj.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.30.self_attn.v_proj.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.31.input_layernorm.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.31.mlp.down_proj.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.31.mlp.gate_proj.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.31.mlp.up_proj.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.31.post_attention_layernorm.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.31.self_attn.k_proj.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.31.self_attn.o_proj.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.31.self_attn.q_proj.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.31.self_attn.v_proj.weight": "pytorch_model-00015-of-00016.bin",
"model.layers.4.input_layernorm.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.4.mlp.down_proj.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.4.mlp.gate_proj.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.4.mlp.up_proj.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.4.post_attention_layernorm.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.4.self_attn.k_proj.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.4.self_attn.o_proj.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.4.self_attn.q_proj.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.4.self_attn.v_proj.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.5.input_layernorm.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.5.mlp.down_proj.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.5.mlp.gate_proj.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.5.mlp.up_proj.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.5.post_attention_layernorm.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.5.self_attn.k_proj.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.5.self_attn.o_proj.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.5.self_attn.q_proj.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.5.self_attn.v_proj.weight": "pytorch_model-00003-of-00016.bin",
"model.layers.6.input_layernorm.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.6.mlp.down_proj.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.6.mlp.gate_proj.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.6.mlp.up_proj.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.6.post_attention_layernorm.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.6.self_attn.k_proj.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.6.self_attn.o_proj.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.6.self_attn.q_proj.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.6.self_attn.v_proj.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.7.input_layernorm.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.7.mlp.down_proj.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.7.mlp.gate_proj.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.7.mlp.up_proj.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.7.post_attention_layernorm.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.7.self_attn.k_proj.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.7.self_attn.o_proj.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.7.self_attn.q_proj.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.7.self_attn.v_proj.weight": "pytorch_model-00004-of-00016.bin",
"model.layers.8.input_layernorm.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.8.mlp.down_proj.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.8.mlp.gate_proj.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.8.mlp.up_proj.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.8.post_attention_layernorm.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.8.self_attn.k_proj.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.8.self_attn.o_proj.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.8.self_attn.q_proj.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.8.self_attn.v_proj.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.9.input_layernorm.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.9.mlp.down_proj.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.9.mlp.gate_proj.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.9.mlp.up_proj.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.9.post_attention_layernorm.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.9.self_attn.k_proj.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.9.self_attn.o_proj.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.9.self_attn.q_proj.weight": "pytorch_model-00005-of-00016.bin",
"model.layers.9.self_attn.v_proj.weight": "pytorch_model-00005-of-00016.bin",
"model.norm.weight": "pytorch_model-00015-of-00016.bin"
}
}
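
Note on the index above: each parameter name maps to one shard file, and the tensors of a single decoder layer can be spread over two shards (layer 5, for example, is split between shards 00003 and 00004). The following is a minimal sketch of how such a mapping could be inspected. It assumes the mapping sits under a top-level `weight_map` key, which the two closing braces suggest but this excerpt does not show. The helper names `group_by_shard` and `shards_for_layer` and the inline sample are illustrative and not part of this commit.

```python
import json
from collections import defaultdict

# Tiny sample in the same shape as the index above; entries are copied from this diff.
# The "weight_map" key is an assumption about the enclosing structure.
SAMPLE_INDEX = json.loads("""
{
  "weight_map": {
    "model.layers.5.mlp.down_proj.weight": "pytorch_model-00004-of-00016.bin",
    "model.layers.5.mlp.gate_proj.weight": "pytorch_model-00003-of-00016.bin",
    "model.layers.5.self_attn.q_proj.weight": "pytorch_model-00003-of-00016.bin",
    "model.norm.weight": "pytorch_model-00015-of-00016.bin"
  }
}
""")


def group_by_shard(index):
    """Return {shard_file: sorted list of parameter names stored in it}."""
    shards = defaultdict(list)
    for name, shard in index["weight_map"].items():
        shards[shard].append(name)
    return {shard: sorted(names) for shard, names in shards.items()}


def shards_for_layer(index, layer):
    """Return the sorted shard files that hold tensors of one decoder layer."""
    prefix = f"model.layers.{layer}."
    return sorted(
        {shard for name, shard in index["weight_map"].items() if name.startswith(prefix)}
    )


# Which shard files would be needed to materialize layer 5 of the sample?
print(shards_for_layer(SAMPLE_INDEX, 5))
```

Grouping by shard rather than by layer keeps each file opened only once when tensors are read selectively.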

24
special_tokens_map.json Normal file

@@ -0,0 +1,24 @@
{
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "[PAD]",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": "<unk>"
}

132470
tokenizer.json Normal file

File diff suppressed because it is too large

3
tokenizer.model Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1363cd114e300eecf639f5e06624087f257bcbfed6820242aef9be6c8bb89de1
size 716289

48
tokenizer_config.json Normal file

@@ -0,0 +1,48 @@
{
"added_tokens_decoder": {
"0": {
"content": "<unk>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<s>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "</s>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false,
"special": true
},
"45449": {
"content": "[PAD]",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"bos_token": "<s>",
"clean_up_tokenization_spaces": false,
"eos_token": "</s>",
"legacy": false,
"model_max_length": 1000000000000000019884624838656,
"pad_token": "[PAD]",
"sp_model_kwargs": {},
"spaces_between_special_tokens": false,
"tokenizer_class": "LlamaTokenizer",
"unk_token": "<unk>",
"use_default_system_prompt": false,
"use_fast": true
}
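
Note on the config above: the named special tokens (`bos_token`, `eos_token`, `pad_token`, `unk_token`) are given as strings, while their integer ids appear only as keys of `added_tokens_decoder`. The `model_max_length` value equals `int(1e30)`, which is the placeholder commonly written when no explicit length limit is configured, so it should not be read as a real context size. The sketch below resolves both points using only the standard library. The helper names `special_token_ids` and `has_real_length_limit` are illustrative, and the embedded config is trimmed to the fields it uses.

```python
import json

# Fields copied from the tokenizer_config.json shown above, trimmed to what is used here.
CONFIG = json.loads("""
{
  "added_tokens_decoder": {
    "0": {"content": "<unk>", "special": true},
    "1": {"content": "<s>", "special": true},
    "2": {"content": "</s>", "special": true},
    "45449": {"content": "[PAD]", "special": true}
  },
  "bos_token": "<s>",
  "eos_token": "</s>",
  "pad_token": "[PAD]",
  "unk_token": "<unk>",
  "model_max_length": 1000000000000000019884624838656
}
""")

# Assumption: this value is a "no limit set" placeholder rather than a real limit.
NO_LIMIT_PLACEHOLDER = int(1e30)


def special_token_ids(config):
    """Map each named special token to its integer id via added_tokens_decoder."""
    by_content = {
        entry["content"]: int(token_id)
        for token_id, entry in config["added_tokens_decoder"].items()
    }
    names = ("bos_token", "eos_token", "pad_token", "unk_token")
    return {name: by_content[config[name]] for name in names}


def has_real_length_limit(config):
    """Return False when model_max_length is only the placeholder value."""
    return config["model_max_length"] != NO_LIMIT_PLACEHOLDER


print(special_token_ids(CONFIG))
print(has_real_length_limit(CONFIG))
```

Because `[PAD]` has id 45449 while the other special tokens use ids 0 to 2, code that sizes an embedding table from this tokenizer would need to cover at least 45450 entries.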