Initialize project; model provided by the ModelHub XC community

Model: bitext/Mistral-7B-Insurance
Source: Original Platform
ModelHub XC
2026-04-27 22:16:24 +08:00
commit ead0d445de
11 changed files with 558 additions and 0 deletions

35
.gitattributes vendored Normal file

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

110
README.md Normal file

@@ -0,0 +1,110 @@
---
license: apache-2.0
inference: false
tags:
- generated_from_trainer
- text-generation-inference
model-index:
- name: Mistral-7B-Insurance
  results: []
model_type: mistral
pipeline_tag: text-generation
widget:
- messages:
  - role: user
    content: I want help seeing my health insurance
---
# Mistral-7B-Insurance
## Model Description
This model, "Mistral-7B-Insurance", is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2), tailored specifically for the Insurance domain. It is optimized to answer questions and assist users with various Insurance-related procedures. It was trained on hybrid synthetic data generated with our NLP/NLG technology and our automated Data Labeling (DAL) tools.
The goal of this model is to show that a generic verticalized model makes customization for a final use case much easier. An overview of this approach can be found at: [From General-Purpose LLMs to Verticalized Enterprise Models](https://www.bitext.com/blog/general-purpose-models-verticalized-enterprise-genai/)
## Intended Use
- **Recommended applications**: This model is designed to be used as the first step in Bitext's two-step approach to LLM fine-tuning for the creation of chatbots, virtual assistants and copilots for the Insurance domain, providing customers with fast and accurate answers about their needs.
- **Out-of-scope**: This model is not suited for questions unrelated to insurance and should not be used for providing health, legal, or critical safety advice.
## Usage Example
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Run on a GPU when one is available, otherwise fall back to the CPU.
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = AutoModelForCausalLM.from_pretrained("bitext/Mistral-7B-Insurance")
tokenizer = AutoTokenizer.from_pretrained("bitext/Mistral-7B-Insurance")
# The chat template takes an optional system message followed by alternating user/assistant turns.
messages = [
    {"role": "system", "content": "You are an expert in customer support for Insurance."},
    {"role": "user", "content": "I want help seeing my health insurance"},
]
# Render the conversation with the model's chat template and tokenize it.
encoded = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = encoded.to(device)
model.to(device)
# Sample up to 1000 new tokens; the decoded text contains the prompt followed by the answer.
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```
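To make the prompt format concrete, the following is a minimal plain-Python sketch of what the `chat_template` shipped in this repository's `tokenizer_config.json` appears to produce. The helper name `render_prompt` is hypothetical and is not part of `transformers`; in practice `tokenizer.apply_chat_template` does this work, and its exact whitespace handling may differ from this approximation.

```python
def render_prompt(messages, bos_token="<s>", eos_token="</s>"):
    """Approximate the repository's chat template (illustrative helper, not a library API)."""
    # An optional leading system message is folded into the first user turn.
    if messages and messages[0]["role"] == "system":
        system_message = messages[0]["content"]
        loop_messages = messages[1:]
    else:
        system_message = None
        loop_messages = messages

    parts = [bos_token]
    for index, message in enumerate(loop_messages):
        # Turns must alternate user/assistant, starting with the user.
        if (message["role"] == "user") != (index % 2 == 0):
            raise ValueError("Conversation roles must alternate user/assistant.")
        if message["role"] == "user":
            if index == 0 and system_message is not None:
                content = system_message + "\n\n" + message["content"]
            else:
                content = message["content"]
            parts.append(" [INST] " + content + " [/INST]")
        elif message["role"] == "assistant":
            parts.append(" " + message["content"] + eos_token)
        else:
            raise ValueError("Only user and assistant roles are supported.")
    return "".join(parts)


prompt = render_prompt([
    {"role": "system", "content": "You are an expert in customer support for Insurance."},
    {"role": "user", "content": "I want help seeing my health insurance"},
])
print(prompt)
```

The sketch shows that the system message is not a separate turn: it is prepended to the first user message inside the same `[INST] ... [/INST]` span, separated by a blank line.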
## Model Architecture
This model utilizes the `MistralForCausalLM` architecture with a `LlamaTokenizer`, ensuring it retains the foundational capabilities of the base model while being specifically enhanced for insurance-related interactions.
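As a rough cross-check of the architecture, the sketch below counts parameters from the values in this repository's `config.json`. It assumes bias-free linear projections and a single weight vector per normalization layer, which is consistent with the tensor names listed in the checkpoint index; the function name `count_parameters` is illustrative only.

```python
# Values copied from config.json in this repository.
CONFIG = {
    "hidden_size": 4096,
    "intermediate_size": 14336,
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "head_dim": 128,
    "vocab_size": 32000,
    "tie_word_embeddings": False,
}


def count_parameters(cfg):
    """Count weights, assuming bias-free projections and one scale vector per norm."""
    hidden = cfg["hidden_size"]
    q_out = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_out = cfg["num_key_value_heads"] * cfg["head_dim"]

    # q_proj and o_proj are hidden x q_out; k_proj and v_proj are hidden x kv_out.
    attention = 2 * hidden * q_out + 2 * hidden * kv_out
    # gate_proj, up_proj and down_proj each connect hidden and intermediate sizes.
    mlp = 3 * hidden * cfg["intermediate_size"]
    # input_layernorm and post_attention_layernorm.
    norms = 2 * hidden
    per_layer = attention + mlp + norms

    embeddings = cfg["vocab_size"] * hidden
    lm_head = 0 if cfg["tie_word_embeddings"] else cfg["vocab_size"] * hidden
    final_norm = hidden

    return cfg["num_hidden_layers"] * per_layer + embeddings + lm_head + final_norm


total_params = count_parameters(CONFIG)
bf16_bytes = total_params * 2  # bfloat16 stores each weight in 2 bytes
print(f"{total_params:,} parameters, {bf16_bytes:,} bytes in bfloat16")
```

Under these assumptions the count is 7,241,732,096 parameters, or 14,483,464,192 bytes at two bytes per weight, which matches the `total_size` recorded in the checkpoint index included in this commit.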
## Training Data
The model was fine-tuned on the [Bitext Insurance Dataset](https://huggingface.co/datasets/bitext/Bitext-insurance-llm-chatbot-training-dataset), which comprises various insurance-related intents, including: buy_insurance_policy, schedule_appointment, check_payments, calculate_insurance_quote, negotiate_settlement, and more. The dataset covers 39 intents in total, each represented by approximately 1000 examples.
This comprehensive training helps the model address a broad spectrum of insurance-related questions effectively. The dataset follows the same structured approach as our dataset published on Hugging Face as [bitext/Bitext-customer-support-llm-chatbot-training-dataset](https://huggingface.co/datasets/bitext/Bitext-customer-support-llm-chatbot-training-dataset), but with a focus on insurance.
## Training Procedure
### Hyperparameters
- **Optimizer**: AdamW
- **Learning Rate**: 0.0002 with a cosine learning rate scheduler
- **Epochs**: 1
- **Batch Size**: 4
- **Gradient Accumulation Steps**: 4
- **Maximum Sequence Length**: 8192 tokens
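The hyperparameters above can be turned into a back-of-the-envelope schedule. The sketch below is illustrative only and rests on several assumptions that the model card does not state: that the batch size of 4 is per device, that training ran on a single device, that there was no warmup, that the learning rate decays to zero, and that the dataset holds roughly 39 × 1000 examples as described under Training Data.

```python
import math

PEAK_LR = 2e-4              # learning rate from the model card
BATCH_SIZE = 4              # batch size from the model card (assumed per device)
GRAD_ACCUM_STEPS = 4        # gradient accumulation steps from the model card
NUM_DEVICES = 1             # assumption: not stated in the model card
APPROX_EXAMPLES = 39 * 1000 # assumption: 39 intents x roughly 1000 examples
EPOCHS = 1

# Sequences contributing to each optimizer update.
effective_batch = BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES
# Optimizer updates needed to see every example once per epoch.
total_steps = math.ceil(APPROX_EXAMPLES * EPOCHS / effective_batch)


def cosine_lr(step, total=total_steps, peak=PEAK_LR):
    """Cosine decay from peak to zero, without warmup (an assumption)."""
    progress = min(step, total) / total
    return 0.5 * peak * (1.0 + math.cos(math.pi * progress))


print("effective batch:", effective_batch)
print("approximate optimizer steps:", total_steps)
for step in (0, total_steps // 2, total_steps):
    print(step, cosine_lr(step))
```

With these assumptions each optimizer update sees 16 sequences, and one epoch takes about 2,438 updates; the real figures depend on the actual device count and dataset size.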
### Environment
- **Transformers Version**: 4.43.4
- **Framework**: PyTorch 2.3.1+cu121
- **Tokenizers**: Tokenizers 0.19.1
## Limitations and Bias
- The model is trained for insurance-specific contexts but may underperform in unrelated areas.
- Potential biases in the training data could affect the neutrality of the responses; users are encouraged to evaluate responses critically.
## Ethical Considerations
It is important to use this technology thoughtfully, ensuring it does not substitute for human judgment where necessary, especially in sensitive situations.
## Acknowledgments
This model was developed and trained by Bitext using proprietary data and technology.
## License
This model, "Mistral-7B-Insurance", is licensed under the Apache License 2.0 by Bitext Innovations International, Inc. This open-source license allows for free use, modification, and distribution of the model but requires that proper credit be given to Bitext.
### Key Points of the Apache 2.0 License
- **Permissibility**: Users are allowed to use, modify, and distribute this software freely.
- **Attribution**: You must provide proper credit to Bitext Innovations International, Inc. when using this model, in accordance with the original copyright notices and the license.
- **Patent Grant**: The license includes a grant of patent rights from the contributors of the model.
- **No Warranty**: The model is provided "as is" without warranties of any kind.
You may view the full license text at [Apache License 2.0](http://www.apache.org/licenses/LICENSE-2.0).
This licensing ensures the model can be used widely and freely while respecting the intellectual contributions of Bitext. For more detailed information or specific legal questions about using this license, please refer to the official license documentation linked above.

27
config.json Normal file

@@ -0,0 +1,27 @@
{
"_name_or_path": "mistralai/Mistral-7B-Instruct-v0.2",
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.44.0",
"use_cache": false,
"vocab_size": 32000
}

7
generation_config.json Normal file

@@ -0,0 +1,7 @@
{
"_from_model_config": true,
"bos_token_id": 1,
"do_sample": true,
"eos_token_id": 2,
"transformers_version": "4.44.0"
}


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f03377de15c647f265e637a0233180a117907b0ce1403bcab4d7ad80ce3c407c
size 4943185632


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:78ad20e4e9226273db19069d119eff8f7defc7da48005798edbf64db6c57e563
size 4999844744


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4e0178a4f8a705c6b5f6c0a9b634f22bf5e7e902192ac44b87bb2807df3d8b9d
size 4540537414


@@ -0,0 +1,298 @@
{
"metadata": {
"total_size": 14483464192
},
"weight_map": {
"lm_head.weight": "pytorch_model-00003-of-00003.bin",
"model.embed_tokens.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.0.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.1.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.10.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.10.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.10.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.10.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.10.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.10.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.10.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.10.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.10.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.11.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.11.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.11.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.11.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.11.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.11.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.11.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.11.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.11.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.12.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.13.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.14.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.15.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.16.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.17.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.18.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.19.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.2.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.2.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.2.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.2.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.2.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.2.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.2.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.2.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.2.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.20.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.20.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.20.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.20.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.20.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.20.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.20.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.20.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.20.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.input_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.mlp.down_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.mlp.gate_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.mlp.up_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.post_attention_layernorm.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.21.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.22.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.22.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.22.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.22.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.22.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.22.self_attn.k_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.22.self_attn.o_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.22.self_attn.q_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.22.self_attn.v_proj.weight": "pytorch_model-00002-of-00003.bin",
"model.layers.23.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.23.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.23.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.23.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.23.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.23.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.23.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.23.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.23.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.24.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.25.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.26.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.27.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.28.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.29.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.3.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.3.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.3.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.3.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.3.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.3.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.3.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.3.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.3.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.30.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.30.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.30.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.30.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.30.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.30.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.30.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.30.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.30.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.input_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.mlp.down_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.mlp.gate_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.mlp.up_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.post_attention_layernorm.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.self_attn.k_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.self_attn.o_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.self_attn.q_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.31.self_attn.v_proj.weight": "pytorch_model-00003-of-00003.bin",
"model.layers.4.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.4.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.4.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.4.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.4.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.4.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.4.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.4.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.4.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.5.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.6.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.7.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.8.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.input_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.mlp.down_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.mlp.gate_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.mlp.up_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.post_attention_layernorm.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.self_attn.k_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.self_attn.o_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.self_attn.q_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.layers.9.self_attn.v_proj.weight": "pytorch_model-00001-of-00003.bin",
"model.norm.weight": "pytorch_model-00003-of-00003.bin"
}
}

24
special_tokens_map.json Normal file

@@ -0,0 +1,24 @@
{
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": "</s>",
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

BIN
tokenizer.model (Stored with Git LFS) Normal file

Binary file not shown.

45
tokenizer_config.json Normal file

@@ -0,0 +1,45 @@
{
"add_bos_token": true,
"add_eos_token": false,
"add_prefix_space": null,
"added_tokens_decoder": {
"0": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"additional_special_tokens": [],
"bos_token": "<s>",
"chat_template": "{%- if messages[0]['role'] == 'system' %}\n {%- set system_message = messages[0]['content'] %}\n {%- set loop_messages = messages[1:] %}\n{%- else %}\n {%- set loop_messages = messages %}\n{%- endif %}\n\n{{- bos_token }}\n{%- for message in loop_messages %}\n {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}\n {{- raise_exception('After the optional system message, conversation roles must alternate user/assistant/user/assistant/...') }}\n {%- endif %}\n {%- if message['role'] == 'user' %}\n {%- if loop.first and system_message is defined %}\n {{- ' [INST] ' + system_message + '\\n\\n' + message['content'] + ' [/INST]' }}\n {%- else %}\n {{- ' [INST] ' + message['content'] + ' [/INST]' }}\n {%- endif %}\n {%- elif message['role'] == 'assistant' %}\n {{- ' ' + message['content'] + eos_token}}\n {%- else %}\n {{- raise_exception('Only user and assistant roles are supported, with the exception of an initial optional system message!') }}\n {%- endif %}\n{%- endfor %}\n",
"clean_up_tokenization_spaces": false,
"eos_token": "</s>",
"legacy": false,
"model_max_length": 1000000000000000019884624838656,
"pad_token": "</s>",
"sp_model_kwargs": {},
"spaces_between_special_tokens": false,
"tokenizer_class": "LlamaTokenizer",
"unk_token": "<unk>",
"use_default_system_prompt": false,
"use_fast": true
}