Initialize project; model provided by the ModelHub XC community
Model: prithivMLmods/Llama-3.2-3B-Promptist-Mini Source: Original Platform
36
.gitattributes
vendored
Normal file
@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
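Each rule in `.gitattributes` routes matching files through Git LFS. As a rough illustration of which files in this commit would be affected, the sketch below matches a handful of the patterns against file names. It is an approximation only: it uses Python's `fnmatch` on the base name, which covers the slash-free patterns but does not reproduce Git's full matching rules (for example `saved_model/**/*`). The helper name `is_lfs_tracked` and the shortened pattern list are illustrative assumptions.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A few of the slash-free patterns from the .gitattributes above (not the full list).
LFS_PATTERNS = ["*.bin", "*.safetensors", "*.tar.*", "*tfevents*", "tokenizer.json"]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: match slash-free patterns against the file's base name."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)


for candidate in ["pytorch_model-00001-of-00002.safetensors", "tokenizer.json", "config.json"]:
    print(candidate, is_lfs_tracked(candidate))
```

For an authoritative answer in a real checkout, `git check-attr filter -- <path>` reports the attribute Git actually applies.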
73
README.md
Normal file
@@ -0,0 +1,73 @@
---
license: creativeml-openrail-m
datasets:
- prithivMLmods/Prompt-Enhancement-Mini
language:
- en
base_model:
- meta-llama/Llama-3.2-3B-Instruct
pipeline_tag: text-generation
tags:
- Ollama
- trl
- safetensors
- pytorch
- long_prompts
- prompt_enhancement
- short_prompt
- Prompts
- Image-Generation-Long-Prompt
---
### **Llama-3.2-3B-Promptist-Mini Model Files**

**Llama-3.2-3B-Promptist-Mini** is a fine-tuned version of **Llama-3.2-3B-Instruct**, optimized for prompt engineering and prompt enhancement. It is intended for generating and expanding text prompts in creative and instructional applications. With 3B parameters, the model is comparatively small, which suits further fine-tuning and prompt-based use cases.

| File Name                              | Size       | Description                                    | Upload Status  |
|----------------------------------------|------------|------------------------------------------------|----------------|
| `.gitattributes`                       | 1.52 kB    | Git attributes configuration file              | Uploaded       |
| `README.md`                            | 287 Bytes  | Updated README file                            | Updated        |
| `config.json`                          | 940 Bytes  | Model configuration settings                   | Uploaded       |
| `generation_config.json`               | 162 Bytes  | Generation-specific configurations             | Uploaded       |
| `merges.txt`                           | 515 kB     | Merging information for tokenization           | Uploaded       |
| `pytorch_model.bin`                    | 3.42 GB    | Full model weights (PyTorch format)            | Uploaded (LFS) |
| `special_tokens_map.json`              | 572 Bytes  | Mapping for special tokens used by the model   | Uploaded       |
| `tokenizer.json`                       | 3.77 MB    | Tokenizer configuration and vocabulary         | Uploaded       |
| `tokenizer_config.json`                | 3.95 kB    | Tokenizer configuration for loading and usage  | Uploaded       |
| `vocab.json`                           | 801 kB     | Vocabulary for the tokenizer                   | Uploaded       |

### **Key Features:**

1. **Prompt Engineering and Enhancement:**
   The model is fine-tuned to generate and improve prompts for applications such as question generation, creative writing, and instruction-following tasks.

2. **Text Generation:**
   It generates coherent, contextually relevant text from a given prompt, so it can also be used for content creation and other automated text generation.

3. **Custom Tokenizer:**
   Includes tokenizer files with the special tokens used for prompt-based tasks (`<|begin_of_text|>`, `<|eot_id|>`, and the padding token `<|finetune_right_pad_id|>`).

---

### **Training Details:**
- **Base Model:** [Llama-3.2-3B-Instruct](#)
- **Dataset:** Trained on **Prompt-Enhancement-Mini**, a dataset specifically designed to enhance prompt generation, with examples tailored to creative and instructional contexts.

---

### **Capabilities:**
- **Prompt Generation and Enhancement:**
  The model can generate and enhance prompts for tasks including machine learning, creative writing, and instructional content.

- **Text Generation:**
  It generates coherent, structured, and contextually appropriate text from user input, which makes it usable across a range of text-based applications.

---

### **Usage Instructions:**
1. **Model Setup:** Download all model files and check that the model weights and the tokenizer files are included.
2. **Inference:** Load the model in a Python environment with PyTorch and Hugging Face Transformers.
3. **Customization:** The `config.json` file supplies the architecture settings, and `generation_config.json` supplies the default sampling parameters (`temperature` 0.6, `top_p` 0.9) used during inference.
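
The steps above can be sketched as follows. This is a minimal example under stated assumptions, not an official recipe: it assumes `torch` and `transformers` are installed, that the repository ID resolves on the Hugging Face Hub (a local directory containing the downloaded files works as well), and that the tokenizer ships a chat template. The helper names `build_messages` and `enhance_prompt` are illustrative.

```python
MODEL_ID = "prithivMLmods/Llama-3.2-3B-Promptist-Mini"  # or a local directory with the files


def build_messages(short_prompt: str) -> list:
    """Wrap a short prompt in the message structure that chat templates expect."""
    return [{"role": "user", "content": short_prompt}]


def enhance_prompt(short_prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and generate an enhanced prompt (downloads weights on first use)."""
    # Heavy imports stay local so build_messages works without them installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16)

    # Assumption: the tokenizer defines a chat template.
    inputs = tokenizer.apply_chat_template(
        build_messages(short_prompt),
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,
    )
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.6,  # defaults from generation_config.json
        top_p=0.9,
    )
    # Decode only the newly generated tokens, not the echoed prompt.
    prompt_length = inputs["input_ids"].shape[-1]
    return tokenizer.decode(output_ids[0, prompt_length:], skip_special_tokens=True)
```

Calling `enhance_prompt("a cat sitting on a windowsill")` returns the generated text; because sampling is enabled, the output differs between runs.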

---
42
config.json
Normal file
@@ -0,0 +1,42 @@
{
  "_name_or_path": "unsloth/llama-3.2-3b-instruct-bnb-4bit",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 128000,
  "eos_token_id": [
    128001,
    128008,
    128009
  ],
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 3072,
  "initializer_range": 0.02,
  "intermediate_size": 8192,
  "max_position_embeddings": 131072,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 24,
  "num_hidden_layers": 28,
  "num_key_value_heads": 8,
  "pad_token_id": 128004,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": {
    "factor": 32.0,
    "high_freq_factor": 4.0,
    "low_freq_factor": 1.0,
    "original_max_position_embeddings": 8192,
    "rope_type": "llama3"
  },
  "rope_theta": 500000.0,
  "tie_word_embeddings": true,
  "torch_dtype": "float16",
  "transformers_version": "4.46.2",
  "unsloth_version": "2024.12.1",
  "use_cache": true,
  "vocab_size": 128256
}
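The fields in `config.json` determine the model's size. As a sanity check, the sketch below counts the parameters implied by those values, assuming the standard Llama layout (bias-free q/k/v/o and gate/up/down projections, two RMSNorm weights per layer, a final norm, and an output head tied to the embeddings). The function name `count_parameters` is illustrative. At 2 bytes per `float16` parameter, the result equals the `total_size` of 6425499648 recorded in `pytorch_model.bin.index.json` in this commit.

```python
# Values copied from the config.json above.
config = {
    "hidden_size": 3072,
    "intermediate_size": 8192,
    "num_hidden_layers": 28,
    "num_attention_heads": 24,
    "num_key_value_heads": 8,
    "head_dim": 128,
    "vocab_size": 128256,
    "tie_word_embeddings": True,
}


def count_parameters(cfg: dict) -> int:
    """Count weights for a Llama-style decoder (assumed layout, no biases)."""
    hidden = cfg["hidden_size"]
    q_out = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_out = cfg["num_key_value_heads"] * cfg["head_dim"]

    attention = hidden * q_out + 2 * hidden * kv_out + q_out * hidden  # q, k, v, o
    mlp = 3 * hidden * cfg["intermediate_size"]                        # gate, up, down
    norms = 2 * hidden                                                 # two RMSNorms per layer
    per_layer = attention + mlp + norms

    embeddings = cfg["vocab_size"] * hidden
    lm_head = 0 if cfg["tie_word_embeddings"] else embeddings
    final_norm = hidden
    return embeddings + cfg["num_hidden_layers"] * per_layer + final_norm + lm_head


params = count_parameters(config)
print(params)      # 3212749824 parameters
print(params * 2)  # 6425499648 bytes in float16
```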
1
configuration.json
Normal file
@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}
14
generation_config.json
Normal file
@@ -0,0 +1,14 @@
{
  "bos_token_id": 128000,
  "do_sample": true,
  "eos_token_id": [
    128001,
    128008,
    128009
  ],
  "max_length": 131072,
  "pad_token_id": 128004,
  "temperature": 0.6,
  "top_p": 0.9,
  "transformers_version": "4.46.2"
}
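With `do_sample` enabled, `temperature` and `top_p` shape how the next token is drawn. The sketch below illustrates the general idea of temperature scaling followed by nucleus (top-p) filtering on a toy list of logits. It is a simplified standard-library illustration, not the Transformers implementation, and the function names `nucleus` and `sample_token` are assumptions made for this example.

```python
import math
import random


def nucleus(logits, temperature=0.6, top_p=0.9):
    """Return (kept_indices, probabilities) after temperature scaling and top-p filtering."""
    # Temperature-scaled softmax; subtracting the max keeps exp() numerically stable.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Keep the smallest set of most-probable tokens whose cumulative mass reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return kept, probs


def sample_token(logits, temperature=0.6, top_p=0.9, rng=random):
    """Draw one token index from the filtered distribution."""
    kept, probs = nucleus(logits, temperature, top_p)
    # random.choices renormalises the weights of the kept tokens.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


toy_logits = [2.0, 1.0, 0.0, -5.0]
kept_indices, _ = nucleus(toy_logits)
print(kept_indices)  # the two most likely tokens survive the filter
print(sample_token(toy_logits, rng=random.Random(0)))
```

Lower temperatures sharpen the distribution before filtering, so fewer tokens are needed to reach the `top_p` mass.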
3
pytorch_model-00001-of-00002.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:16d9373fd3860b13b8b8d3eb97061fd0b85892b8597a323937c76f724af5afa3
size 4965798912
3
pytorch_model-00002-of-00002.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:db68803b07b35792ccce6012d6249989c3d9586bf866d5a370be87444e879e45
size 1459729880
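The two shard entries are Git LFS pointer files: short text stubs that record the hash and byte size of the real object. A minimal sketch of reading them is shown below; the helper name `parse_lfs_pointer` is illustrative. The summed shard sizes come out 29144 bytes larger than the `total_size` of 6425499648 in the index file in this commit, which would be consistent with per-file metadata such as headers, although that explanation is an assumption.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the two shard entries above.
SHARD_1 = """version https://git-lfs.github.com/spec/v1
oid sha256:16d9373fd3860b13b8b8d3eb97061fd0b85892b8597a323937c76f724af5afa3
size 4965798912
"""
SHARD_2 = """version https://git-lfs.github.com/spec/v1
oid sha256:db68803b07b35792ccce6012d6249989c3d9586bf866d5a370be87444e879e45
size 1459729880
"""

pointers = [parse_lfs_pointer(SHARD_1), parse_lfs_pointer(SHARD_2)]
total_bytes = sum(p["size"] for p in pointers)
print(total_bytes)               # 6425528792
print(total_bytes - 6425499648)  # 29144 bytes beyond the index's total_size
```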
261
pytorch_model.bin.index.json
Normal file
@@ -0,0 +1,261 @@
|
||||
{
|
||||
"metadata": {
|
||||
"total_size": 6425499648
|
||||
},
|
||||
"weight_map": {
|
||||
"model.embed_tokens.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.0.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.0.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.0.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.0.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.0.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.0.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.0.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.0.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.0.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.1.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.1.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.1.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.1.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.1.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.1.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.1.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.1.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.1.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.10.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.10.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.10.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.10.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.10.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.10.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.10.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.10.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.10.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.11.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.11.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.11.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.11.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.11.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.11.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.11.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.11.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.11.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.12.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.12.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.12.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.12.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.12.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.12.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.12.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.12.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.12.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.13.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.13.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.13.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.13.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.13.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.13.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.13.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.13.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.13.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.14.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.14.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.14.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.14.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.14.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.14.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.14.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.14.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.14.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.15.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.15.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.15.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.15.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.15.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.15.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.15.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.15.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.15.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.16.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.16.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.16.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.16.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.16.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.16.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.16.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.16.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.16.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.17.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.17.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.17.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.17.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.17.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.17.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.17.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.17.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.17.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.18.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.18.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.18.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.18.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.18.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.18.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.18.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.18.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.18.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.19.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.19.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.19.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.19.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.19.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.19.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.19.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.19.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.19.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.2.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.2.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.2.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.2.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.2.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.2.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.2.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.2.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.2.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.20.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.20.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.20.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.20.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.20.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.20.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.20.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.20.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.20.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.21.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.21.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.21.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.21.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.21.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.21.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.21.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.21.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.21.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.22.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.22.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.22.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.22.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.22.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.22.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.22.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.22.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.22.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.23.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.23.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.23.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.23.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.23.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.23.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.23.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.23.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.23.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.24.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.24.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.24.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.24.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.24.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.24.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.24.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.24.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.24.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.25.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.25.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.25.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.25.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.25.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.25.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.25.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.25.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.25.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.26.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.26.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.26.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.26.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.26.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.26.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.26.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.26.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.26.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.27.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.27.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.27.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.27.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.27.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.27.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.27.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.27.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.27.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
|
||||
"model.layers.3.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.3.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.3.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.3.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.3.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.3.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.3.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.3.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.3.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.4.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.4.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.4.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.4.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.4.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.4.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.4.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.4.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.4.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.5.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.5.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.5.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.5.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.5.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.5.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.5.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.5.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.5.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.6.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.6.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.6.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.6.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.6.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.6.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.6.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.6.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.6.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.7.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.7.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.7.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.7.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.7.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.7.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.7.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.7.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.7.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.8.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.8.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.8.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.8.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.8.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.8.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.8.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.8.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.8.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.9.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.9.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.9.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.9.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.9.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.9.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.9.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.9.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.layers.9.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
|
||||
"model.norm.weight": "pytorch_model-00002-of-00002.bin"
|
||||
}
|
||||
}
|
||||
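The index file maps every tensor name to the shard that stores it, so a loader can open only the shard it needs. The sketch below shows one way to query such a map, using a three-entry excerpt in the same shape as the index above; a real script would read the full JSON file from disk. The helper name `shard_for` is an illustrative assumption.

```python
import json
from collections import Counter

# Three entries excerpted from the index above; the full file has many more.
INDEX_TEXT = """
{
  "metadata": {"total_size": 6425499648},
  "weight_map": {
    "model.embed_tokens.weight": "pytorch_model-00001-of-00002.bin",
    "model.layers.20.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
    "model.norm.weight": "pytorch_model-00002-of-00002.bin"
  }
}
"""

index = json.loads(INDEX_TEXT)


def shard_for(tensor_name: str) -> str:
    """Return the shard file name recorded for a tensor."""
    return index["weight_map"][tensor_name]


# Count how many tensors each shard holds (for this excerpt only).
tensors_per_shard = Counter(index["weight_map"].values())

print(shard_for("model.embed_tokens.weight"))
print(dict(tensors_per_shard))
```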
23
special_tokens_map.json
Normal file
@@ -0,0 +1,23 @@
{
  "bos_token": {
    "content": "<|begin_of_text|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|eot_id|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<|finetune_right_pad_id|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
BIN
tokenizer.json
(Stored with Git LFS)
Normal file
Binary file not shown.
2064
tokenizer_config.json
Normal file
File diff suppressed because it is too large