Initialize project; model provided by the ModelHub XC community

Model: gaotang/RM-R1-Qwen2.5-Instruct-14B
Source: Original Platform
ModelHub XC
2026-05-11 21:46:55 +08:00
commit 28b431ad70
17 changed files with 152498 additions and 0 deletions

36
.gitattributes vendored Normal file

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
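
The patterns above decide which files in this commit are stored as Git LFS pointers rather than as regular blobs. As a rough sketch only: Python's `fnmatch` approximates gitattributes matching for simple basename patterns (it does not implement the full gitattributes rules, and path patterns such as `saved_model/**/*` are left out). The helper name `is_lfs_tracked` is hypothetical.

```python
from fnmatch import fnmatch

# A subset of the basename patterns listed in .gitattributes above.
LFS_PATTERNS = ["*.bin", "*.safetensors", "*.model", "*.pt", "tokenizer.json"]

def is_lfs_tracked(filename: str) -> bool:
    """Approximate check: does the basename match any LFS pattern?"""
    return any(fnmatch(filename, pattern) for pattern in LFS_PATTERNS)

for name in [
    "model-00001-of-00006.safetensors",
    "tokenizer.json",
    "config.json",
    "merges.txt",
]:
    print(f"{name}: {'LFS' if is_lfs_tracked(name) else 'regular'}")
```

Under this approximation the weight shards and `tokenizer.json` go through LFS, while `config.json` and `merges.txt` are committed as plain text.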

163
README.md Normal file

@@ -0,0 +1,163 @@
---
base_model:
- Qwen/Qwen2.5-14B-Instruct
language:
- en
license: mit
pipeline_tag: text-generation
library_name: transformers
---
![image/png](https://cdn-uploads.huggingface.co/production/uploads/654d784d71a30c4bca09a319/Q7MVJfIHDerQ24c1zwZwK.png)
<font size=3><div align='center' >
[[**🤗 Model & Dataset**](https://huggingface.co/collections/gaotang/rm-r1-681128cdab932701cad844c8)]
[[**📊 Code**](https://github.com/RM-R1-UIUC/RM-R1)]
[[**📖 Paper**](https://arxiv.org/abs/2505.02387)]
</div></font>
# 🚀 Can we cast reward modeling as a reasoning task?
**RM-R1** is a training framework for *Reasoning Reward Models* (ReasRMs), which judge two candidate answers by first **thinking out loud** (generating structured rubrics or reasoning traces) and then emitting a preference. Compared to traditional scalar or generative reward models, RM-R1 delivers **state-of-the-art average performance** on public RM benchmarks while offering fully interpretable justifications.
## 🧠 TL;DR
* **Two-stage training**
  1. **Distillation** of ~8.7 K high-quality reasoning traces (Chain-of-Rubrics).
  2. **Reinforcement Learning with Verifiable Rewards** (RLVR) on ~64 K preference pairs.
* **Backbones** released: 7 B / 14 B / 32 B Qwen-2.5-Instruct variants + DeepSeek-distilled checkpoints.
## 💡 Intended uses
* **RLHF / RLAIF**: plug-and-play reward function for policy optimisation.
* **Automated evaluation**: LLM-as-a-judge for open-domain QA, chat, and reasoning.
* **Research**: study process supervision, chain-of-thought verification, or rubric generation.
## 🔍 Demo Code
Try the model with this example. Full demo notebook available at:
📎 [Official Demo Link](https://github.com/RM-R1-UIUC/RM-R1/blob/main/demo/demo.ipynb)
### 🧾 Prompt Template
```python
INSTRUCT_SYSTEM_PROMPT = (
"Please act as an impartial judge and evaluate the quality of the responses provided by two AI Chatbots to the Client's question displayed below.\n\n"
"First, classify the task into one of two categories: <type>Reasoning</type> or <type>Chat</type>.\n"
"- Use <type>Reasoning</type> for tasks that involve math, coding, or require domain knowledge, multi-step inference, logical deduction, or combining information to reach a conclusion.\n"
"- Use <type>Chat</type> for tasks that involve open-ended or factual conversation, stylistic rewrites, safety questions, or general helpfulness requests without deep reasoning.\n\n"
"If the task is Reasoning:\n"
"1. Solve the Client's question yourself and present your final answer within <solution>...</solution> tags.\n"
"2. Evaluate the two Chatbot responses based on correctness, completeness, and reasoning quality, referencing your own solution.\n"
"3. Include your evaluation inside <eval>...</eval> tags, quoting or summarizing the Chatbots using the following tags:\n"
" - <quote_A>...</quote_A> for direct quotes from Chatbot A\n"
" - <summary_A>...</summary_A> for paraphrases of Chatbot A\n"
" - <quote_B>...</quote_B> for direct quotes from Chatbot B\n"
" - <summary_B>...</summary_B> for paraphrases of Chatbot B\n"
"4. End with your final judgment in the format: <answer>[[A]]</answer> or <answer>[[B]]</answer>\n\n"
"If the task is Chat:\n"
"1. Generate evaluation criteria (rubric) tailored to the Client's question and context, enclosed in <rubric>...</rubric> tags.\n"
"2. Assign weights to each rubric item based on their relative importance.\n"
"3. Inside <rubric>, include a <justify>...</justify> section explaining why you chose those rubric criteria and weights.\n"
"4. Compare both Chatbot responses according to the rubric.\n"
"5. Provide your evaluation inside <eval>...</eval> tags, using <quote_A>, <summary_A>, <quote_B>, and <summary_B> as described above.\n"
"6. End with your final judgment in the format: <answer>[[A]]</answer> or <answer>[[B]]</answer>\n\n"
"Important Notes:\n"
"- Be objective and base your evaluation only on the content of the responses.\n"
"- Do not let response order, length, or Chatbot names affect your judgment.\n"
"- Follow the response format strictly depending on the task type.\n\n"
"Your output must follow one of the two formats below:\n\n"
"For Reasoning:\n"
"<type>Reasoning</type>\n\n"
"<solution> your own solution for the problem </solution>\n\n"
"<eval>\n"
" include direct comparisons supported by <quote_A>...</quote_A> or <summary_A>...</summary_A>, and <quote_B>...</quote_B>, or <summary_B>...</summary_B>\n"
"</eval>\n\n"
"<answer>[[A/B]]</answer>\n\n"
"For Chat:\n"
"<type>Chat</type>\n\n"
"<rubric>\n"
" detailed rubric items\n"
" <justify> justification for the rubric </justify>\n"
"</rubric>\n\n"
"<eval>\n"
" include direct comparisons supported by <quote_A>...</quote_A> or <summary_A>...</summary_A>, and <quote_B>...</quote_B>, or <summary_B>...</summary_B> tags\n"
"</eval>\n\n"
"<answer>[[A/B]]</answer>"
)
INSTRUCT_SINGLE_USER_PROMPT_TEMPLATE = (
"[Client Question]\n{question}\n\n[The Start of Chatbot A's Response]\n{answer_a}\n[The End of Chatbot A's Response]\n\n"
"[The Start of Chatbot B's Response]\n{answer_b}\n[The End of Chatbot B's Response]"
)
```
### 📦 Inference Code
```python
# Load the Model
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gaotang/RM-R1-Qwen2.5-Instruct-14B"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"  # or specify the specific device map if needed
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Single Turn Example - from Reward Bench
prompt = "How do I detail a car?"
answer_a = "Detailing a car involves a thorough cleaning inside and out, as well as polishing and waxing to protect the vehicle's surfaces. Here's a step-by-step guide to detailing a car:\n\n**Exterior Detailing:**\n\n1. **Wash the Car:**\n - Rinse the car with water to remove loose dirt.\n - Use a car wash soap and microfiber wash mitt to clean the car from top to bottom.\n - Clean the wheels and tires with a brush and a wheel cleaner.\n - Rinse the car thoroughly to remove all soap.\n\n2. **Dry the Car:**\n - Use a microfiber towel or a chamois to dry the car to prevent water spots.\n\n3. **Clay Bar Treatment:**\n - Use a clay bar with a lubricant to remove embedded surface contaminants from the paint.\n\n4. **Polishing:**\n - Apply car polish with a dual-action polisher or by hand to correct paint imperfections and create a smooth surface.\n\n5. **Waxing:**\n - Apply a coat of wax or paint sealant to protect the paint and give it a glossy finish.\n\n6. **Windows and Mirrors:**\n - Clean the windows and mirrors with a glass cleaner and a microfiber towel.\n\n7. **Tire and Trim Dressing:**\n - Apply a tire dressing to the tires for a shiny finish.\n - Use a trim restorer or protectant on plastic and rubber parts to prevent fading.\n\n**Interior Detailing:**\n\n1. **Remove Trash:**\n - Clear out any trash and remove personal items from the car.\n\n2. **Vacuum:**\n - Vacuum the seats, carpets, floor mats, and trunk.\n - Use a brush attachment for the dashboard and door panels.\n\n3. **Shampoo Carpets and Upholstery:**\n - Use a carpet cleaner and a brush to clean the carpets and upholstery.\n - For leather interiors, use a leather cleaner and conditioner.\n\n4. **Clean Hard Surfaces:**\n - Wipe down all hard surfaces (dashboard, center console, door panels, etc.) with a mild all-purpose cleaner and a microfiber cloth.\n\n5. **Windows and Mirrors:**\n - Clean the interior side of windows and mirrors.\n\n6. **Air Vents and Crevices:**\n - Use a detailing brush or compressed air to clean out air vents and hard-to-reach crevices.\n\n7. **Final Touches:**\n - Apply a protectant to the dashboard and other plastic components.\n - Replace air fresheners if needed.\n\n**Additional Tips:**\n\n- Work in the shade or a cool, well-ventilated garage to prevent products from drying too quickly and leaving residue.\n- Use separate buckets for washing and rinsing to avoid contaminating the clean water with dirt.\n- Always use gentle, non-abrasive materials and cleaners specifically designed for automotive use to avoid damaging surfaces.\n- Move in a systematic way to ensure you don't miss any spots.\n\nBy following these steps, you'll give your car a thorough clean that not only makes it look great but also helps in maintaining its value. Remember, regular detailing can prevent wear and tear and keep your car looking new for years to come." # Accepted
answer_b = "Detailing a car involves washing the exterior and interior of the car, as well as polishing and waxing the exterior. Interior detailing typically involves vacuuming, cleaning the upholstery and air vents, polishing the dashboard and console, and dusting. Polishing and waxing the exterior will depend on the condition of the paint, but typically involves applying a polish and wax to make it shine." # Rejected
user_prompt_single = INSTRUCT_SINGLE_USER_PROMPT_TEMPLATE.format(
    question=prompt,
    answer_a=answer_a,
    answer_b=answer_b
)
conversation = [
    {"role": "system", "content": INSTRUCT_SYSTEM_PROMPT},
    {"role": "user", "content": user_prompt_single}
]
input_ids = tokenizer.apply_chat_template(
    conversation,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)
generation = model.generate(
    input_ids=input_ids,
    max_new_tokens=8192,  # For optimal performance benchmarking, please set this to unlimited (e.g., 50000)
    do_sample=False,
)
completion = tokenizer.decode(
    generation[0][len(input_ids[0]):],
    skip_special_tokens=True,
    clean_up_tokenization_spaces=True
)
print(completion)
```
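When the model serves as a reward function or automated judge, the verdict usually has to be pulled out of the generated text. The following is a minimal sketch, assuming the completion ends with the `<answer>[[A]]</answer>` or `<answer>[[B]]</answer>` tag requested by the system prompt. The helper `parse_verdict` and the sample string are hypothetical and not part of the RM-R1 codebase.

```python
import re
from typing import Optional

# Matches the final-judgment tag requested by the system prompt.
_ANSWER_RE = re.compile(r"<answer>\s*\[\[([AB])\]\]\s*</answer>")

def parse_verdict(completion: str) -> Optional[str]:
    """Return "A" or "B" from the last <answer> tag, or None if none is found."""
    matches = _ANSWER_RE.findall(completion)
    return matches[-1] if matches else None

# Hypothetical completion, shortened for illustration.
sample = (
    "<type>Chat</type>\n\n"
    "<rubric>...</rubric>\n\n"
    "<eval>...</eval>\n\n"
    "<answer>[[A]]</answer>"
)
print(parse_verdict(sample))
```

A `None` result (for example, when generation stops at `max_new_tokens` before the tag is emitted) is probably best treated as a failed judgment rather than as a preference for either response.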
## Citations
```bibtex
@article{chen2025rm,
title={RM-R1: Reward Modeling as Reasoning},
author={Chen, Xiusi and Li, Gaotang and Wang, Ziqi and Jin, Bowen and Qian, Cheng and Wang, Yu and Wang, Hongru and Zhang, Yu and Zhang, Denghui and Zhang, Tong and others},
journal={arXiv preprint arXiv:2505.02387},
year={2025}
}
```

24
added_tokens.json Normal file

@@ -0,0 +1,24 @@
{
"</tool_call>": 151658,
"<tool_call>": 151657,
"<|box_end|>": 151649,
"<|box_start|>": 151648,
"<|endoftext|>": 151643,
"<|file_sep|>": 151664,
"<|fim_middle|>": 151660,
"<|fim_pad|>": 151662,
"<|fim_prefix|>": 151659,
"<|fim_suffix|>": 151661,
"<|im_end|>": 151645,
"<|im_start|>": 151644,
"<|image_pad|>": 151655,
"<|object_ref_end|>": 151647,
"<|object_ref_start|>": 151646,
"<|quad_end|>": 151651,
"<|quad_start|>": 151650,
"<|repo_name|>": 151663,
"<|video_pad|>": 151656,
"<|vision_end|>": 151653,
"<|vision_pad|>": 151654,
"<|vision_start|>": 151652
}

29
config.json Normal file

@@ -0,0 +1,29 @@
{
"_name_or_path": "/mnt/home/ziqi/Rubric-RM/checkpoints/rubric_rm/rubric_rm_qwen2.5_14B_LR7.0e-7_filtered_sky_code_8k_math_10k_rubric_evidence_classify_weight_4k8k_distilled_Claude_o3_0419_ShuffleDataset/global_step_62/actor/huggingface",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 5120,
"initializer_range": 0.02,
"intermediate_size": 13824,
"max_position_embeddings": 32768,
"max_window_layers": 70,
"model_type": "qwen2",
"num_attention_heads": 40,
"num_hidden_layers": 48,
"num_key_value_heads": 8,
"pad_token_id": 151643,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.49.0",
"use_cache": false,
"use_sliding_window": false,
"vocab_size": 152064
}
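
As a sanity check on these values, the parameter count can be derived from the configuration. This is a sketch under stated assumptions: a standard Qwen2 layer layout with biases on `q_proj`, `k_proj`, and `v_proj` only (as the weight map later in this commit suggests), two RMSNorm weights per layer, and one final norm. The function name `count_parameters` is hypothetical.

```python
# Values copied from config.json above.
cfg = {
    "hidden_size": 5120,
    "intermediate_size": 13824,
    "num_hidden_layers": 48,
    "num_attention_heads": 40,
    "num_key_value_heads": 8,
    "vocab_size": 152064,
    "tie_word_embeddings": False,
}

def count_parameters(cfg: dict) -> int:
    h = cfg["hidden_size"]
    head_dim = h // cfg["num_attention_heads"]        # 128
    kv_dim = head_dim * cfg["num_key_value_heads"]    # 1024 (grouped-query attention)
    # Assumed layout: q/k/v projections carry biases, o_proj does not.
    attn = (h * h + h) + 2 * (h * kv_dim + kv_dim) + h * h
    mlp = 3 * h * cfg["intermediate_size"]            # gate, up, down projections
    norms = 2 * h                                     # input + post-attention RMSNorm
    embed = cfg["vocab_size"] * h
    lm_head = 0 if cfg["tie_word_embeddings"] else embed
    final_norm = h
    return cfg["num_hidden_layers"] * (attn + mlp + norms) + embed + lm_head + final_norm

n_params = count_parameters(cfg)
print(n_params)      # 14770033664
print(n_params * 2)  # 29540067328 bytes at 2 bytes per bfloat16 weight
```

The byte count agrees with the `total_size` of 29540067328 recorded in the safetensors index metadata in this commit.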

7
generation_config.json Normal file

@@ -0,0 +1,7 @@
{
"_from_model_config": true,
"eos_token_id": 151645,
"pad_token_id": 151643,
"transformers_version": "4.49.0",
"use_cache": false
}

151388
merges.txt Normal file

File diff suppressed because it is too large


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5b620c2b69df16ce7bba9b150eedfa9e079160e0f8bd791cd9044271e4666dd9
size 4954772280


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:19638af6c3bea2cda4330609f89390f3326fe508700d775930d2e99c6ce0d862
size 4891892712


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:26bff8ce37878b29a4594faf7bea23899d73cece452044c1139d41ca4ac927eb
size 4996761624


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1c6f4bf9597a29ea55ff2cc33f199dbc8ff022e3f4d16a8881478ce4c9fc0b9d
size 4949620552


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6ca156b2e14ff7c951fd689295e66c019d6aacc4de5e0287c94f840f91849c4a
size 4928474192


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:84f4436665943be8c924e0aeeb6fc3609f07f2afe165d0d3856e84f336750f78
size 4818612576
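
Each three-line entry above is a Git LFS pointer: a spec version, a SHA-256 object ID, and the byte size of the real file. The sketch below parses one pointer and totals the six sizes. It assumes the six pointers correspond to the six `model-0000N-of-00006.safetensors` shards, whose file names are missing from this view. The helper `parse_lfs_pointer` is hypothetical.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

# The last pointer shown above, copied verbatim.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:84f4436665943be8c924e0aeeb6fc3609f07f2afe165d0d3856e84f336750f78
size 4818612576
"""
info = parse_lfs_pointer(pointer)
print(info["algo"], info["size"])

# Sizes of the six pointers in this commit, in the order shown.
shard_sizes = [
    4954772280, 4891892712, 4996761624,
    4949620552, 4928474192, 4818612576,
]
total = sum(shard_sizes)
print(total)  # 29540133936
```

The total is 66,608 bytes larger than the `total_size` of 29540067328 in the index metadata, presumably because that figure counts tensor data only while each shard file also carries a safetensors header.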


@@ -0,0 +1,586 @@
{
"metadata": {
"total_size": 29540067328
},
"weight_map": {
"lm_head.weight": "model-00005-of-00006.safetensors",
"model.embed_tokens.weight": "model-00001-of-00006.safetensors",
"model.layers.0.input_layernorm.weight": "model-00003-of-00006.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.0.self_attn.k_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.0.self_attn.q_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.0.self_attn.v_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.1.input_layernorm.weight": "model-00001-of-00006.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
"model.layers.1.self_attn.k_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.1.self_attn.q_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.1.self_attn.v_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.10.input_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
"model.layers.10.self_attn.k_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.10.self_attn.q_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.10.self_attn.v_proj.bias": "model-00001-of-00006.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.11.input_layernorm.weight": "model-00002-of-00006.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
"model.layers.11.self_attn.k_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.11.self_attn.q_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.11.self_attn.v_proj.bias": "model-00005-of-00006.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.12.input_layernorm.weight": "model-00005-of-00006.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
"model.layers.12.self_attn.k_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.12.self_attn.q_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.12.self_attn.v_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.13.input_layernorm.weight": "model-00001-of-00006.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
"model.layers.13.self_attn.k_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.13.self_attn.q_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.13.self_attn.v_proj.bias": "model-00001-of-00006.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.14.input_layernorm.weight": "model-00004-of-00006.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
"model.layers.14.self_attn.k_proj.bias": "model-00001-of-00006.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.14.self_attn.q_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.14.self_attn.v_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.15.input_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
"model.layers.15.self_attn.k_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.15.self_attn.q_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.15.self_attn.v_proj.bias": "model-00001-of-00006.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.16.input_layernorm.weight": "model-00002-of-00006.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
"model.layers.16.self_attn.k_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.16.self_attn.q_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.16.self_attn.v_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.17.input_layernorm.weight": "model-00002-of-00006.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
"model.layers.17.self_attn.k_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.17.self_attn.q_proj.bias": "model-00005-of-00006.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.17.self_attn.v_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.18.input_layernorm.weight": "model-00003-of-00006.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
"model.layers.18.self_attn.k_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.18.self_attn.q_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.18.self_attn.v_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.19.input_layernorm.weight": "model-00002-of-00006.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
"model.layers.19.self_attn.k_proj.bias": "model-00005-of-00006.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.19.self_attn.q_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.19.self_attn.v_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.2.input_layernorm.weight": "model-00004-of-00006.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
"model.layers.2.self_attn.k_proj.bias": "model-00001-of-00006.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.2.self_attn.q_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.2.self_attn.v_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.20.input_layernorm.weight": "model-00005-of-00006.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
"model.layers.20.self_attn.k_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.20.self_attn.q_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.20.self_attn.v_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.21.input_layernorm.weight": "model-00004-of-00006.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
"model.layers.21.self_attn.k_proj.bias": "model-00005-of-00006.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.21.self_attn.q_proj.bias": "model-00005-of-00006.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.21.self_attn.v_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.22.input_layernorm.weight": "model-00002-of-00006.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
"model.layers.22.self_attn.k_proj.bias": "model-00005-of-00006.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.22.self_attn.q_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.22.self_attn.v_proj.bias": "model-00005-of-00006.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.23.input_layernorm.weight": "model-00003-of-00006.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.23.self_attn.k_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.23.self_attn.q_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.23.self_attn.v_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.24.input_layernorm.weight": "model-00005-of-00006.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.24.self_attn.k_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.24.self_attn.q_proj.bias": "model-00005-of-00006.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.24.self_attn.v_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.25.input_layernorm.weight": "model-00003-of-00006.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.25.self_attn.k_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.25.self_attn.q_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.25.self_attn.v_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.26.input_layernorm.weight": "model-00004-of-00006.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
"model.layers.26.self_attn.k_proj.bias": "model-00001-of-00006.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.26.self_attn.q_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.26.self_attn.v_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.27.input_layernorm.weight": "model-00002-of-00006.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.27.self_attn.k_proj.bias": "model-00005-of-00006.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.27.self_attn.q_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.27.self_attn.v_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.28.input_layernorm.weight": "model-00003-of-00006.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
"model.layers.28.self_attn.k_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.28.self_attn.q_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.28.self_attn.v_proj.bias": "model-00001-of-00006.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.29.input_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
"model.layers.29.self_attn.k_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.29.self_attn.q_proj.bias": "model-00005-of-00006.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.29.self_attn.v_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.3.input_layernorm.weight": "model-00004-of-00006.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
"model.layers.3.self_attn.k_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.3.self_attn.q_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.3.self_attn.v_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.30.input_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
"model.layers.30.self_attn.k_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.30.self_attn.q_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.30.self_attn.v_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.31.input_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
"model.layers.31.self_attn.k_proj.bias": "model-00001-of-00006.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.31.self_attn.q_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.31.self_attn.v_proj.bias": "model-00005-of-00006.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.32.input_layernorm.weight": "model-00004-of-00006.safetensors",
"model.layers.32.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.32.mlp.gate_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.32.mlp.up_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.32.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
"model.layers.32.self_attn.k_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.32.self_attn.k_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.32.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.32.self_attn.q_proj.bias": "model-00001-of-00006.safetensors",
"model.layers.32.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.32.self_attn.v_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.32.self_attn.v_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.33.input_layernorm.weight": "model-00001-of-00006.safetensors",
"model.layers.33.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.33.mlp.gate_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.33.mlp.up_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.33.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
"model.layers.33.self_attn.k_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.33.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.33.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.33.self_attn.q_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.33.self_attn.q_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.33.self_attn.v_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.33.self_attn.v_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.34.input_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.34.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.34.mlp.gate_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.34.mlp.up_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.34.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
"model.layers.34.self_attn.k_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.34.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.34.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.34.self_attn.q_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.34.self_attn.q_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.34.self_attn.v_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.34.self_attn.v_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.35.input_layernorm.weight": "model-00001-of-00006.safetensors",
"model.layers.35.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.35.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.35.mlp.up_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.35.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.35.self_attn.k_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.35.self_attn.k_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.35.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.35.self_attn.q_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.35.self_attn.q_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.35.self_attn.v_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.35.self_attn.v_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.36.input_layernorm.weight": "model-00001-of-00006.safetensors",
"model.layers.36.mlp.down_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.36.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.36.mlp.up_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.36.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
"model.layers.36.self_attn.k_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.36.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.36.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.36.self_attn.q_proj.bias": "model-00001-of-00006.safetensors",
"model.layers.36.self_attn.q_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.36.self_attn.v_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.36.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.37.input_layernorm.weight": "model-00002-of-00006.safetensors",
"model.layers.37.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.37.mlp.gate_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.37.mlp.up_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.37.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.37.self_attn.k_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.37.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.37.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.37.self_attn.q_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.37.self_attn.q_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.37.self_attn.v_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.37.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.38.input_layernorm.weight": "model-00002-of-00006.safetensors",
"model.layers.38.mlp.down_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.38.mlp.gate_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.38.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.38.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
"model.layers.38.self_attn.k_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.38.self_attn.k_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.38.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.38.self_attn.q_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.38.self_attn.q_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.38.self_attn.v_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.38.self_attn.v_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.39.input_layernorm.weight": "model-00004-of-00006.safetensors",
"model.layers.39.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.39.mlp.gate_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.39.mlp.up_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.39.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
"model.layers.39.self_attn.k_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.39.self_attn.k_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.39.self_attn.o_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.39.self_attn.q_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.39.self_attn.q_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.39.self_attn.v_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.39.self_attn.v_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.4.input_layernorm.weight": "model-00004-of-00006.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
"model.layers.4.self_attn.k_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.4.self_attn.q_proj.bias": "model-00001-of-00006.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.4.self_attn.v_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.40.input_layernorm.weight": "model-00001-of-00006.safetensors",
"model.layers.40.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.40.mlp.gate_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.40.mlp.up_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.40.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.40.self_attn.k_proj.bias": "model-00001-of-00006.safetensors",
"model.layers.40.self_attn.k_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.40.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.40.self_attn.q_proj.bias": "model-00005-of-00006.safetensors",
"model.layers.40.self_attn.q_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.40.self_attn.v_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.40.self_attn.v_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.41.input_layernorm.weight": "model-00003-of-00006.safetensors",
"model.layers.41.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.41.mlp.gate_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.41.mlp.up_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.41.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
"model.layers.41.self_attn.k_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.41.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.41.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.41.self_attn.q_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.41.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.41.self_attn.v_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.41.self_attn.v_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.42.input_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.42.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.42.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.42.mlp.up_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.42.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.42.self_attn.k_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.42.self_attn.k_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.42.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.42.self_attn.q_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.42.self_attn.q_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.42.self_attn.v_proj.bias": "model-00001-of-00006.safetensors",
"model.layers.42.self_attn.v_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.43.input_layernorm.weight": "model-00003-of-00006.safetensors",
"model.layers.43.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.43.mlp.gate_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.43.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.43.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
"model.layers.43.self_attn.k_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.43.self_attn.k_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.43.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.43.self_attn.q_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.43.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.43.self_attn.v_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.43.self_attn.v_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.44.input_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.44.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.44.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.44.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.44.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.44.self_attn.k_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.44.self_attn.k_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.44.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.44.self_attn.q_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.44.self_attn.q_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.44.self_attn.v_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.44.self_attn.v_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.45.input_layernorm.weight": "model-00002-of-00006.safetensors",
"model.layers.45.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.45.mlp.gate_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.45.mlp.up_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.45.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
"model.layers.45.self_attn.k_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.45.self_attn.k_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.45.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.45.self_attn.q_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.45.self_attn.q_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.45.self_attn.v_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.45.self_attn.v_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.46.input_layernorm.weight": "model-00002-of-00006.safetensors",
"model.layers.46.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.46.mlp.gate_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.46.mlp.up_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.46.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
"model.layers.46.self_attn.k_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.46.self_attn.k_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.46.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.46.self_attn.q_proj.bias": "model-00005-of-00006.safetensors",
"model.layers.46.self_attn.q_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.46.self_attn.v_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.46.self_attn.v_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.47.input_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.47.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.47.mlp.gate_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.47.mlp.up_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.47.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.47.self_attn.k_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.47.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.47.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.47.self_attn.q_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.47.self_attn.q_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.47.self_attn.v_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.47.self_attn.v_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.5.input_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
"model.layers.5.self_attn.k_proj.bias": "model-00001-of-00006.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.5.self_attn.q_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.5.self_attn.v_proj.bias": "model-00004-of-00006.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.6.input_layernorm.weight": "model-00001-of-00006.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
"model.layers.6.self_attn.k_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.6.self_attn.q_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.6.self_attn.v_proj.bias": "model-00005-of-00006.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.7.input_layernorm.weight": "model-00003-of-00006.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
"model.layers.7.self_attn.k_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.7.self_attn.q_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.7.self_attn.v_proj.bias": "model-00001-of-00006.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.8.input_layernorm.weight": "model-00005-of-00006.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
"model.layers.8.self_attn.k_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.8.self_attn.q_proj.bias": "model-00001-of-00006.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.8.self_attn.v_proj.bias": "model-00003-of-00006.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00002-of-00006.safetensors",
"model.layers.9.input_layernorm.weight": "model-00001-of-00006.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00003-of-00006.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
"model.layers.9.self_attn.k_proj.bias": "model-00002-of-00006.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00001-of-00006.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
"model.layers.9.self_attn.q_proj.bias": "model-00006-of-00006.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00006-of-00006.safetensors",
"model.layers.9.self_attn.v_proj.bias": "model-00001-of-00006.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00006-of-00006.safetensors",
"model.norm.weight": "model-00005-of-00006.safetensors"
}
}

31
special_tokens_map.json Normal file

@@ -0,0 +1,31 @@
{
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"eos_token": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

3
tokenizer.json Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bb27d51a5fa5caa8502d091726ff7f63ada64f766ff94afe49fde7d3faba216f
size 11421996

212
tokenizer_config.json Normal file

@@ -0,0 +1,212 @@
{
"add_bos_token": false,
"add_prefix_space": false,
"added_tokens_decoder": {
"151643": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151644": {
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151645": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151646": {
"content": "<|object_ref_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151647": {
"content": "<|object_ref_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151648": {
"content": "<|box_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151649": {
"content": "<|box_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151650": {
"content": "<|quad_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151651": {
"content": "<|quad_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151652": {
"content": "<|vision_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151653": {
"content": "<|vision_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151654": {
"content": "<|vision_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151655": {
"content": "<|image_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151656": {
"content": "<|video_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151657": {
"content": "<tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151658": {
"content": "</tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151659": {
"content": "<|fim_prefix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151660": {
"content": "<|fim_middle|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151661": {
"content": "<|fim_suffix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151662": {
"content": "<|fim_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151663": {
"content": "<|repo_name|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151664": {
"content": "<|file_sep|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
}
},
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"bos_token": null,
"chat_template": "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0]['role'] == 'system' %}\n {{- messages[0]['content'] }}\n {%- else %}\n {{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}\n {%- endif %}\n {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0]['role'] == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n {%- else %}\n {{- '<|im_start|>system\\nYou are Qwen, created by Alibaba Cloud. 
You are a helpful assistant.<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {{- '<|im_start|>' + message.role }}\n {%- if message.content %}\n {{- '\\n' + message.content }}\n {%- endif %}\n {%- for tool_call in message.tool_calls %}\n {%- if tool_call.function is defined %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '\\n<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {{- tool_call.arguments | tojson }}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- message.content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n",
"clean_up_tokenization_spaces": false,
"eos_token": "<|im_end|>",
"errors": "replace",
"extra_special_tokens": {},
"max_length": 12288,
"model_max_length": 131072,
"pad_token": "<|endoftext|>",
"split_special_tokens": false,
"stride": 0,
"tokenizer_class": "Qwen2Tokenizer",
"truncation_side": "right",
"truncation_strategy": "longest_first",
"unk_token": null
}

1
vocab.json Normal file

File diff suppressed because one or more lines are too long