Initialize project; model provided by the ModelHub XC community

Model: Jarvis1111/DoctorAgent-RL-SFT-1k-Thinking Source: Original Platform
2026-05-18 18:22:27 +08:00
commit 6e3d00f79a
18 changed files with 152229 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,36 @@
 *.7z filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.bz2 filter=lfs diff=lfs merge=lfs -text
 *.ckpt filter=lfs diff=lfs merge=lfs -text
 *.ftz filter=lfs diff=lfs merge=lfs -text
 *.gz filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.joblib filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.model filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 *.npy filter=lfs diff=lfs merge=lfs -text
 *.npz filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.parquet filter=lfs diff=lfs merge=lfs -text
 *.pb filter=lfs diff=lfs merge=lfs -text
 *.pickle filter=lfs diff=lfs merge=lfs -text
 *.pkl filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
 *.rar filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.xz filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,128 @@
 ---
 base_model:
 - Qwen/Qwen2.5-7B-Instruct
 language:
 - en
 license: apache-2.0
 pipeline_tag: text-generation
 tags:
 - medical
 library_name: transformers
 paper: "2505.19630"
 ---
 # DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue
 [![arXiv](https://img.shields.io/badge/arXiv-2505.19630-b31b1b.svg)](https://huggingface.co/papers/2505.19630) [![GitHub](https://img.shields.io/badge/GitHub-Code-blue.svg?logo=github)](https://github.com/JarvisUSTC/DoctorAgent-RL) [![Hugging Face Collection](https://img.shields.io/badge/Hugging%20Face%20Collection-doctoragent--rl-blue)](https://huggingface.co/collections/Jarvis1111/doctoragent-rl-684ffbcade52305ba0e3e97f)
 <div align="center">
  <img width="1231" alt="DoctorAgent-RL Overview" src="https://github.com/user-attachments/assets/bd9f676e-01f9-406c-881d-c2b9f45e62f3" />
 </div>
 DoctorAgent-RL is a novel reinforcement learning (RL)-based multi-agent collaborative framework that models medical consultations as a dynamic decision-making process under uncertainty. It addresses core challenges faced by LLMs in real-world clinical consultations, such as vague diagnoses from single-round systems and the inflexibility of traditional multi-turn dialogue models constrained by static supervised learning.
 In DoctorAgent-RL, a doctor agent continuously optimizes its questioning strategy within an RL framework through multi-turn interactions with a patient agent. This dynamic adjustment of information-gathering paths is guided by comprehensive rewards from a Consultation Evaluator. This RL fine-tuning mechanism enables LLMs to autonomously develop interaction strategies aligned with clinical reasoning logic, moving beyond superficial imitation of patterns in existing dialogue data. The work also introduces MTMedDialog, the first English multi-turn medical consultation dataset capable of simulating patient interactions.
 Experiments show that DoctorAgent-RL outperforms existing models in both multi-turn reasoning and final diagnostic performance, suggesting practical value for reducing misdiagnosis risk and making better use of medical resources.
 ## Key Features
 *   **Multi-Agent Collaboration**: Features distinct Doctor and Patient agents with specific roles and objectives.
 *   **Dynamic Strategy Optimization**: Leverages reinforcement learning for continuous policy updates and adaptive dialogue behavior.
 *   **Comprehensive Reward Design**: Guides optimal strategies through multi-dimensional consultation evaluation metrics.
 *   **Medical Knowledge Integration**: Embeds clinical reasoning logic directly into decision-making processes.
 *   **MTMedDialog Dataset**: Introduces the first English multi-turn medical consultation dataset capable of simulating patient interactions.
 ## Methodology
 <div align="center">
  <img src="https://github.com/JarvisUSTC/DoctorAgent-RL/blob/main/Figures/framework.png?raw=true" alt="System Architecture" width="600">
 </div>
 The DoctorAgent-RL framework comprises three core interacting components: a **Doctor Agent** for diagnostic reasoning and question formulation, a **Patient Agent** simulating patient responses, and a **Consultation Evaluator** providing multi-dimensional reward signals to assess consultation quality. Together they form a continuous learning loop that refines the doctor's interaction strategy through repeated consultations and policy updates.
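The interaction loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's training code: `run_consultation`, the three stub agents, the `"Diagnosis:"` prefix used to end an episode, and the reward formula are all hypothetical stand-ins. The policy-update step that would consume the reward is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Message = Dict[str, str]
Agent = Callable[[List[Message]], str]


@dataclass
class Episode:
    messages: List[Message] = field(default_factory=list)
    reward: float = 0.0


def run_consultation(doctor: Agent, patient: Agent,
                     evaluator: Callable[[List[Message]], float],
                     opening: str, max_turns: int = 5) -> Episode:
    """Alternate doctor and patient turns, then score the whole dialogue."""
    episode = Episode(messages=[{"role": "user", "content": opening}])
    for _ in range(max_turns):
        doctor_msg = doctor(episode.messages)
        episode.messages.append({"role": "assistant", "content": doctor_msg})
        # Hypothetical stop rule: the doctor ends the episode with a diagnosis.
        if doctor_msg.startswith("Diagnosis:"):
            break
        reply = patient(episode.messages)
        episode.messages.append({"role": "user", "content": reply})
    # The evaluator sees the full transcript and returns a scalar reward.
    episode.reward = evaluator(episode.messages)
    return episode


# Stub agents (assumptions for illustration only).
def stub_doctor(messages: List[Message]) -> str:
    patient_turns = sum(1 for m in messages if m["role"] == "user")
    if patient_turns < 2:
        return "Do you also have a fever?"
    return "Diagnosis: upper respiratory tract infection"


def stub_patient(messages: List[Message]) -> str:
    return "Yes, a mild one since yesterday."


def stub_evaluator(messages: List[Message]) -> float:
    doctor_msgs = [m["content"] for m in messages if m["role"] == "assistant"]
    reached = bool(doctor_msgs) and doctor_msgs[-1].startswith("Diagnosis:")
    questions = sum(1 for c in doctor_msgs if not c.startswith("Diagnosis:"))
    # Toy reward: credit for reaching a diagnosis, small cost per question.
    return (1.0 if reached else 0.0) - 0.1 * questions


episode = run_consultation(stub_doctor, stub_patient, stub_evaluator,
                           "I have a persistent cough and a sore throat.")
print(len(episode.messages), episode.reward)
```

In an RL setup, the reward attached to each finished episode would be what drives the update of the doctor policy; the patient and evaluator stay fixed while the doctor learns.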
 ## How to Use
 This model is built on the `Qwen/Qwen2.5-7B-Instruct` base model and is designed to be compatible with the Hugging Face `transformers` library.
 To use the DoctorAgent-RL model for multi-turn clinical dialogue, you can load it as follows:
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
 # Load the model and tokenizer
 model_name = "Jarvis1111/DoctorAgent-RL-SFT-1k-Thinking"  # Repository ID of this checkpoint
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16, # Use appropriate dtype (e.g., torch.float16 or torch.float32)
    device_map="auto" # Automatically maps the model to available devices (e.g., GPU)
 )
 # Function to generate response based on conversation history
 def get_doctor_response(conversation_history):
    # Apply the chat template to format the conversation
    text = tokenizer.apply_chat_template(
        conversation_history,
        tokenize=False,
        add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    # Generate the response
    generated_ids = model.generate(
        **inputs,
        max_new_tokens=512, # Maximum length of the generated response
        do_sample=True,
        temperature=0.7,    # Sampling temperature (higher = more random output)
        top_k=20,           # Considers top-k most likely next tokens
        top_p=0.8,          # Filters tokens by cumulative probability
        pad_token_id=tokenizer.pad_token_id, # Use tokenizer's pad token id (151643 for <|endoftext|>)
        eos_token_id=[tokenizer.eos_token_id, tokenizer.pad_token_id] # Both <|im_end|> (151645) and <|endoftext|> (151643)
    )
    # Decode the generated tokens
    # Remove the input tokens to get only the new response
    generated_ids = generated_ids[0, inputs.input_ids.shape[1]:]
    response = tokenizer.decode(generated_ids, skip_special_tokens=True)
    return response
 # Example multi-turn clinical dialogue
 conversation = []
 # Turn 1: Patient describes symptoms
 patient_input_1 = "I have a persistent cough and a sore throat. It started about three days ago."
 conversation.append({"role": "user", "content": patient_input_1})
 print(f"Patient: {patient_input_1}")
 doctor_response_1 = get_doctor_response(conversation)
 conversation.append({"role": "assistant", "content": doctor_response_1})
 print(f"Doctor: {doctor_response_1}")
 # Turn 2: Patient responds to doctor's follow-up
 patient_input_2 = "Yes, I also feel quite fatigued and have a mild headache, especially behind my eyes."
 conversation.append({"role": "user", "content": patient_input_2})
 print(f"Patient: {patient_input_2}")
 doctor_response_2 = get_doctor_response(conversation)
 conversation.append({"role": "assistant", "content": doctor_response_2})
 print(f"Doctor: {doctor_response_2}")
 # Continue the conversation as needed to reach a diagnosis or provide advice.
 ```
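To make the effect of the `temperature`, `top_k`, and `top_p` arguments used above more concrete, here is a simplified pure-Python sketch of how the three filters narrow a next-token distribution. It is an illustration only, not the `transformers` implementation; the function name `filter_next_token_probs` and the toy logits are assumptions.

```python
import math
from typing import Dict, List


def filter_next_token_probs(logits: List[float], temperature: float = 0.7,
                            top_k: int = 20, top_p: float = 0.8) -> Dict[int, float]:
    """Return {token_index: probability} after temperature, top-k, then top-p."""
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    # Top-k: keep only the k highest-scoring token indices.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the survivors (subtract the max for numerical stability).
    peak = scaled[ranked[0]]
    weights = [math.exp(scaled[i] - peak) for i in ranked]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept = []
    cumulative = 0.0
    for index, p in zip(ranked, probs):
        kept.append((index, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalise so the kept probabilities sum to 1 before sampling.
    norm = sum(p for _, p in kept)
    return {index: p / norm for index, p in kept}


# Toy example with four candidate tokens.
candidates = filter_next_token_probs([2.0, 1.0, 0.0, -1.0],
                                     temperature=1.0, top_k=3, top_p=0.8)
print(sorted(candidates))
```

With the toy logits, top-k drops the lowest-scoring token and top-p then drops the third, so sampling would choose between the two most likely tokens only.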
 For more detailed setup instructions, training scripts, and experimentation, please refer to the [official GitHub repository](https://github.com/JarvisUSTC/DoctorAgent-RL).
 ## Citation
 If DoctorAgent-RL contributes to your research, please consider citing our work:
 ```bibtex
 @article{feng2025doctoragent,
  title={DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue},
  author={Feng, Yichun and Wang, Jiawei and Zhou, Lu and Li, Yixue},
  journal={arXiv preprint arXiv:2505.19630},
  year={2025}
 }
 ```
--- a/added_tokens.json
+++ b/added_tokens.json
@@ -0,0 +1,24 @@
 {
  "</tool_call>": 151658,
  "<tool_call>": 151657,
  "<|box_end|>": 151649,
  "<|box_start|>": 151648,
  "<|endoftext|>": 151643,
  "<|file_sep|>": 151664,
  "<|fim_middle|>": 151660,
  "<|fim_pad|>": 151662,
  "<|fim_prefix|>": 151659,
  "<|fim_suffix|>": 151661,
  "<|im_end|>": 151645,
  "<|im_start|>": 151644,
  "<|image_pad|>": 151655,
  "<|object_ref_end|>": 151647,
  "<|object_ref_start|>": 151646,
  "<|quad_end|>": 151651,
  "<|quad_start|>": 151650,
  "<|repo_name|>": 151663,
  "<|video_pad|>": 151656,
  "<|vision_end|>": 151653,
  "<|vision_pad|>": 151654,
  "<|vision_start|>": 151652
 }
--- a/config.json
+++ b/config.json
@@ -0,0 +1,29 @@
 {
  "_name_or_path": "Qwen2.5-7B-Instruct",
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 32768,
  "max_window_layers": 28,
  "model_type": "qwen2",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 1000000.0,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "torch_dtype": "float32",
  "transformers_version": "4.47.1",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 152064
 }
--- a/generation_config.json
+++ b/generation_config.json
@@ -0,0 +1,14 @@
 {
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.7,
  "top_k": 20,
  "top_p": 0.8,
  "transformers_version": "4.47.1"
 }
--- a/merges.txt
+++ b/merges.txt
--- a/model-00001-of-00007.safetensors
+++ b/model-00001-of-00007.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:3d6d58753585d276a20482f87e4bcb7ddad76a813ccfc3747cc30ea44e8c8cdf
 size 4976687216
--- a/model-00002-of-00007.safetensors
+++ b/model-00002-of-00007.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:4eee6e9c01e709319d8a50a003498cc819840b5d10ffbfff33568ab481640972
 size 4778622352
--- a/model-00003-of-00007.safetensors
+++ b/model-00003-of-00007.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:4bd2d25670829aee6ee61639756e79e2dd2c71fdcf85c00fcbe5c47ac6adaea1
 size 4932743960
--- a/model-00004-of-00007.safetensors
+++ b/model-00004-of-00007.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:34095cea956fc0893027c6589493944eb1863776364fec3ebf1d1df372ad84a4
 size 4932743992
--- a/model-00005-of-00007.safetensors
+++ b/model-00005-of-00007.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:78c20193c0bf1ca01b45f46d7d56a514903020170155c6bdcb67efaca7bcf7f8
 size 4998852296
--- a/model-00006-of-00007.safetensors
+++ b/model-00006-of-00007.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:a23bb3534e66c70751bd53dbe01220070aa7ea14b19e3ab61b71c12142018271
 size 3662865184
--- a/model-00007-of-00007.safetensors
+++ b/model-00007-of-00007.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:154114701324787157d446b6f495a7760000ee7f2816e708d75284ad6b745d12
 size 2179989632
--- a/model.safetensors.index.json
+++ b/model.safetensors.index.json
@@ -0,0 +1,346 @@
 {
  "metadata": {
    "total_size": 30462466048
  },
  "weight_map": {
    "lm_head.weight": "model-00007-of-00007.safetensors",
    "model.embed_tokens.weight": "model-00001-of-00007.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00007.safetensors",
    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.0.mlp.up_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00007.safetensors",
    "model.layers.0.self_attn.k_proj.bias": "model-00001-of-00007.safetensors",
    "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.0.self_attn.q_proj.bias": "model-00001-of-00007.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.0.self_attn.v_proj.bias": "model-00001-of-00007.safetensors",
    "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.1.input_layernorm.weight": "model-00001-of-00007.safetensors",
    "model.layers.1.mlp.down_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.1.mlp.up_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00007.safetensors",
    "model.layers.1.self_attn.k_proj.bias": "model-00001-of-00007.safetensors",
    "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.1.self_attn.q_proj.bias": "model-00001-of-00007.safetensors",
    "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.1.self_attn.v_proj.bias": "model-00001-of-00007.safetensors",
    "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.10.input_layernorm.weight": "model-00003-of-00007.safetensors",
    "model.layers.10.mlp.down_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.10.mlp.gate_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.10.mlp.up_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.10.post_attention_layernorm.weight": "model-00003-of-00007.safetensors",
    "model.layers.10.self_attn.k_proj.bias": "model-00003-of-00007.safetensors",
    "model.layers.10.self_attn.k_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.10.self_attn.o_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.10.self_attn.q_proj.bias": "model-00003-of-00007.safetensors",
    "model.layers.10.self_attn.q_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.10.self_attn.v_proj.bias": "model-00003-of-00007.safetensors",
    "model.layers.10.self_attn.v_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.11.input_layernorm.weight": "model-00003-of-00007.safetensors",
    "model.layers.11.mlp.down_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.11.mlp.gate_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.11.mlp.up_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.11.post_attention_layernorm.weight": "model-00003-of-00007.safetensors",
    "model.layers.11.self_attn.k_proj.bias": "model-00003-of-00007.safetensors",
    "model.layers.11.self_attn.k_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.11.self_attn.o_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.11.self_attn.q_proj.bias": "model-00003-of-00007.safetensors",
    "model.layers.11.self_attn.q_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.11.self_attn.v_proj.bias": "model-00003-of-00007.safetensors",
    "model.layers.11.self_attn.v_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.12.input_layernorm.weight": "model-00003-of-00007.safetensors",
    "model.layers.12.mlp.down_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.12.mlp.gate_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.12.mlp.up_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.12.post_attention_layernorm.weight": "model-00003-of-00007.safetensors",
    "model.layers.12.self_attn.k_proj.bias": "model-00003-of-00007.safetensors",
    "model.layers.12.self_attn.k_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.12.self_attn.o_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.12.self_attn.q_proj.bias": "model-00003-of-00007.safetensors",
    "model.layers.12.self_attn.q_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.12.self_attn.v_proj.bias": "model-00003-of-00007.safetensors",
    "model.layers.12.self_attn.v_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.13.input_layernorm.weight": "model-00004-of-00007.safetensors",
    "model.layers.13.mlp.down_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.13.mlp.gate_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.13.mlp.up_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.13.post_attention_layernorm.weight": "model-00004-of-00007.safetensors",
    "model.layers.13.self_attn.k_proj.bias": "model-00003-of-00007.safetensors",
    "model.layers.13.self_attn.k_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.13.self_attn.o_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.13.self_attn.q_proj.bias": "model-00003-of-00007.safetensors",
    "model.layers.13.self_attn.q_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.13.self_attn.v_proj.bias": "model-00003-of-00007.safetensors",
    "model.layers.13.self_attn.v_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.14.input_layernorm.weight": "model-00004-of-00007.safetensors",
    "model.layers.14.mlp.down_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.14.mlp.gate_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.14.mlp.up_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.14.post_attention_layernorm.weight": "model-00004-of-00007.safetensors",
    "model.layers.14.self_attn.k_proj.bias": "model-00004-of-00007.safetensors",
    "model.layers.14.self_attn.k_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.14.self_attn.o_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.14.self_attn.q_proj.bias": "model-00004-of-00007.safetensors",
    "model.layers.14.self_attn.q_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.14.self_attn.v_proj.bias": "model-00004-of-00007.safetensors",
    "model.layers.14.self_attn.v_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.15.input_layernorm.weight": "model-00004-of-00007.safetensors",
    "model.layers.15.mlp.down_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.15.mlp.gate_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.15.mlp.up_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.15.post_attention_layernorm.weight": "model-00004-of-00007.safetensors",
    "model.layers.15.self_attn.k_proj.bias": "model-00004-of-00007.safetensors",
    "model.layers.15.self_attn.k_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.15.self_attn.o_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.15.self_attn.q_proj.bias": "model-00004-of-00007.safetensors",
    "model.layers.15.self_attn.q_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.15.self_attn.v_proj.bias": "model-00004-of-00007.safetensors",
    "model.layers.15.self_attn.v_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.16.input_layernorm.weight": "model-00004-of-00007.safetensors",
    "model.layers.16.mlp.down_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.16.mlp.gate_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.16.mlp.up_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.16.post_attention_layernorm.weight": "model-00004-of-00007.safetensors",
    "model.layers.16.self_attn.k_proj.bias": "model-00004-of-00007.safetensors",
    "model.layers.16.self_attn.k_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.16.self_attn.o_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.16.self_attn.q_proj.bias": "model-00004-of-00007.safetensors",
    "model.layers.16.self_attn.q_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.16.self_attn.v_proj.bias": "model-00004-of-00007.safetensors",
    "model.layers.16.self_attn.v_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.17.input_layernorm.weight": "model-00004-of-00007.safetensors",
    "model.layers.17.mlp.down_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.17.mlp.gate_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.17.mlp.up_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.17.post_attention_layernorm.weight": "model-00004-of-00007.safetensors",
    "model.layers.17.self_attn.k_proj.bias": "model-00004-of-00007.safetensors",
    "model.layers.17.self_attn.k_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.17.self_attn.o_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.17.self_attn.q_proj.bias": "model-00004-of-00007.safetensors",
    "model.layers.17.self_attn.q_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.17.self_attn.v_proj.bias": "model-00004-of-00007.safetensors",
    "model.layers.17.self_attn.v_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.18.input_layernorm.weight": "model-00005-of-00007.safetensors",
    "model.layers.18.mlp.down_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.18.mlp.gate_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.18.mlp.up_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.18.post_attention_layernorm.weight": "model-00005-of-00007.safetensors",
    "model.layers.18.self_attn.k_proj.bias": "model-00004-of-00007.safetensors",
    "model.layers.18.self_attn.k_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.18.self_attn.o_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.18.self_attn.q_proj.bias": "model-00004-of-00007.safetensors",
    "model.layers.18.self_attn.q_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.18.self_attn.v_proj.bias": "model-00004-of-00007.safetensors",
    "model.layers.18.self_attn.v_proj.weight": "model-00004-of-00007.safetensors",
    "model.layers.19.input_layernorm.weight": "model-00005-of-00007.safetensors",
    "model.layers.19.mlp.down_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.19.mlp.gate_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.19.mlp.up_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.19.post_attention_layernorm.weight": "model-00005-of-00007.safetensors",
    "model.layers.19.self_attn.k_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.19.self_attn.k_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.19.self_attn.o_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.19.self_attn.q_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.19.self_attn.q_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.19.self_attn.v_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.19.self_attn.v_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.2.input_layernorm.weight": "model-00001-of-00007.safetensors",
    "model.layers.2.mlp.down_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.2.mlp.up_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00007.safetensors",
    "model.layers.2.self_attn.k_proj.bias": "model-00001-of-00007.safetensors",
    "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.2.self_attn.q_proj.bias": "model-00001-of-00007.safetensors",
    "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.2.self_attn.v_proj.bias": "model-00001-of-00007.safetensors",
    "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00007.safetensors",
    "model.layers.20.input_layernorm.weight": "model-00005-of-00007.safetensors",
    "model.layers.20.mlp.down_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.20.mlp.gate_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.20.mlp.up_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.20.post_attention_layernorm.weight": "model-00005-of-00007.safetensors",
    "model.layers.20.self_attn.k_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.20.self_attn.k_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.20.self_attn.o_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.20.self_attn.q_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.20.self_attn.q_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.20.self_attn.v_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.20.self_attn.v_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.21.input_layernorm.weight": "model-00005-of-00007.safetensors",
    "model.layers.21.mlp.down_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.21.mlp.gate_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.21.mlp.up_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.21.post_attention_layernorm.weight": "model-00005-of-00007.safetensors",
    "model.layers.21.self_attn.k_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.21.self_attn.k_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.21.self_attn.o_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.21.self_attn.q_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.21.self_attn.q_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.21.self_attn.v_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.21.self_attn.v_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.22.input_layernorm.weight": "model-00005-of-00007.safetensors",
    "model.layers.22.mlp.down_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.22.mlp.gate_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.22.mlp.up_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.22.post_attention_layernorm.weight": "model-00005-of-00007.safetensors",
    "model.layers.22.self_attn.k_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.22.self_attn.k_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.22.self_attn.o_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.22.self_attn.q_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.22.self_attn.q_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.22.self_attn.v_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.22.self_attn.v_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.23.input_layernorm.weight": "model-00005-of-00007.safetensors",
    "model.layers.23.mlp.down_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.23.mlp.gate_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.23.mlp.up_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.23.post_attention_layernorm.weight": "model-00005-of-00007.safetensors",
    "model.layers.23.self_attn.k_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.23.self_attn.k_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.23.self_attn.o_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.23.self_attn.q_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.23.self_attn.q_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.23.self_attn.v_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.23.self_attn.v_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.24.input_layernorm.weight": "model-00006-of-00007.safetensors",
    "model.layers.24.mlp.down_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.24.mlp.gate_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.24.mlp.up_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.24.post_attention_layernorm.weight": "model-00006-of-00007.safetensors",
    "model.layers.24.self_attn.k_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.24.self_attn.k_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.24.self_attn.o_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.24.self_attn.q_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.24.self_attn.q_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.24.self_attn.v_proj.bias": "model-00005-of-00007.safetensors",
    "model.layers.24.self_attn.v_proj.weight": "model-00005-of-00007.safetensors",
    "model.layers.25.input_layernorm.weight": "model-00006-of-00007.safetensors",
    "model.layers.25.mlp.down_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.25.mlp.gate_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.25.mlp.up_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.25.post_attention_layernorm.weight": "model-00006-of-00007.safetensors",
    "model.layers.25.self_attn.k_proj.bias": "model-00006-of-00007.safetensors",
    "model.layers.25.self_attn.k_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.25.self_attn.o_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.25.self_attn.q_proj.bias": "model-00006-of-00007.safetensors",
    "model.layers.25.self_attn.q_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.25.self_attn.v_proj.bias": "model-00006-of-00007.safetensors",
    "model.layers.25.self_attn.v_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.26.input_layernorm.weight": "model-00006-of-00007.safetensors",
    "model.layers.26.mlp.down_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.26.mlp.gate_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.26.mlp.up_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.26.post_attention_layernorm.weight": "model-00006-of-00007.safetensors",
    "model.layers.26.self_attn.k_proj.bias": "model-00006-of-00007.safetensors",
    "model.layers.26.self_attn.k_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.26.self_attn.o_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.26.self_attn.q_proj.bias": "model-00006-of-00007.safetensors",
    "model.layers.26.self_attn.q_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.26.self_attn.v_proj.bias": "model-00006-of-00007.safetensors",
    "model.layers.26.self_attn.v_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.27.input_layernorm.weight": "model-00006-of-00007.safetensors",
    "model.layers.27.mlp.down_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.27.mlp.gate_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.27.mlp.up_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.27.post_attention_layernorm.weight": "model-00006-of-00007.safetensors",
    "model.layers.27.self_attn.k_proj.bias": "model-00006-of-00007.safetensors",
    "model.layers.27.self_attn.k_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.27.self_attn.o_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.27.self_attn.q_proj.bias": "model-00006-of-00007.safetensors",
    "model.layers.27.self_attn.q_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.27.self_attn.v_proj.bias": "model-00006-of-00007.safetensors",
    "model.layers.27.self_attn.v_proj.weight": "model-00006-of-00007.safetensors",
    "model.layers.3.input_layernorm.weight": "model-00002-of-00007.safetensors",
    "model.layers.3.mlp.down_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.3.mlp.gate_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.3.mlp.up_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.3.post_attention_layernorm.weight": "model-00002-of-00007.safetensors",
    "model.layers.3.self_attn.k_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.3.self_attn.k_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.3.self_attn.o_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.3.self_attn.q_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.3.self_attn.q_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.3.self_attn.v_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.3.self_attn.v_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.4.input_layernorm.weight": "model-00002-of-00007.safetensors",
    "model.layers.4.mlp.down_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.4.mlp.gate_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.4.mlp.up_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.4.post_attention_layernorm.weight": "model-00002-of-00007.safetensors",
    "model.layers.4.self_attn.k_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.4.self_attn.k_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.4.self_attn.o_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.4.self_attn.q_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.4.self_attn.q_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.4.self_attn.v_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.4.self_attn.v_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.5.input_layernorm.weight": "model-00002-of-00007.safetensors",
    "model.layers.5.mlp.down_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.5.mlp.gate_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.5.mlp.up_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.5.post_attention_layernorm.weight": "model-00002-of-00007.safetensors",
    "model.layers.5.self_attn.k_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.5.self_attn.k_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.5.self_attn.o_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.5.self_attn.q_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.5.self_attn.q_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.5.self_attn.v_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.5.self_attn.v_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.6.input_layernorm.weight": "model-00002-of-00007.safetensors",
    "model.layers.6.mlp.down_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.6.mlp.gate_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.6.mlp.up_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.6.post_attention_layernorm.weight": "model-00002-of-00007.safetensors",
    "model.layers.6.self_attn.k_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.6.self_attn.k_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.6.self_attn.o_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.6.self_attn.q_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.6.self_attn.q_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.6.self_attn.v_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.6.self_attn.v_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.7.input_layernorm.weight": "model-00002-of-00007.safetensors",
    "model.layers.7.mlp.down_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.7.mlp.gate_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.7.mlp.up_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.7.post_attention_layernorm.weight": "model-00002-of-00007.safetensors",
    "model.layers.7.self_attn.k_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.7.self_attn.k_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.7.self_attn.o_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.7.self_attn.q_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.7.self_attn.q_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.7.self_attn.v_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.7.self_attn.v_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.8.input_layernorm.weight": "model-00003-of-00007.safetensors",
    "model.layers.8.mlp.down_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.8.mlp.gate_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.8.mlp.up_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.8.post_attention_layernorm.weight": "model-00003-of-00007.safetensors",
    "model.layers.8.self_attn.k_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.8.self_attn.k_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.8.self_attn.o_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.8.self_attn.q_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.8.self_attn.q_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.8.self_attn.v_proj.bias": "model-00002-of-00007.safetensors",
    "model.layers.8.self_attn.v_proj.weight": "model-00002-of-00007.safetensors",
    "model.layers.9.input_layernorm.weight": "model-00003-of-00007.safetensors",
    "model.layers.9.mlp.down_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.9.mlp.gate_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.9.mlp.up_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.9.post_attention_layernorm.weight": "model-00003-of-00007.safetensors",
    "model.layers.9.self_attn.k_proj.bias": "model-00003-of-00007.safetensors",
    "model.layers.9.self_attn.k_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.9.self_attn.o_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.9.self_attn.q_proj.bias": "model-00003-of-00007.safetensors",
    "model.layers.9.self_attn.q_proj.weight": "model-00003-of-00007.safetensors",
    "model.layers.9.self_attn.v_proj.bias": "model-00003-of-00007.safetensors",
    "model.layers.9.self_attn.v_proj.weight": "model-00003-of-00007.safetensors",
    "model.norm.weight": "model-00006-of-00007.safetensors"
  }
 }
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1,31 @@
 {
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>",
    "<|object_ref_start|>",
    "<|object_ref_end|>",
    "<|box_start|>",
    "<|box_end|>",
    "<|quad_start|>",
    "<|quad_end|>",
    "<|vision_start|>",
    "<|vision_end|>",
    "<|vision_pad|>",
    "<|image_pad|>",
    "<|video_pad|>"
  ],
  "eos_token": {
    "content": "<|im_end|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
 }
--- a/tokenizer.json
+++ b/tokenizer.json
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:9c5ae00e602b8860cbd784ba82a8aa14e8feecec692e7076590d014d7b7fdafa
 size 11421896
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1,208 @@
 {
  "add_bos_token": false,
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "151643": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151644": {
      "content": "<|im_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151645": {
      "content": "<|im_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151646": {
      "content": "<|object_ref_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151647": {
      "content": "<|object_ref_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151648": {
      "content": "<|box_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151649": {
      "content": "<|box_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151650": {
      "content": "<|quad_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151651": {
      "content": "<|quad_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151652": {
      "content": "<|vision_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151653": {
      "content": "<|vision_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151654": {
      "content": "<|vision_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151655": {
      "content": "<|image_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151656": {
      "content": "<|video_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151657": {
      "content": "<tool_call>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151658": {
      "content": "</tool_call>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151659": {
      "content": "<|fim_prefix|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151660": {
      "content": "<|fim_middle|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151661": {
      "content": "<|fim_suffix|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151662": {
      "content": "<|fim_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151663": {
      "content": "<|repo_name|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151664": {
      "content": "<|file_sep|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    }
  },
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>",
    "<|object_ref_start|>",
    "<|object_ref_end|>",
    "<|box_start|>",
    "<|box_end|>",
    "<|quad_start|>",
    "<|quad_end|>",
    "<|vision_start|>",
    "<|vision_end|>",
    "<|vision_pad|>",
    "<|image_pad|>",
    "<|video_pad|>"
  ],
  "bos_token": null,
  "chat_template": "{%- if tools %}\n    {{- '<|im_start|>system\\n' }}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- messages[0]['content'] }}\n    {%- else %}\n        {{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}\n    {%- endif %}\n    {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n    {%- for tool in tools %}\n        {{- \"\\n\" }}\n        {{- tool | tojson }}\n    {%- endfor %}\n    {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n    {%- else %}\n        {{- '<|im_start|>system\\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\\n' }}\n    {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n    {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n        {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n    {%- elif message.role == \"assistant\" %}\n        {{- '<|im_start|>' + message.role }}\n        {%- if message.content %}\n            {{- '\\n' + message.content }}\n        {%- endif %}\n        {%- for tool_call in message.tool_calls %}\n            {%- if tool_call.function is defined %}\n                {%- set tool_call = tool_call.function %}\n            {%- endif %}\n            {{- '\\n<tool_call>\\n{\"name\": \"' }}\n            {{- tool_call.name }}\n            {{- '\", \"arguments\": ' }}\n            {{- tool_call.arguments | tojson }}\n            {{- '}\\n</tool_call>' }}\n        {%- endfor %}\n        {{- '<|im_end|>\\n' }}\n    {%- elif message.role == \"tool\" %}\n        {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n            {{- '<|im_start|>user' }}\n        {%- endif %}\n        {{- '\\n<tool_response>\\n' }}\n        {{- message.content }}\n        {{- '\\n</tool_response>' }}\n        {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n            {{- '<|im_end|>\\n' }}\n        {%- endif %}\n    {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n    {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|im_end|>",
  "errors": "replace",
  "extra_special_tokens": {},
  "model_max_length": 131072,
  "pad_token": "<|endoftext|>",
  "split_special_tokens": false,
  "tokenizer_class": "Qwen2Tokenizer",
  "unk_token": null
 }
--- a/vocab.json
+++ b/vocab.json