Initialize project; model provided by the ModelHub XC community

Model: abhinav0231/Lily-1.5b-v0.3 Source: Original Platform
2026-05-29 03:34:18 +08:00
commit 6fc1d43eab
12 changed files with 152189 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,36 @@
 *.7z filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.bz2 filter=lfs diff=lfs merge=lfs -text
 *.ckpt filter=lfs diff=lfs merge=lfs -text
 *.ftz filter=lfs diff=lfs merge=lfs -text
 *.gz filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.joblib filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.model filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 *.npy filter=lfs diff=lfs merge=lfs -text
 *.npz filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.parquet filter=lfs diff=lfs merge=lfs -text
 *.pb filter=lfs diff=lfs merge=lfs -text
 *.pickle filter=lfs diff=lfs merge=lfs -text
 *.pkl filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
 *.rar filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.xz filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,396 @@
 ---
 license: apache-2.0
 language:
  - en
 pipeline_tag: text-generation
 tags:
  - qwen2
  - causal-lm
  - instruction-tuned
  - distillation
  - sft
  - lora
  - qlora
  - unsloth
  - chatml
  - reasoning
 base_model: abhinav0231/Lily-1.5b-v0.1
 datasets:
  - abhinav0231/Sarvam-105b-Distill-100k
 ---
 # Lily-1.5b-v0.3
 Lily-1.5b-v0.3 is a distilled instruction-tuned language model built by continuing training from `abhinav0231/Lily-1.5b-v0.1` on the `abhinav0231/Sarvam-105b-Distill-100k` dataset using the `chatml` split/configuration.
 This version was trained as an offline supervised fine-tuning run focused on high-quality long-form assistant responses in ChatML format, with many examples following an explicit `<think>` and `<answer>` structure.
 The model was trained and merged in a single-GPU Modal workflow on an NVIDIA A100-SXM4-40GB system using BF16, QLoRA, and Unsloth.
 ---
 # Model summary
 This checkpoint starts from `abhinav0231/Lily-1.5b-v0.1` and applies a distillation-style supervised fine-tuning stage rather than training from scratch.
 The base architecture loaded during training is a Qwen2-style causal language model with:
 - 28 layers
 - hidden size 1536
 - 12 attention heads
 - 2 key-value heads
 - vocabulary size 151,936
 The training setup targets:
 - instruction following
 - structured response generation
 - distilled reasoning-flavored outputs
 rather than pure base-model continuation pretraining.
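The shape values above are enough for a rough back-of-the-envelope check of head size, grouped-query layout, and parameter count. The sketch below is an estimate, not output from the training run. It takes `intermediate_size = 8960` and tied embeddings from this repository's `config.json`, and it assumes the usual Qwen2 layout (bias terms on the q/k/v projections only, two RMSNorm weights per layer plus one final norm). The class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Qwen2Shape:
    # Values listed in the model summary and in config.json.
    num_layers: int = 28
    hidden_size: int = 1536
    num_heads: int = 12
    num_kv_heads: int = 2
    intermediate_size: int = 8960
    vocab_size: int = 151_936


def head_dim(shape: Qwen2Shape) -> int:
    """Width of one attention head."""
    return shape.hidden_size // shape.num_heads


def queries_per_kv_head(shape: Qwen2Shape) -> int:
    """How many query heads share each key-value head (grouped-query attention)."""
    return shape.num_heads // shape.num_kv_heads


def count_parameters(shape: Qwen2Shape) -> int:
    """Estimate the parameter count under the assumed Qwen2 layout."""
    d = shape.hidden_size
    kv_dim = shape.num_kv_heads * head_dim(shape)
    # Assumption: q/k/v projections carry a bias, o_proj does not.
    attention = (d * d + d) + 2 * (d * kv_dim + kv_dim) + d * d
    # gate_proj, up_proj and down_proj, all without bias.
    mlp = 3 * d * shape.intermediate_size
    # Two RMSNorm weight vectors per layer.
    norms = 2 * d
    per_layer = attention + mlp + norms
    # Tied embeddings: the output head reuses the input embedding matrix.
    embeddings = shape.vocab_size * d
    final_norm = d
    return embeddings + shape.num_layers * per_layer + final_norm


if __name__ == "__main__":
    shape = Qwen2Shape()
    print("head_dim:", head_dim(shape))
    print("query heads per KV head:", queries_per_kv_head(shape))
    print("estimated parameters:", count_parameters(shape))
```

At two bytes per BF16 weight, the estimate lands close to the size of the published `model.safetensors` file, which is a useful sanity check on the assumed layout.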
 ---
 # Training objective
 The goal of v0.3 was to improve the model through offline SFT distillation from a synthetic/teacher-style dataset while preserving the usability and compact size of the 1.5B-class base model.
 The dataset examples are preformatted as ChatML conversations and frequently instruct the assistant to reason in a `<think>` block before producing a final `<answer>` block.
 Because of that training distribution, the model may naturally produce more structured, tutor-like, stepwise outputs than the earlier checkpoint depending on the prompt style.
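Downstream code often wants the final answer without the reasoning block. The following is a minimal sketch of one way to split a generation into its two parts. It assumes the blocks are closed with `</think>` and `</answer>` tags, which this card does not state explicitly, and the helper names are illustrative.

```python
import re
from typing import NamedTuple, Optional


class ParsedReply(NamedTuple):
    thinking: Optional[str]
    answer: str


# Assumption: blocks are delimited as <think>...</think> and <answer>...</answer>.
_THINK = re.compile(r"<think>(.*?)</think>", re.DOTALL)
_ANSWER = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def split_reply(text: str) -> ParsedReply:
    """Separate the reasoning block from the final answer, if both are present."""
    think_match = _THINK.search(text)
    answer_match = _ANSWER.search(text)
    thinking = think_match.group(1).strip() if think_match else None
    if answer_match:
        answer = answer_match.group(1).strip()
    else:
        # Fallback: no <answer> block, so drop any reasoning and keep the rest.
        answer = _THINK.sub("", text).strip()
    return ParsedReply(thinking, answer)


if __name__ == "__main__":
    reply = split_reply("<think>2 + 2 is 4.</think><answer>4</answer>")
    print(reply.thinking)
    print(reply.answer)
```

The fallback branch matters in practice because the model does not always emit both blocks; unstructured replies pass through unchanged.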
 ---
 # Base model
 - **Base model:** `abhinav0231/Lily-1.5b-v0.1`
 - **Final merged model repo:** `abhinav0231/Lily-1.5b-v0.3`
 - **GGUF repo:** `abhinav0231/Lily-1.5b-v0.3-GGUF`
 ---
 # Benchmarks
 Evaluated with `lm-evaluation-harness`, v0.3 achieved the following results:
 ![image](https://cdn-uploads.huggingface.co/production/uploads/6617f58b9e7b3dbb1407d934/c6KDfKHIiEPmcwZi5qbBe.png)
 ---
 # Dataset
 The main training dataset is:
 `abhinav0231/Sarvam-105b-Distill-100k`
 using the `chatml` configuration, stored as a single `text` column of preformatted conversations.
 The final training notebook loaded:
 - 91,457 training examples
 - 1,908 validation examples
 A separate sanity-check pass over the dataset family showed a very similar distribution, including:
 - 92,040 training examples
 - 1,917 validation examples
 - 1,918 test examples
 confirming the same overall ChatML reasoning-style format.
 ---
 ## Dataset style
 The dataset uses ChatML with:
 - `<|im_start|>`
 - `<|im_end|>`
 delimiters and includes a chat template in the tokenizer setup.
 Many examples use a system prompt that explicitly asks the assistant to think through the problem in a `<think>` block and then give the final response in an `<answer>` block.
 This means the model was not trained on plain raw instruction-response text alone; it was trained on a formatted conversational distribution with strong structural priors.
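To make the format concrete, here is a minimal sketch of how a conversation maps onto ChatML text. It mirrors the plain (no tools) path of the bundled chat template but is not a replacement for `tokenizer.apply_chat_template`; in particular it does not inject a default system prompt. The function name is illustrative.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def to_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = False) -> str:
    """Render role/content messages as ChatML text."""
    parts = [
        f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n"
        for message in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    demo = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain overfitting in simple terms."},
    ]
    print(to_chatml(demo, add_generation_prompt=True))
```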
 ---
 ## Length characteristics
 A 5,000-sample sanity slice of the training set had:
 - mean length = 1640.72 tokens
 - p50 = 1219
 - p90 = 3221
 - p95 = 4096.15
 - p99 = 6883.35
 About:
 - 5.00% of sampled training examples
 - 4.33% of sampled validation examples
 exceeded 4096 tokens.
 These numbers matter because the training run used a 4096-token maximum sequence length, so the longest examples are subject to truncation or packing effects depending on preprocessing behavior.
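A length report of this kind can be reproduced from any list of per-example token counts. The sketch below uses linear interpolation between ranks, which would explain fractional percentiles such as 4096.15, but the exact method used for the numbers above is not stated, so treat that as an assumption. The function names and the sample data are illustrative.

```python
import math
from typing import Dict, Sequence


def percentile(sorted_values: Sequence[float], q: float) -> float:
    """Percentile with linear interpolation between the two closest ranks."""
    if not sorted_values:
        raise ValueError("need at least one value")
    position = (len(sorted_values) - 1) * q / 100.0
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def length_report(token_lengths: Sequence[int], max_len: int = 4096) -> Dict[str, float]:
    """Summarize token lengths and the share of examples longer than max_len."""
    ordered = sorted(token_lengths)
    over = sum(1 for n in ordered if n > max_len)
    return {
        "mean": sum(ordered) / len(ordered),
        "p50": percentile(ordered, 50),
        "p90": percentile(ordered, 90),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
        "frac_over_max": over / len(ordered),
    }


if __name__ == "__main__":
    # Made-up lengths for illustration only, not taken from the dataset.
    print(length_report([100, 200, 300, 400, 5000]))
```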
 ---
 # Training setup
 Training was run on a single NVIDIA A100-SXM4-40GB GPU in Modal, without:
 - DDP
 - `accelerate launch`
 - multi-process orchestration
 The environment used:
 - Unsloth 2026.5.2
 - TRL 0.22.2
 - PyTorch 2.8.0+cu129
 - CUDA 12.9
 - Triton 3.4.0
 - BF16 mixed precision
 Flash Attention 2 was auto-enabled by Unsloth because the A100 supports it.
 ---
 ## Core hyperparameters
 | Parameter | Value |
 |---|---|
 | Max sequence length | 4096 |
 | Num epochs | 2 |
 | Learning rate | 2e-5 |
 | Warmup steps | 100 |
 | Warmup ratio | 0.03 |
 | Batch size | 24 |
 | Gradient accumulation | 1 |
 | Effective batch size | 24 |
 | Seed | 42 |
 ---
 ## Optimization stack
 The model was loaded with QLoRA 4-bit weights during training, while the final merged checkpoint was saved in 16-bit merged form for deployment and inference use.
 The W&B config logged the optimizer as `adamw_8bit`, while the trainer config used fused AdamW (`adamw_torch_fused`) in the notebook training arguments.
 Sequence packing was enabled, dataset preprocessing used multiprocessing, and periodic evaluation/checkpoint saving was configured during the run.
 ---
 # LoRA / PEFT details
 The fine-tuning used:
 - LoRA rank = 32
 - LoRA alpha = 64
 Target modules:
 - `q_proj`
 - `k_proj`
 - `v_proj`
 - `o_proj`
 - `gate_proj`
 - `up_proj`
 - `down_proj`
 The run reported approximately:
 - 36.9M trainable parameters
 which corresponded to around 2.34%–4.0% of total parameters depending on counting conventions.
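The reported trainable-parameter figure can be cross-checked from the LoRA rank and the projection shapes. This is a sketch under assumptions: each adapted linear layer of shape `in x out` adds `rank * (in + out)` parameters, the key/value width is 2 heads x 128 = 256, `intermediate_size = 8960` comes from `config.json`, and the base parameter count used for the percentage is an estimate derived from those shapes rather than a number from the training log.

```python
from typing import Dict, Tuple


def lora_params_per_layer(rank: int, hidden: int, kv_dim: int, intermediate: int) -> int:
    """Count LoRA A and B parameters for the seven target modules in one layer."""
    shapes: Dict[str, Tuple[int, int]] = {
        "q_proj": (hidden, hidden),
        "k_proj": (hidden, kv_dim),
        "v_proj": (hidden, kv_dim),
        "o_proj": (hidden, hidden),
        "gate_proj": (hidden, intermediate),
        "up_proj": (hidden, intermediate),
        "down_proj": (intermediate, hidden),
    }
    # A is (rank x in) and B is (out x rank), so each module adds rank * (in + out).
    return sum(rank * (n_in + n_out) for n_in, n_out in shapes.values())


RANK, ALPHA, NUM_LAYERS = 32, 64, 28
# Estimate from config shapes under the usual Qwen2 layout (an assumption).
ESTIMATED_BASE_PARAMS = 1_543_714_304

trainable = NUM_LAYERS * lora_params_per_layer(RANK, hidden=1536, kv_dim=256, intermediate=8960)
scaling = ALPHA / RANK  # multiplier applied to the low-rank update
share = 100 * trainable / (ESTIMATED_BASE_PARAMS + trainable)

if __name__ == "__main__":
    print(f"trainable LoRA parameters: {trainable:,}")
    print(f"LoRA scaling (alpha / rank): {scaling}")
    print(f"share of base + adapter parameters: {share:.2f}%")
```

Counting the adapters against base plus adapter weights gives the lower end of the range quoted above. The higher figure plausibly comes from counting 4-bit packed weights, which shrinks the denominator, though the card does not confirm this.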
 ---
 # Hardware and runtime
 Training hardware:
 - NVIDIA A100-SXM4-40GB
 - ~42.4 GB VRAM exposed
 - Compute capability 8.0
 - BF16 support
 - Flash Attention 2 support
 The run specifically targeted A100-native BF16 and Flash Attention 2 optimizations.
 Total training runtime was approximately:
 - 5 hours 14 minutes
 ---
 # Checkpointing and merge
 Intermediate checkpoints were pushed to:
 `abhinav0231/Lily-1.5b-distill-v3-checkpoints`
 during training.
 The workflow included auto-resume logic from the latest Hugging Face checkpoint.
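The notebook's resume logic is not reproduced in this card. As a rough illustration of the idea, the sketch below picks the most recent checkpoint from a list of folder names, assuming the `checkpoint-<step>` naming that the Hugging Face `Trainer` uses by default. The function name and the sample listing are hypothetical.

```python
import re
from typing import Iterable, Optional

_CHECKPOINT = re.compile(r"^checkpoint-(\d+)$")


def latest_checkpoint(names: Iterable[str]) -> Optional[str]:
    """Return the checkpoint folder with the highest step, or None if there is none."""
    best_name: Optional[str] = None
    best_step = -1
    for name in names:
        match = _CHECKPOINT.match(name)
        if match is None:
            continue  # skip files such as README.md
        step = int(match.group(1))
        if step > best_step:
            best_name, best_step = name, step
    return best_name


if __name__ == "__main__":
    # Hypothetical repository listing.
    listing = ["README.md", "checkpoint-500", "checkpoint-2500", "checkpoint-1000"]
    print(latest_checkpoint(listing))
```

Steps are compared as integers because a plain string sort would rank `checkpoint-500` above `checkpoint-2500`.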
 After training, the LoRA adapter was merged back into the base model in BF16/16-bit form and pushed as:
 `abhinav0231/Lily-1.5b-v0.3`
 The notebook also included GGUF export paths for quantized deployment variants.
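Conceptually, merging a LoRA adapter folds the low-rank update into each adapted weight matrix: `W_merged = W + (alpha / rank) * B @ A`. The toy sketch below shows that arithmetic with plain Python lists. It illustrates the standard LoRA merge formula and is not the Unsloth or PEFT code path used for this checkpoint; the function names are illustrative.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two small dense matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight: Matrix, lora_a: Matrix, lora_b: Matrix, alpha: float, rank: int) -> Matrix:
    """Fold a LoRA update into a weight matrix.

    weight is (out x in), lora_b is (out x rank), lora_a is (rank x in).
    """
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(weight_row, delta_row)]
        for weight_row, delta_row in zip(weight, delta)
    ]


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]
    a = [[1.0, 2.0]]    # rank 1, input width 2
    b = [[1.0], [0.0]]  # output width 2, rank 1
    print(merge_lora(base, a, b, alpha=2, rank=1))
```

After merging, inference no longer needs the adapter matrices, which is why the published checkpoint loads as an ordinary 16-bit model.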
 ---
 # Training logs
 The trainer log reported:
 - 33,297 packed training examples
 - 2 epochs
 - 2,776 optimization steps
 Validation loss decreased from:
 - 9.100862 at step 500
 to
 - 8.973075 at step 2500
 These values should be interpreted as internal training diagnostics rather than direct end-user quality metrics.
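The step count in the log is consistent with the packed example count and the batch settings from the hyperparameter table. The sketch below shows the arithmetic, assuming a single device and that the final partial batch of each epoch still counts as a step. It also computes the warmup length implied by the 0.03 ratio for comparison with the explicit 100 warmup steps; which of the two took effect is not stated in this card. The function name is illustrative.

```python
import math


def optimizer_steps(num_examples: int, batch_size: int, grad_accum: int, epochs: int) -> int:
    """Total optimizer steps, assuming the last partial batch is kept."""
    effective_batch = batch_size * grad_accum
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps_per_epoch * epochs


# Values from the trainer log and the hyperparameter table.
total_steps = optimizer_steps(num_examples=33_297, batch_size=24, grad_accum=1, epochs=2)
# Warmup length the 0.03 ratio would imply if it were the setting in effect.
warmup_from_ratio = math.ceil(0.03 * total_steps)

if __name__ == "__main__":
    print("total optimizer steps:", total_steps)
    print("warmup steps implied by ratio:", warmup_from_ratio)
```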
 ---
 # Intended use
 This model is intended for:
 - instruction-following chat experiments
 - structured answer generation
 - research on distilled reasoning-style outputs
 - lightweight local or hosted inference in the 1.5B parameter class
 It is especially suited to prompts where:
 - a user asks for explanations or breakdowns
 - the desired answer format is structured
 - the prompt resembles the ChatML style used during training
 ---
 # Prompting notes
 Because the training data is ChatML-formatted, best results usually come from chat-style prompting rather than plain raw completion prompting.
 The model may respond in a more verbose tutor-like style because many training prompts encouraged detailed reasoning followed by a final answer.
 If a cleaner direct-answer style is preferred, using a concise system prompt and explicitly requesting short outputs can help steer generation.
 ---
 # Limitations
 This model was trained on synthetic/distilled instruction data rather than broad raw web-scale pretraining data.
 As a result:
 - outputs may reflect teacher-style formatting biases
 - responses may become over-structured
 - reasoning markup may occasionally appear in generations
 The dataset sanity checks also flagged formatting irregularities in sampled rows, including repeated markers and malformed counts, so downstream behavior may inherit some formatting artifacts from the source corpus.
 ---
 # Safety
 This model is not designed for fully autonomous use in high-stakes domains such as:
 - legal
 - medical
 - financial
 - safety-critical systems
 Outputs can still be:
 - incorrect
 - incomplete
 - overconfident
 Human review is recommended for consequential use cases.
 ---
 # Usage
 ## Transformers
 ```python
 import torch
 from transformers import AutoTokenizer, AutoModelForCausalLM
 model_id = "abhinav0231/Lily-1.5b-v0.3"
 tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True,
 )
 model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
 )
 messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain overfitting in simple terms."},
 ]
 text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
 )
 inputs = tokenizer(text, return_tensors="pt").to(model.device)
 with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        do_sample=True,
    )
 # generate() returns a batch of sequences; decode the first (and only) one
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
 ### Suggested prompting
 For best results:
 - use chat-style prompts,
 - keep instructions explicit,
 - specify desired format,
 - request concise output if you do not want long reasoning-style responses.
 ## Provenance
 - **Base model:** `abhinav0231/Lily-1.5b-v0.1`
 - **Training dataset:** `abhinav0231/Sarvam-105b-Distill-100k` (`chatml`)
 - **Training framework:** Unsloth + TRL
 - **Hardware:** 1x NVIDIA A100-SXM4-40GB
 - **Final merged repo:** `abhinav0231/Lily-1.5b-v0.3`
 ## Acknowledgements
 This model was trained with Unsloth, Hugging Face Transformers, TRL, PEFT/LoRA-style fine-tuning, and W&B logging in a Modal-hosted workflow.
 This Qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
--- a/added_tokens.json
+++ b/added_tokens.json
@@ -0,0 +1,26 @@
 {
  "</tool_call>": 151658,
  "<<|PAD_TOKEN|>>": 151666,
  "<tool_call>": 151657,
  "<|PAD_TOKEN|>": 151665,
  "<|box_end|>": 151649,
  "<|box_start|>": 151648,
  "<|endoftext|>": 151643,
  "<|file_sep|>": 151664,
  "<|fim_middle|>": 151660,
  "<|fim_pad|>": 151662,
  "<|fim_prefix|>": 151659,
  "<|fim_suffix|>": 151661,
  "<|im_end|>": 151645,
  "<|im_start|>": 151644,
  "<|image_pad|>": 151655,
  "<|object_ref_end|>": 151647,
  "<|object_ref_start|>": 151646,
  "<|quad_end|>": 151651,
  "<|quad_start|>": 151650,
  "<|repo_name|>": 151663,
  "<|video_pad|>": 151656,
  "<|vision_end|>": 151653,
  "<|vision_pad|>": 151654,
  "<|vision_start|>": 151652
 }
--- a/chat_template.jinja
+++ b/chat_template.jinja
@@ -0,0 +1,54 @@
 {%- if tools %}
    {{- '<|im_start|>system\n' }}
    {%- if messages[0]['role'] == 'system' %}
        {{- messages[0]['content'] }}
    {%- else %}
        {{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}
    {%- endif %}
    {{- "\n\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
    {%- for tool in tools %}
        {{- "\n" }}
        {{- tool | tojson }}
    {%- endfor %}
    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
 {%- else %}
    {%- if messages[0]['role'] == 'system' %}
        {{- '<|im_start|>system\n' + messages[0]['content'] + '<|im_end|>\n' }}
    {%- else %}
        {{- '<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\n' }}
    {%- endif %}
 {%- endif %}
 {%- for message in messages %}
    {%- if (message.role == "user") or (message.role == "system" and not loop.first) or (message.role == "assistant" and not message.tool_calls) %}
        {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
    {%- elif message.role == "assistant" %}
        {{- '<|im_start|>' + message.role }}
        {%- if message.content %}
            {{- '\n' + message.content }}
        {%- endif %}
        {%- for tool_call in message.tool_calls %}
            {%- if tool_call.function is defined %}
                {%- set tool_call = tool_call.function %}
            {%- endif %}
            {{- '\n<tool_call>\n{"name": "' }}
            {{- tool_call.name }}
            {{- '", "arguments": ' }}
            {{- tool_call.arguments | tojson }}
            {{- '}\n</tool_call>' }}
        {%- endfor %}
        {{- '<|im_end|>\n' }}
    {%- elif message.role == "tool" %}
        {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != "tool") %}
            {{- '<|im_start|>user' }}
        {%- endif %}
        {{- '\n<tool_response>\n' }}
        {{- message.content }}
        {{- '\n</tool_response>' }}
        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
            {{- '<|im_end|>\n' }}
        {%- endif %}
    {%- endif %}
 {%- endfor %}
 {%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n' }}
 {%- endif %}
--- a/config.json
+++ b/config.json
@@ -0,0 +1,62 @@
 {
    "architectures": [
        "Qwen2ForCausalLM"
    ],
    "attention_dropout": 0.0,
    "bos_token_id": null,
    "torch_dtype": "bfloat16",
    "eos_token_id": 151645,
    "hidden_act": "silu",
    "hidden_size": 1536,
    "initializer_range": 0.02,
    "intermediate_size": 8960,
    "layer_types": [
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention",
        "full_attention"
    ],
    "max_position_embeddings": 32768,
    "max_window_layers": 21,
    "model_type": "qwen2",
    "num_attention_heads": 12,
    "num_hidden_layers": 28,
    "num_key_value_heads": 2,
    "pad_token_id": 151665,
    "rms_norm_eps": 1e-06,
    "rope_parameters": {
        "rope_theta": 1000000.0,
        "rope_type": "default"
    },
    "sliding_window": null,
    "tie_word_embeddings": true,
    "unsloth_fixed": true,
    "unsloth_version": "2026.5.2",
    "use_cache": true,
    "use_sliding_window": false,
    "vocab_size": 151936
 }
--- a/generation_config.json
+++ b/generation_config.json
@@ -0,0 +1,8 @@
 {
  "_from_model_config": true,
  "eos_token_id": 151645,
  "max_length": 32768,
  "pad_token_id": 151665,
  "transformers_version": "5.5.0",
  "use_cache": true
 }
--- a/merges.txt
+++ b/merges.txt
--- a/model.safetensors
+++ b/model.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:d90d5c82a09b5f17d5e8670d734b036d347d498b67ee8555d3fb647c5323ac52
 size 3087466808
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1,10 @@
 {
  "eos_token": {
    "content": "<|im_end|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": "<<|PAD_TOKEN|>>"
 }
--- a/tokenizer.json
+++ b/tokenizer.json
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:bd5948af71b4f56cf697f7580814c7ce8b80595ef985544efcacf716126a2e31
 size 11422356
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1,202 @@
 {
  "add_prefix_space": false,
  "backend": "tokenizers",
  "bos_token": null,
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|im_end|>",
  "errors": "replace",
  "is_local": false,
  "model_max_length": 32768,
  "pad_token": "<|PAD_TOKEN|>",
  "padding_side": "right",
  "split_special_tokens": false,
  "tokenizer_class": "Qwen2Tokenizer",
  "unk_token": null,
  "added_tokens_decoder": {
    "151643": {
      "content": "<|endoftext|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    "151644": {
      "content": "<|im_start|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    "151645": {
      "content": "<|im_end|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    "151646": {
      "content": "<|object_ref_start|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    "151647": {
      "content": "<|object_ref_end|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    "151648": {
      "content": "<|box_start|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    "151649": {
      "content": "<|box_end|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    "151650": {
      "content": "<|quad_start|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    "151651": {
      "content": "<|quad_end|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    "151652": {
      "content": "<|vision_start|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    "151653": {
      "content": "<|vision_end|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    "151654": {
      "content": "<|vision_pad|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    "151655": {
      "content": "<|image_pad|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    "151656": {
      "content": "<|video_pad|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    "151657": {
      "content": "<tool_call>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": false
    },
    "151658": {
      "content": "</tool_call>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": false
    },
    "151659": {
      "content": "<|fim_prefix|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": false
    },
    "151660": {
      "content": "<|fim_middle|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": false
    },
    "151661": {
      "content": "<|fim_suffix|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": false
    },
    "151662": {
      "content": "<|fim_pad|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": false
    },
    "151663": {
      "content": "<|repo_name|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": false
    },
    "151664": {
      "content": "<|file_sep|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": false
    },
    "151665": {
      "content": "<|PAD_TOKEN|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    }
  },
   "chat_template": "{%- if tools %}\n    {{- '<|im_start|>system\\n' }}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- messages[0]['content'] }}\n    {%- else %}\n        {{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}\n    {%- endif %}\n    {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n    {%- for tool in tools %}\n        {{- \"\\n\" }}\n        {{- tool | tojson }}\n    {%- endfor %}\n    {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n    {%- else %}\n        {{- '<|im_start|>system\\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\\n' }}\n    {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n    {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n        {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n    {%- elif message.role == \"assistant\" %}\n        {{- '<|im_start|>' + message.role }}\n        {%- if message.content %}\n            {{- '\\n' + message.content }}\n        {%- endif %}\n        {%- for tool_call in message.tool_calls %}\n            {%- if tool_call.function is defined %}\n                {%- set tool_call = tool_call.function %}\n            {%- endif %}\n            {{- '\\n<tool_call>\\n{\"name\": \"' }}\n            {{- tool_call.name }}\n            {{- '\", \"arguments\": ' }}\n            {{- tool_call.arguments | tojson }}\n            {{- '}\\n</tool_call>' }}\n        {%- endfor %}\n        {{- '<|im_end|>\\n' }}\n    {%- elif message.role == \"tool\" %}\n        {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n            {{- '<|im_start|>user' }}\n        {%- endif %}\n        {{- '\\n<tool_response>\\n' }}\n        {{- message.content }}\n        {{- '\\n</tool_response>' }}\n        {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n            {{- '<|im_end|>\\n' }}\n        {%- endif %}\n    {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n    {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n"
 }
--- a/vocab.json
+++ b/vocab.json