Initialize project; model provided by the ModelHub XC community

Model: metaresearch/PapersRAG-1.5B Source: Original Platform
2026-05-16 18:44:58 +08:00
commit 60c1651765
22 changed files with 531708 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,36 @@
 *.7z filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.bz2 filter=lfs diff=lfs merge=lfs -text
 *.ckpt filter=lfs diff=lfs merge=lfs -text
 *.ftz filter=lfs diff=lfs merge=lfs -text
 *.gz filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.joblib filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.model filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 *.npy filter=lfs diff=lfs merge=lfs -text
 *.npz filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.parquet filter=lfs diff=lfs merge=lfs -text
 *.pb filter=lfs diff=lfs merge=lfs -text
 *.pickle filter=lfs diff=lfs merge=lfs -text
 *.pkl filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
 *.rar filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.xz filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 papersrag_index.faiss filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,120 @@
 ---
 license: apache-2.0
 language:
 - en
 tags:
 - rag
 - question-answering
 - scientific-literature
 - arxiv
 - nlp
 - research-tool
 pipeline_tag: text-generation
 base_model:
 - Qwen/Qwen2.5-1.5B
 ---
 # PapersRAG-1.5B 🧪
 **A retrieval-augmented generation system for querying recent scientific literature — continuously updated.**
 PapersRAG-1.5B helps researchers explore and answer questions across a growing corpus of recent NLP papers from arXiv. It pairs a lightweight language model with a curated knowledge base of paper abstracts and a retrieval pipeline that prioritizes faithful, citation-backed answers over hallucination.
 The knowledge base is **automatically refreshed every day** with the latest `cs.CL` papers, so it expands on its own. No manual upkeep required.
 ---
 ## Model description
 - **Type:** Retrieval-augmented generation (RAG)
 - **Base language model:** Qwen 2.5 1.5B — small, fast, coherent when grounded with good context
 - **Knowledge base:** A continuously growing collection of abstracts from the most recent `cs.CL` papers on arXiv, updated daily via an automated pipeline
 - **Retrieval pipeline:** Dense embeddings for initial candidate retrieval, cross-encoder for re-ranking — only the most relevant chunks reach the language model
 - **Answer style:** Every answer cites the paper title it draws from. If no relevant paper is found, the model says so instead of fabricating one
 ---
 ## Intended use
 PapersRAG is a **research assistant**. It helps scientists and students locate information within indexed NLP papers, ask comparative questions like *"What are the latest trends in retrieval-augmented generation?"*, and surface specific details about a paper's methodology or findings.
 It is not a general-purpose chatbot. It does not have access to full paper text. It only knows what has been explicitly indexed. It will tell you when it doesn't know something.
 ---
 ## How it works
 1. **Indexing** — Paper abstracts are split into overlapping chunks, embedded with a dense bi-encoder, and stored in a FAISS index
 2. **Retrieval** — The bi-encoder fetches a pool of candidate chunks for any given question
 3. **Re-ranking** — A cross-encoder scores each candidate, and the highest-scoring chunks (four in `pipeline.py`) are kept
 4. **Generation** — Retained chunks are passed as context to the 1.5B model, which generates a cited answer
 5. **Safety** — If the best candidate scores below the confidence threshold, the pipeline returns a refusal message rather than hallucinate
 No relevant chunk, no answer. That's the rule.
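The retrieve, re-rank, and refuse steps above can be sketched in plain Python. This is a minimal illustration, not the shipped implementation: `top_k`, `select_context`, and `fake_rerank` are hypothetical names, the toy vectors stand in for real embeddings, and the defaults (`k=10`, `keep=4`, `threshold=-4.5`) mirror the values that appear in `pipeline.py`.

```python
def top_k(query_vec, chunk_vecs, k):
    """Indices of the k chunk vectors with the highest inner product."""
    sims = [sum(q * c for q, c in zip(query_vec, vec)) for vec in chunk_vecs]
    order = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)
    return order[:k]


def select_context(query_vec, chunk_vecs, chunks, rerank,
                   k=10, keep=4, threshold=-4.5):
    """Retrieve k candidates, re-rank them, and gate on the best score.

    `rerank` is a hypothetical stand-in for the cross-encoder: it takes a
    list of chunk strings and returns one relevance score per chunk.
    Returns None when no candidate clears the threshold (the refusal case).
    """
    candidates = [chunks[i] for i in top_k(query_vec, chunk_vecs, k)]
    scores = rerank(candidates)
    if not scores or max(scores) < threshold:
        return None
    ranked = sorted(zip(scores, candidates), key=lambda sc: sc[0], reverse=True)
    return [chunk for _, chunk in ranked[:keep]]


# Toy data: three 2-d "embeddings" and a fake re-ranker keyed on one word.
chunks = ["RAG survey abstract", "parsing abstract", "RAG benchmark abstract"]
chunk_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
query_vec = [1.0, 0.0]


def fake_rerank(candidates):
    return [5.0 if "RAG" in c else -9.0 for c in candidates]


kept = select_context(query_vec, chunk_vecs, chunks, fake_rerank, k=3, keep=2)
refused = select_context(query_vec, chunk_vecs, chunks,
                         lambda cs: [-9.0] * len(cs), k=3)
```

Here `kept` holds the two RAG-related chunks and `refused` is `None`, which is the point where a caller would return the "not enough information" message instead of generating.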
 ---
 ## Automated daily updates
 Every day, the update pipeline:
 - Downloads the existing index and chunk store from this repository
 - Scrapes the 100 most recent papers from `cs.CL` on arXiv
 - Chunks, embeds, and appends the new papers to the existing knowledge base
 - Rebuilds the FAISS index and uploads everything back
 The knowledge base grows by roughly **100 papers per day**, automatically.
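The chunk-and-append step can be sketched as follows. The window size, the overlap, and the title-prefixed chunk layout are assumptions chosen for illustration; only the `<|CHUNK_END|>` delimiter comes from the repository, where `pipeline.py` splits `chunks.txt` on it. Embedding and index rebuilding are left out.

```python
SEP = "<|CHUNK_END|>"  # delimiter that pipeline.py splits chunks.txt on


def chunk_words(text, size=50, overlap=10):
    """Split text into windows of `size` words sharing `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    pieces = []
    for start in range(0, len(words), step):
        pieces.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return pieces


def append_papers(existing, papers, size=50, overlap=10):
    """Return the existing chunks plus new chunks for (title, abstract) pairs.

    Prefixing each chunk with its title is a hypothetical layout; it keeps
    the title available for citation when the chunk is retrieved.
    """
    updated = list(existing)
    for title, abstract in papers:
        for piece in chunk_words(abstract, size, overlap):
            updated.append(f"{title}\n{piece}")
    return updated


def serialize(chunks):
    """Join chunks into the text stored in chunks.txt."""
    return f"\n{SEP}\n".join(chunks)


def parse(raw):
    """Inverse of serialize, matching the parsing done in pipeline.py."""
    return [c.strip() for c in raw.split(SEP) if c.strip()]


windows = chunk_words("a b c d e f g", size=4, overlap=2)
store = append_papers(["Old Paper\nold text"], [("New Paper", "a b c d e f g")],
                      size=4, overlap=2)
round_trip = parse(serialize(store))
```

Because the store is a flat delimited text file, appending new papers only requires extending the list and rewriting the file; the vector index then has to be rebuilt over the same ordering so that row `i` still maps to chunk `i`.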
 ---
 ## Quick start
 ```python
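 import sys
 from huggingface_hub import snapshot_download
 # Download the repository snapshot; pipeline.py ships inside it
 model_dir = snapshot_download("metaresearch/PapersRAG-1.5B")
 sys.path.insert(0, model_dir)  # make pipeline.py importable
 from pipeline import PapersRAG
 rag = PapersRAG(model_dir)
 print(rag.ask("What are the latest approaches to retrieval-augmented generation?"))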
 ```
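 Requires `transformers`, `sentence-transformers`, `torch`, `accelerate` (for `device_map="auto"`), `huggingface_hub`, and FAISS (installed as `faiss-cpu` or `faiss-gpu`). Everything else is in `pipeline.py`.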
 ---
 ## Model composition
 | Component | Description |
 |---|---|
 | **Language Model** | Qwen 2.5 1.5B (stored as bfloat16 per `config.json`; `pipeline.py` loads it as float16) |
 | **Bi-encoder** | `intfloat/e5-base-v2` (named in `rag_config.json`), used for initial retrieval |
 | **Cross-encoder** | Re-ranker fine-tuned from `cross-encoder/ms-marco-MiniLM-L12-v2`, stored in `cross_encoder_model/` |
 | **Vector Index** | FAISS index of embedded paper chunks |
 | **Knowledge Chunks** | Processed snippets from indexed arXiv abstracts |
 | **Pipeline** | `pipeline.py` — one class, handles loading, retrieval, and generation |
 The bi-encoder name is read from `rag_config.json`; the cross-encoder is documented in `cross_encoder_model/README.md`.
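One detail worth checking when swapping or rebuilding the bi-encoder: `rag_config.json` names `intfloat/e5-base-v2`, and E5 models are documented to expect a `query: ` prefix on search queries and a `passage: ` prefix on indexed text. The repository does not state whether the shipped index was built with these prefixes, so the helpers below are only a sketch of the convention; `as_query` and `as_passage` are hypothetical names, and queries and passages should follow whichever convention the index was actually built with.

```python
def as_query(text):
    """Format a search query the way E5 models expect."""
    return "query: " + text.strip()


def as_passage(text):
    """Format a chunk for indexing the way E5 models expect."""
    return "passage: " + text.strip()


question = as_query("  What is retrieval-augmented generation? ")
passages = [as_passage(c) for c in ["Chunk one.", "Chunk two."]]
```

The formatted strings would then be passed to the embedder's `encode` call in place of the raw text.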
 ---
 ## Limitations
 **Knowledge base scope.** Only `cs.CL` papers from arXiv. Papers from other fields are not included unless manually added.
 **Abstracts only.** Full paper text is not indexed. Deep methodological comparisons may be incomplete.
 **Small language model.** 1.5B parameters is lightweight. The retrieval pipeline handles factual accuracy well, but nuanced multi-paper synthesis has limits.
 **English only.**
 ---
 ## License
 Apache-2.0.
 ---
 *PapersRAG is part of the Meta Research initiative — building open tools that accelerate scientific discovery.*
--- a/chunks.txt
+++ b/chunks.txt
--- a/config.json
+++ b/config.json
@@ -0,0 +1,27 @@
 {
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 1536,
  "initializer_range": 0.02,
  "intermediate_size": 8960,
  "max_position_embeddings": 32768,
  "max_window_layers": 21,
  "model_type": "qwen2",
  "num_attention_heads": 12,
  "num_hidden_layers": 28,
  "num_key_value_heads": 2,
  "rms_norm_eps": 1e-06,
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": true,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.43.1",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151936
 }
--- a/cross_encoder_model/README.md
+++ b/cross_encoder_model/README.md
@@ -0,0 +1,147 @@
 ---
 tags:
 - sentence-transformers
 - cross-encoder
 - reranker
 base_model: cross-encoder/ms-marco-MiniLM-L12-v2
 pipeline_tag: text-ranking
 library_name: sentence-transformers
 ---
 # CrossEncoder based on cross-encoder/ms-marco-MiniLM-L12-v2
 This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [cross-encoder/ms-marco-MiniLM-L12-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L12-v2) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
 ## Model Details
 ### Model Description
 - **Model Type:** Cross Encoder
 - **Base model:** [cross-encoder/ms-marco-MiniLM-L12-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L12-v2) <!-- at revision 7b0235231ca2674cb8ca8f022859a6eba2b1c968 -->
 - **Maximum Sequence Length:** 512 tokens
 - **Number of Output Labels:** 1 label
 - **Supported Modality:** Text
 <!-- - **Training Dataset:** Unknown -->
 <!-- - **Language:** Unknown -->
 <!-- - **License:** Unknown -->
 ### Model Sources
 - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
 - **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
 - **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
 - **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)
 ### Full Model Architecture
 ```
 CrossEncoder(
  (0): Transformer({'transformer_task': 'sequence-classification', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'logits'}}, 'module_output_name': 'scores', 'architecture': 'BertForSequenceClassification'})
 )
 ```
 ## Usage
 ### Direct Usage (Sentence Transformers)
 First install the Sentence Transformers library:
 ```bash
 pip install -U sentence-transformers
 ```
 Then you can load this model and run inference.
 ```python
 from sentence_transformers import CrossEncoder
 # Download from the 🤗 Hub
 model = CrossEncoder("cross_encoder_model_id")
 # Get scores for pairs of inputs
 pairs = [
    ['How many calories in an egg', 'There are on average between 55 and 80 calories in an egg depending on its size.'],
    ['How many calories in an egg', 'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.'],
    ['How many calories in an egg', 'Most of the calories in an egg come from the yellow yolk in the center.'],
 ]
 scores = model.predict(pairs)
 print(scores)
 # [ 9.6793 -2.1906  1.9515]
 # Or rank different texts based on similarity to a single text
 ranks = model.rank(
    'How many calories in an egg',
    [
        'There are on average between 55 and 80 calories in an egg depending on its size.',
        'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.',
        'Most of the calories in an egg come from the yellow yolk in the center.',
    ]
 )
 # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
 ```
 <!--
 ### Direct Usage (Transformers)
 <details><summary>Click to see the direct usage in Transformers</summary>
 </details>
 -->
 <!--
 ### Downstream Usage (Sentence Transformers)
 You can finetune this model on your own dataset.
 <details><summary>Click to expand</summary>
 </details>
 -->
 <!--
 ### Out-of-Scope Use
 *List how the model may foreseeably be misused and address what users ought not to do with the model.*
 -->
 <!--
 ## Bias, Risks and Limitations
 *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
 -->
 <!--
 ### Recommendations
 *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
 -->
 ## Training Details
 ### Framework Versions
 - Python: 3.12.13
 - Sentence Transformers: 5.4.1
 - Transformers: 5.0.0
 - PyTorch: 2.10.0+cu128
 - Accelerate: 1.13.0
 - Datasets: 4.0.0
 - Tokenizers: 0.22.2
 ## Citation
 ### BibTeX
 <!--
 ## Glossary
 *Clearly define terms in order to be accessible across audiences.*
 -->
 <!--
 ## Model Card Authors
 *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
 -->
 <!--
 ## Model Card Contact
 *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
 -->
--- a/cross_encoder_model/config.json
+++ b/cross_encoder_model/config.json
@@ -0,0 +1,36 @@
 {
  "add_cross_attention": false,
  "architectures": [
    "BertForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": null,
  "classifier_dropout": null,
  "dtype": "float32",
  "eos_token_id": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 384,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "intermediate_size": 1536,
  "is_decoder": false,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "tie_word_embeddings": true,
  "transformers_version": "5.0.0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
 }
--- a/cross_encoder_model/config_sentence_transformers.json
+++ b/cross_encoder_model/config_sentence_transformers.json
@@ -0,0 +1,11 @@
 {
  "__version__": {
    "pytorch": "2.10.0+cu128",
    "sentence_transformers": "5.4.1",
    "transformers": "5.0.0"
  },
  "activation_fn": "torch.nn.modules.linear.Identity",
  "default_prompt_name": null,
  "model_type": "CrossEncoder",
  "prompts": {}
 }
--- a/cross_encoder_model/model.safetensors
+++ b/cross_encoder_model/model.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:c6f930d12f0fead9acd03891e24e395903d80c1f7e505c10c6db2d5fb6a79b3b
 size 133464812
--- a/cross_encoder_model/modules.json
+++ b/cross_encoder_model/modules.json
@@ -0,0 +1,8 @@
 [
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.base.modules.transformer.Transformer"
  }
 ]
--- a/cross_encoder_model/sentence_bert_config.json
+++ b/cross_encoder_model/sentence_bert_config.json
@@ -0,0 +1,10 @@
 {
    "transformer_task": "sequence-classification",
    "modality_config": {
        "text": {
            "method": "forward",
            "method_output_name": "logits"
        }
    },
    "module_output_name": "scores"
 }
--- a/cross_encoder_model/tokenizer.json
+++ b/cross_encoder_model/tokenizer.json
--- a/cross_encoder_model/tokenizer_config.json
+++ b/cross_encoder_model/tokenizer_config.json
@@ -0,0 +1,18 @@
 {
  "backend": "tokenizers",
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "is_local": false,
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "model_specific_special_tokens": {},
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
 }
--- a/embeddings.npy
+++ b/embeddings.npy
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:eb5b791603f4a4ce52627bd27ccb969cc455d16849b8fe5a7cd67d8af0d353e9
 size 36034688
--- a/generation_config.json
+++ b/generation_config.json
@@ -0,0 +1,9 @@
 {
  "do_sample": true,
  "eos_token_id": 151645,
  "max_new_tokens": 256,
  "pad_token_id": 151645,
  "temperature": 0.7,
  "transformers_version": "5.0.0",
  "use_cache": true
 }
--- a/merges.txt
+++ b/merges.txt
--- a/model.safetensors
+++ b/model.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:dd924a11b4c220f385b51ffa522daea7c9f3d850e31b162bb5661df483c6d3ee
 size 3087467144
--- a/papersrag_index.faiss
+++ b/papersrag_index.faiss
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:cca4215e363a0f11217e5afa30ce1b69ab9f5269c5032abaacb1430c3227cc85
 size 36034605
--- a/pipeline.py
+++ b/pipeline.py
@@ -0,0 +1,58 @@
 import json, torch, numpy as np
 from sentence_transformers import SentenceTransformer, CrossEncoder
 import faiss
 from transformers import AutoTokenizer, AutoModelForCausalLM
 class PapersRAG:
    def __init__(self, model_dir="."):
        with open(f"{model_dir}/rag_config.json") as f:
            config = json.load(f)
        self.embedder = SentenceTransformer(config["embedder_model"])
        self.index = faiss.read_index(f"{model_dir}/papersrag_index.faiss")
        with open(f"{model_dir}/chunks.txt", "r", encoding="utf-8") as f:
            raw = f.read().split("<|CHUNK_END|>")
        self.chunks = [c.strip() for c in raw if c.strip()]
        self.reranker = CrossEncoder(f"{model_dir}/cross_encoder_model")
        self.tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
        self.model = AutoModelForCausalLM.from_pretrained(
            model_dir,
            torch_dtype=torch.float16,
            device_map="auto",
            trust_remote_code=True
        )
    def ask(self, question, max_tokens=400):
         q = question.strip().lower().rstrip('?!.')
         greetings = ["hi", "hello", "hey", "yo", "sup", "good morning", "how are you"]
         # Match whole words only, so a question like "history of RAG" is not treated as "hi"
         if any(q == g or q.startswith((g + " ", g + ",")) for g in greetings):
             return "Hello! I'm PapersRAG, your AI research assistant. I index recent arXiv papers on computational linguistics and NLP. Ask me anything about them!"
         identity_qs = ["who are you", "what is your name", "what are you", "what do you do", "tell me about yourself"]
         if any(idq in q for idq in identity_qs):
             return "I'm PapersRAG 🧪, a research assistant that can answer questions about recent arXiv papers in cs.CL. I'll cite the paper titles in my answers. Ask me anything about the papers!"
         q_emb = self.embedder.encode([question]).astype("float32")
         _, indices = self.index.search(q_emb, 10)
         # FAISS pads results with -1 when the index holds fewer than 10 vectors
         candidates = [self.chunks[i] for i in indices[0] if i >= 0]
         pairs = [(question, c) for c in candidates]
         scores = self.reranker.predict(pairs) if pairs else []
         if len(scores) == 0 or max(scores) < -4.5:
             return "I don't have enough information from my arXiv papers to answer that accurately. Try asking about specific NLP or computational linguistics papers."
         # Keep the four highest-scoring chunks, sorting on the score alone
         best = sorted(zip(scores, candidates), key=lambda sc: sc[0], reverse=True)[:4]
         # Join with real newlines ("\\n" would insert a literal backslash-n)
         context = "\n\n".join(c for _, c in best)
         messages = [
             {"role": "system", "content": "You are PapersRAG, a scientific research assistant. Use ONLY the provided paper abstracts to answer. Always mention the paper title when you use information from it. If unsure, say you don't have that information."},
             {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"}
        ]
        prompt = self.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
        inputs = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)
        outputs = self.model.generate(
            **inputs,
            max_new_tokens=max_tokens,
            temperature=0.7,
            do_sample=True,
            pad_token_id=self.tokenizer.eos_token_id
        )
        answer = self.tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
        return answer.strip()
--- a/rag_config.json
+++ b/rag_config.json
@@ -0,0 +1 @@
 {"embedder_model": "intfloat/e5-base-v2"}
--- a/tokenizer.json
+++ b/tokenizer.json
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1,207 @@
 {
  "add_bos_token": false,
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "151643": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151644": {
      "content": "<|im_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151645": {
      "content": "<|im_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151646": {
      "content": "<|object_ref_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151647": {
      "content": "<|object_ref_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151648": {
      "content": "<|box_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151649": {
      "content": "<|box_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151650": {
      "content": "<|quad_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151651": {
      "content": "<|quad_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151652": {
      "content": "<|vision_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151653": {
      "content": "<|vision_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151654": {
      "content": "<|vision_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151655": {
      "content": "<|image_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151656": {
      "content": "<|video_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151657": {
      "content": "<tool_call>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151658": {
      "content": "</tool_call>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151659": {
      "content": "<|fim_prefix|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151660": {
      "content": "<|fim_middle|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151661": {
      "content": "<|fim_suffix|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151662": {
      "content": "<|fim_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151663": {
      "content": "<|repo_name|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151664": {
      "content": "<|file_sep|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    }
  },
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>",
    "<|object_ref_start|>",
    "<|object_ref_end|>",
    "<|box_start|>",
    "<|box_end|>",
    "<|quad_start|>",
    "<|quad_end|>",
    "<|vision_start|>",
    "<|vision_end|>",
    "<|vision_pad|>",
    "<|image_pad|>",
    "<|video_pad|>"
  ],
  "bos_token": null,
  "chat_template": "{%- if tools %}\n    {{- '<|im_start|>system\\n' }}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- messages[0]['content'] }}\n    {%- else %}\n        {{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}\n    {%- endif %}\n    {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n    {%- for tool in tools %}\n        {{- \"\\n\" }}\n        {{- tool | tojson }}\n    {%- endfor %}\n    {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n    {%- if messages[0]['role'] == 'system' %}\n        {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n    {%- else %}\n        {{- '<|im_start|>system\\nYou are Qwen, created by Alibaba Cloud. 
You are a helpful assistant.<|im_end|>\\n' }}\n    {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n    {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n        {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n    {%- elif message.role == \"assistant\" %}\n        {{- '<|im_start|>' + message.role }}\n        {%- if message.content %}\n            {{- '\\n' + message.content }}\n        {%- endif %}\n        {%- for tool_call in message.tool_calls %}\n            {%- if tool_call.function is defined %}\n                {%- set tool_call = tool_call.function %}\n            {%- endif %}\n            {{- '\\n<tool_call>\\n{\"name\": \"' }}\n            {{- tool_call.name }}\n            {{- '\", \"arguments\": ' }}\n            {{- tool_call.arguments | tojson }}\n            {{- '}\\n</tool_call>' }}\n        {%- endfor %}\n        {{- '<|im_end|>\\n' }}\n    {%- elif message.role == \"tool\" %}\n        {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n            {{- '<|im_start|>user' }}\n        {%- endif %}\n        {{- '\\n<tool_response>\\n' }}\n        {{- message.content }}\n        {{- '\\n</tool_response>' }}\n        {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n            {{- '<|im_end|>\\n' }}\n        {%- endif %}\n    {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n    {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|im_end|>",
  "errors": "replace",
  "model_max_length": 131072,
  "pad_token": "<|endoftext|>",
  "split_special_tokens": false,
  "tokenizer_class": "Qwen2Tokenizer",
  "unk_token": null
 }
--- a/vocab.json
+++ b/vocab.json