Initialize project; model provided by the ModelHub XC community

Model: dnotitia/DNA-R1 Source: Original Platform
2026-05-05 10:37:54 +08:00
commit 04e3557295
20 changed files with 101503 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,49 @@
 *.7z filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.bin.* filter=lfs diff=lfs merge=lfs -text
 *.bz2 filter=lfs diff=lfs merge=lfs -text
 *.ftz filter=lfs diff=lfs merge=lfs -text
 *.gz filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.joblib filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.model filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.parquet filter=lfs diff=lfs merge=lfs -text
 *.pb filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
 *.rar filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.xz filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zstandard filter=lfs diff=lfs merge=lfs -text
 *.tfevents* filter=lfs diff=lfs merge=lfs -text
 *.db* filter=lfs diff=lfs merge=lfs -text
 *.ark* filter=lfs diff=lfs merge=lfs -text
 **/*ckpt*data* filter=lfs diff=lfs merge=lfs -text
 **/*ckpt*.meta filter=lfs diff=lfs merge=lfs -text
 **/*ckpt*.index filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 *.ckpt filter=lfs diff=lfs merge=lfs -text
 *.gguf* filter=lfs diff=lfs merge=lfs -text
 *.ggml filter=lfs diff=lfs merge=lfs -text
 *.llamafile* filter=lfs diff=lfs merge=lfs -text
 *.pt2 filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.npy filter=lfs diff=lfs merge=lfs -text
 *.npz filter=lfs diff=lfs merge=lfs -text
 *.pickle filter=lfs diff=lfs merge=lfs -text
 *.pkl filter=lfs diff=lfs merge=lfs -text
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,274 @@
 ---
 language:
 - en
 - ko
 license: cc-by-nc-4.0
 tags:
 - dnotitia
 - nlp
 - llm
 - slm
 - conversation
 - chat
 - reasoning
 - r1
 base_model:
 - microsoft/phi-4
 library_name: transformers
 pipeline_tag: text-generation
 ---
 # DNA-R1
 <p align="center">
 <img src="assets/dna-r1-logo.png" width="400" style="margin: 40px auto;">
 </p>
 We introduce **DNA-R1**, a specialized reasoning model optimized for the Korean language and based on Microsoft's Phi-4. By applying large-scale reinforcement learning (RL) with the same methodology as DeepSeek-R1, we have significantly enhanced the model's Korean reasoning capabilities. The model demonstrates a deep understanding of Korean text and strong reasoning abilities across mathematics, coding, and general reasoning tasks.
 <p align="center">
 <img src="assets/dna-r1-pipeline.png" width="100%" style="margin: 40px auto;">
 </p>
 ## Training Methodology
 Our comprehensive training pipeline consists of three strategic stages:
 - **Stage 1:** Initial SFT with a large Korean non-reasoning dataset (760k examples) reused from our [DNA 1.0 8B Instruct](https://huggingface.co/dnotitia/Llama-DNA-1.0-8B-Instruct) training pipeline
 - **Stage 2:** Strategic integration of Korean reasoning patterns from DeepSeek R1 using a specialized Korean reasoning dataset (300k examples)
 - **Stage 3:** Advanced reinforcement learning with GRPO using a combined Korean/English reasoning dataset, with format, accuracy, and language consistency as rewards
 DNA-R1 has learned reasoning patterns specifically tailored to the Korean language and demonstrates capabilities such as self-verification, reflection, and generation of long chains of thought (CoT). This represents a significant milestone for the AI research community working in Korean.
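Stage 3 relies on GRPO, which, as commonly described, scores a group of sampled completions for the same prompt and normalizes each reward against the group's mean and standard deviation instead of using a learned value model. The sketch below illustrates only that normalization step. The function name, the `eps` constant, and the use of the population standard deviation are assumptions made for illustration, not details taken from the DNA-R1 training code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the rewards of one sampled group to zero mean and unit scale.

    Hypothetical helper: `eps` guards against division by zero when every
    completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions sampled for one prompt, two of them rewarded.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that beat the group average receive a positive advantage and the rest a negative one, so the policy update depends only on relative quality within the group.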
 ## Model Specifications
 - **Developed by:** Dnotitia Inc.
 - **Supported Languages:** Korean, English
 - **Model Release Date:** Mar 6, 2025
 - **Number of Parameters:** 14B
 - **License:** CC BY-NC 4.0
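The 14B figure can be cross-checked against `config.json` and the `total_size` field of `model.safetensors.index.json` from the same commit. The sketch below is a back-of-the-envelope count that assumes the usual Phi-3 layer layout: a fused `qkv_proj`, a fused `gate_up_proj`, no bias terms, two RMSNorm weights per layer, and one final norm. The final norm weight and the head dimension of `hidden_size / num_attention_heads` are assumptions of this sketch.

```python
# Values copied from config.json in this commit.
hidden_size = 5120
intermediate_size = 17920
num_hidden_layers = 40
num_attention_heads = 40
num_key_value_heads = 10
vocab_size = 100356

# Assumed: head_dim is derived from hidden_size, as no separate value is configured.
head_dim = hidden_size // num_attention_heads

# Fused QKV projection: all query heads plus the grouped key and value heads.
qkv_proj = hidden_size * (num_attention_heads + 2 * num_key_value_heads) * head_dim
o_proj = (num_attention_heads * head_dim) * hidden_size
# Fused gate and up projections, followed by the down projection.
gate_up_proj = hidden_size * 2 * intermediate_size
down_proj = intermediate_size * hidden_size
# input_layernorm and post_attention_layernorm weights.
layer_norms = 2 * hidden_size

per_layer = qkv_proj + o_proj + gate_up_proj + down_proj + layer_norms
embeddings = vocab_size * hidden_size
lm_head = vocab_size * hidden_size  # separate because tie_word_embeddings is false
final_norm = hidden_size            # assumed final RMSNorm weight

total_params = num_hidden_layers * per_layer + embeddings + lm_head + final_norm
total_bytes = total_params * 2      # bfloat16 stores 2 bytes per parameter

print(f"{total_params:,} parameters, {total_bytes:,} bytes")
# prints "14,659,548,160 parameters, 29,319,096,320 bytes"
```

The byte count equals the `total_size` of 29319096320 recorded in the safetensors index, which suggests the assumed layout matches the checkpoint; the model therefore holds roughly 14.7B parameters.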
 <div style="padding: 2px 8px; background-color: hsl(240, 100%, 50%, 0.1); border-radius: 5px">
  <p><strong>NOTICE (translated from Korean):</strong></p>
  <p>This model can be used for commercial purposes. If you wish to use it commercially, please reach out through <a href="https://www.dnotitia.com/contact/post-form">Contact us</a> on the Dnotitia website. After a brief consultation, we will approve commercial use.</p>
 </div>
 ## Technical Details
 ### Multi-Stage Training Pipeline
 We implemented a sophisticated training approach to enhance Phi-4's Korean reasoning capabilities:
 1. **Initial Foundation (Stage 1):** Supervised Fine-Tuning using our extensive Korean non-reasoning dataset from the established [DNA 1.0 8B Instruct](https://huggingface.co/dnotitia/Llama-DNA-1.0-8B-Instruct) training pipeline
 2. **Reasoning Integration (Stage 2):** Specialized adaptation of DeepSeek R1's reasoning patterns with Korean-specific optimization through a meticulously curated dataset
 3. **Advanced Refinement (Stage 3):** Reinforcement learning optimization using GRPO to perfect reasoning in both Korean and English, with comprehensive reward signals for format structure, factual accuracy, and language consistency
 This methodical approach enables DNA-R1 to develop sophisticated chain-of-thought (CoT) reasoning for complex problem solving, resulting in a model finely calibrated for Korean language reasoning while maintaining robust general capabilities.
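The three Stage 3 reward signals (format, accuracy, and language consistency) can be pictured as simple rule-based scoring functions. The sketch below is an illustration only: the exact rules used to train DNA-R1 are not given here, so the regular expressions, the exact-match accuracy check, and the Hangul-versus-Latin letter ratio are all assumptions.

```python
import re

# Assumed output layout: one <think> block followed by one <answer> block.
FORMAT_RE = re.compile(
    r"\s*<think>.+?</think>\s*<answer>(.+?)</answer>\s*", re.DOTALL
)
HANGUL_RE = re.compile(r"[\uac00-\ud7a3]")  # precomposed Hangul syllables
LATIN_RE = re.compile(r"[A-Za-z]")


def format_reward(completion):
    """Return 1.0 when the completion follows the assumed tag layout."""
    return 1.0 if FORMAT_RE.fullmatch(completion) else 0.0


def accuracy_reward(completion, reference):
    """Return 1.0 when the text inside <answer> equals the reference exactly."""
    match = FORMAT_RE.fullmatch(completion)
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == reference.strip() else 0.0


def language_consistency_reward(completion, target="ko"):
    """Return the share of letters in the reasoning written in the target language."""
    think = re.search(r"<think>(.*?)</think>", completion, re.DOTALL)
    text = think.group(1) if think else completion
    hangul = len(HANGUL_RE.findall(text))
    latin = len(LATIN_RE.findall(text))
    if hangul + latin == 0:
        return 0.0
    share_ko = hangul / (hangul + latin)
    return share_ko if target == "ko" else 1.0 - share_ko


sample = "<think>2 더하기 2는 4이다.</think><answer>4</answer>"
scores = (
    format_reward(sample),
    accuracy_reward(sample, "4"),
    language_consistency_reward(sample, "ko"),
)
```

Real reward functions for math or code would need answer normalization or test execution rather than exact string matching; the point here is only that each signal can be computed without a learned reward model.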
 ### Performance Highlights
 Our Korean-specific multi-stage training pipeline significantly enhances the Phi-4 base model's understanding of Korean context, reasoning depth, and response capabilities. The model excels at:
 - Generating nuanced Korean chains-of-thought (CoT)
 - Performing rigorous self-verification
 - Solving multi-step complex problems
 - Maintaining cultural and linguistic context in reasoning
 - Distinguishing between deep thinking and concise answers using the `<think>` and `<answer>` tags
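Because the reasoning and the final answer are wrapped in `<think>` and `<answer>` tags, downstream code can separate them with a small parser. The sketch below is one possible approach; `ParsedResponse` and `split_response` are hypothetical names. It tolerates a missing closing `</answer>` tag, since `generation_config.json` in this commit lists the `</answer>` token id (100355) as an end-of-sequence id and the tag may therefore be absent from decoded text.

```python
import re
from typing import NamedTuple, Optional


class ParsedResponse(NamedTuple):
    thinking: Optional[str]
    answer: Optional[str]


THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
# Accept either a closing tag or the end of the string after <answer>.
ANSWER_RE = re.compile(r"<answer>(.*?)(?:</answer>|\Z)", re.DOTALL)


def split_response(text):
    """Split raw model output into its reasoning and answer parts."""
    think = THINK_RE.search(text)
    answer = ANSWER_RE.search(text)
    return ParsedResponse(
        thinking=think.group(1).strip() if think else None,
        answer=answer.group(1).strip() if answer else None,
    )


# The closing </answer> tag is deliberately missing in this example.
parsed = split_response("<think>어머니는 아들에게 양보하셨다.</think><answer>희생")
```

Returning `None` for a missing part lets the caller distinguish an empty answer from output that never reached the answer stage.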
 ## Evaluation Results
 Below, we present our evaluation results for the DNA-R1 model across math, coding, science, Korean, and general-performance benchmarks.
 Despite having only 14B parameters, DNA-R1 ranks first on GSM8K and second on six of the other nine benchmarks among the models with reported scores, ahead of several larger models.
 <table>
  <thead>
    <tr>
      <th>Benchmark</th>
      <th>Task</th>
      <th>DNA-R1 (14B)</th>
      <th>DeepSeek-R1-Distill-Qwen-14B</th>
      <th>DeepSeek-R1-Distill-Qwen-32B</th>
      <th>EXAONE-3.5-32B-Instruct</th>
      <th>QwQ-32B-Preview</th>
      <th>gpt-4o-0513</th>
      <th>o1-mini</th>
      <th>o1-preview</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>GSM8K</td>
      <td rowspan="4">Math</td>
      <td><b>92.49</b></td>
      <td>88.63</td>
      <td>82.64</td>
      <td><u>91.9</u></td>
      <td>82.41</td>
      <td>-</td>
      <td>-</td>
      <td>-</td>
    </tr>
    <tr>
      <td>Math500</td>
      <td><u>89.4</u></td>
      <td>88.2</td>
      <td>87.4</td>
      <td>75.8</td>
      <td><b>92.2</b></td>
      <td>75.8</td>
      <td>85.6</td>
      <td>81.4</td>
    </tr>
    <tr>
      <td>AIME2024</td>
      <td>53.3</td>
      <td><u>69.7</u></td>
      <td><b>72.6</b></td>
      <td>6.67</td>
      <td>50.0</td>
      <td>8.6</td>
      <td>64.0</td>
      <td>40</td>
    </tr>
    <tr>
      <td>OlympiadBench (Math, EN)</td>
      <td><u>59.94</u></td>
      <td>56.82</td>
      <td>55.34</td>
      <td>38.58</td>
      <td><b>62.17</b></td>
      <td>-</td>
      <td>-</td>
      <td>59.2</td>
    </tr>
    <tr>
      <td>GPQA-Diamond</td>
      <td>Science/Reasoning</td>
      <td><u>61.11</u></td>
      <td>59.1</td>
      <td>58.08</td>
      <td>33.33</td>
      <td>52.5</td>
      <td>46.5</td>
      <td>60</td>
      <td><b>75.2</b></td>
    </tr>
    <tr>
      <td>LiveCodeBench</td>
      <td>Coding</td>
      <td>50.58</td>
      <td>59.88</td>
      <td><u>61.65</u></td>
      <td>19.8</td>
      <td>59.12</td>
      <td>50.48</td>
      <td><b>72.75</b></td>
      <td>59.14</td>
    </tr>
    <tr>
      <td>KMMLU-direct</td>
      <td rowspan="3">Korean</td>
      <td><u>59.9</u></td>
      <td>50.5</td>
      <td>58.62</td>
      <td>50.72</td>
      <td><b>62.96</b></td>
      <td>-</td>
      <td>-</td>
      <td>-</td>
    </tr>
    <tr>
      <td>KMMLU-hard</td>
      <td><u>36.65</u></td>
      <td>25.34</td>
      <td>33.67</td>
      <td>25.46</td>
      <td><b>37.98</b></td>
      <td>-</td>
      <td>-</td>
      <td>-</td>
    </tr>
    <tr>
      <td>KoBEST</td>
      <td>83.05</td>
      <td>74.32</td>
      <td>78.53</td>
      <td><b>86.54</b></td>
      <td><u>85.93</u></td>
      <td>-</td>
      <td>-</td>
      <td>-</td>
    </tr>
    <tr>
      <td>MMLU-Pro</td>
      <td>General</td>
      <td><u>57.64</u></td>
      <td>50.55</td>
      <td><b>59.58</b></td>
      <td>-</td>
      <td>46.82</td>
      <td>-</td>
      <td>-</td>
      <td>-</td>
    </tr>
  </tbody>
 </table>
 - The highest scores are in **bold**, and the second-highest scores are <u>underlined</u>.
 - All benchmarks are evaluated with [lm-eval](https://github.com/EleutherAI/lm-evaluation-harness) and [skythought-eval](https://github.com/NovaSky-AI/SkyThought/tree/main/skythought/evals).
 ## Quickstart
 ```python
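 from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
 # Load the tokenizer and model; device_map='auto' places the weights on the available devices.
 tokenizer = AutoTokenizer.from_pretrained('dnotitia/DNA-R1')
 model = AutoModelForCausalLM.from_pretrained('dnotitia/DNA-R1', device_map='auto')
 # Print decoded tokens to stdout as they are generated, omitting the prompt.
 streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)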
 conversation = [
    {"role": "user", "content": """
 어려서부터 우리 집은 가난했었고
 남들 다하는 외식 몇 번 한 적이 없었고
 일터에 나가신 어머니 집에 없으면
 언제나 혼자서 끓여 먹었던 라면
 그러다 라면이 너무 지겨워서
 맛있는 것 좀 먹자고 대들었었어
 그러자 어머님이 마지못해 꺼내신
 숨겨두신 비상금으로 시켜주신
 짜장면 하나에 너무나 행복했었어
 하지만 어머님은 왠지 드시질 않았어
 어머님은 짜장면이 싫다고 하셨어
 어머님은 짜장면이 싫다고 하셨어
 야이야~야 그렇게 살아가고
 그렇게 후회하고 눈물도 흘리고
 야이야~야 그렇게 살아가고
 너무나 아프고 하지만 다시 웃고
 ---
 친구가 쓴 시인데, 여기서 친구의 어머니가 짜장면이 싫다고 하신 이유는?사랑or희생?"""},
 ]
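 # Render the conversation with the model's chat template and move the tensors to the model's device.
 inputs = tokenizer.apply_chat_template(conversation,
                                        add_generation_prompt=True,
                                        return_dict=True,
                                        return_tensors="pt").to(model.device)
 # The streamer prints the output, so the returned token ids are discarded.
 _ = model.generate(**inputs, streamer=streamer)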
 ```
 ## License
 This model is released under the CC BY-NC 4.0 license. For questions or commercial-use inquiries, please [contact us](https://www.dnotitia.com/contact/post-form).
 ## Citation
 If you use or discuss this model in your academic research, please cite the project to help spread awareness:
 ```
 @misc{dnar12025,
       title={DNA R1},
      author={Jungyup Lee and Jemin Kim and Sang Park and SeungJae Lee},
      year={2025},
      publisher={HuggingFace},
      url={https://huggingface.co/dnotitia/DNA-R1}
 }
 ```
--- a/added_tokens.json
+++ b/added_tokens.json
@@ -0,0 +1,6 @@
 {
  "</answer>": 100355,
  "</think>": 100353,
  "<answer>": 100354,
  "<think>": 100352
 }
--- a/assets/dna-r1-logo.png
+++ b/assets/dna-r1-logo.png
--- a/assets/dna-r1-pipeline.png
+++ b/assets/dna-r1-pipeline.png
--- a/config.json
+++ b/config.json
@@ -0,0 +1,33 @@
 {
  "architectures": [
    "Phi3ForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "auto_map": {},
  "bos_token_id": 100257,
  "embd_pdrop": 0.0,
  "eos_token_id": 100265,
  "hidden_act": "silu",
  "hidden_size": 5120,
  "initializer_range": 0.02,
  "intermediate_size": 17920,
  "max_position_embeddings": 16384,
  "model_type": "phi3",
  "num_attention_heads": 40,
  "num_hidden_layers": 40,
  "num_key_value_heads": 10,
  "original_max_position_embeddings": 16384,
  "pad_token_id": 100349,
  "partial_rotary_factor": 1.0,
  "resid_pdrop": 0.0,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 250000,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.46.1",
  "use_cache": false,
  "vocab_size": 100356
 }
--- a/configuration.json
+++ b/configuration.json
@@ -0,0 +1 @@
 {"framework": "pytorch", "task": "text-generation", "allow_remote": true}
--- a/generation_config.json
+++ b/generation_config.json
@@ -0,0 +1,16 @@
 {
  "_from_model_config": true,
  "bos_token_id": 100257,
  "eos_token_id": [
    100265,
    100355
  ],
  "pad_token_id": 100349,
  "do_sample": true,
  "temperature": 0.1,
  "top_p": 0.9,
  "max_new_tokens": 4096,
  "frequency_penalty": 0.1,
  "presence_penalty": 0.1,
  "transformers_version": "4.46.1"
 }
--- a/merges.txt
+++ b/merges.txt
--- a/model-00001-of-00006.safetensors
+++ b/model-00001-of-00006.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:41481ba549af54b38e816c1b43d62a065b0d0c474c103b4340c72dfc399a60cf
 size 4933697432
--- a/model-00002-of-00006.safetensors
+++ b/model-00002-of-00006.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:fa8e5614a9c48e82256edb75c334d81c2f835dac6ec854777ea7196b388e74c4
 size 4954690712
--- a/model-00003-of-00006.safetensors
+++ b/model-00003-of-00006.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:cb38373b18fa49c5bf5ae99aac1872cae895813c4cac4c593e3b64542261f24d
 size 4902241352
--- a/model-00004-of-00006.safetensors
+++ b/model-00004-of-00006.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:dfd0ba68b3d6b1a27520400aad6ea1c8fd8c054e62994c0b59ee52b6be0ecbb7
 size 4771169120
--- a/model-00005-of-00006.safetensors
+++ b/model-00005-of-00006.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:e541d89e6893dcf60ffa9f05e98f1f51eaa79de329a6fcee74b744efd7906fe5
 size 4771169120
--- a/model-00006-of-00006.safetensors
+++ b/model-00006-of-00006.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:044d171b218d543fe18c41ea84119baac71a1d766958726d2025d934c95eff61
 size 4986157176
--- a/model.safetensors.index.json
+++ b/model.safetensors.index.json
@@ -0,0 +1,250 @@
 {
  "metadata": {
    "total_size": 29319096320
  },
  "weight_map": {
    "lm_head.weight": "model-00006-of-00006.safetensors",
    "model.embed_tokens.weight": "model-00001-of-00006.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.0.mlp.gate_up_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.0.self_attn.qkv_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.1.input_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.1.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.1.mlp.gate_up_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.1.self_attn.qkv_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.10.input_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.10.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.10.mlp.gate_up_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.10.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.10.self_attn.o_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.10.self_attn.qkv_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.11.input_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.11.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.11.mlp.gate_up_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.11.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.11.self_attn.o_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.11.self_attn.qkv_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.12.input_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.12.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.12.mlp.gate_up_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.12.self_attn.qkv_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.13.input_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.13.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.13.mlp.gate_up_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.13.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.13.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.13.self_attn.qkv_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.14.input_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.14.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.14.mlp.gate_up_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.14.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.14.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.14.self_attn.qkv_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.15.input_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.15.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.15.mlp.gate_up_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.15.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.15.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.15.self_attn.qkv_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.16.input_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.16.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.16.mlp.gate_up_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.16.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.16.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.16.self_attn.qkv_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.17.input_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.17.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.17.mlp.gate_up_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.17.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.17.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.17.self_attn.qkv_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.18.input_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.18.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.18.mlp.gate_up_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.18.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.18.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.18.self_attn.qkv_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.19.input_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.19.mlp.down_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.19.mlp.gate_up_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.19.post_attention_layernorm.weight": "model-00003-of-00006.safetensors",
    "model.layers.19.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.19.self_attn.qkv_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.2.input_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.2.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.2.mlp.gate_up_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.2.self_attn.qkv_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.20.input_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.20.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.20.mlp.gate_up_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.20.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.20.self_attn.o_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.20.self_attn.qkv_proj.weight": "model-00003-of-00006.safetensors",
    "model.layers.21.input_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.21.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.21.mlp.gate_up_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.21.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.21.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.21.self_attn.qkv_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.22.input_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.22.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.22.mlp.gate_up_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.22.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.22.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.22.self_attn.qkv_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.23.input_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.23.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.23.mlp.gate_up_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.23.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.23.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.23.self_attn.qkv_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.24.input_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.24.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.24.mlp.gate_up_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.24.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.24.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.24.self_attn.qkv_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.25.input_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.25.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.25.mlp.gate_up_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.25.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.25.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.25.self_attn.qkv_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.26.input_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.26.mlp.down_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.26.mlp.gate_up_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.26.post_attention_layernorm.weight": "model-00004-of-00006.safetensors",
    "model.layers.26.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.26.self_attn.qkv_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.27.input_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.27.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.27.mlp.gate_up_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.27.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.27.self_attn.o_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.27.self_attn.qkv_proj.weight": "model-00004-of-00006.safetensors",
    "model.layers.28.input_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.28.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.28.mlp.gate_up_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.28.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.28.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.28.self_attn.qkv_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.29.input_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.29.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.29.mlp.gate_up_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.29.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.29.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.29.self_attn.qkv_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.3.input_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.3.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.3.mlp.gate_up_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.3.self_attn.qkv_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.30.input_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.30.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.30.mlp.gate_up_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.30.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.30.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.30.self_attn.qkv_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.31.input_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.31.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.31.mlp.gate_up_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.31.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.31.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.31.self_attn.qkv_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.32.input_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.32.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.32.mlp.gate_up_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.32.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.32.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.32.self_attn.qkv_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.33.input_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.33.mlp.down_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.33.mlp.gate_up_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.33.post_attention_layernorm.weight": "model-00005-of-00006.safetensors",
    "model.layers.33.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.33.self_attn.qkv_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.34.input_layernorm.weight": "model-00006-of-00006.safetensors",
    "model.layers.34.mlp.down_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.34.mlp.gate_up_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.34.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
    "model.layers.34.self_attn.o_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.34.self_attn.qkv_proj.weight": "model-00005-of-00006.safetensors",
    "model.layers.35.input_layernorm.weight": "model-00006-of-00006.safetensors",
    "model.layers.35.mlp.down_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.35.mlp.gate_up_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.35.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
    "model.layers.35.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.35.self_attn.qkv_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.36.input_layernorm.weight": "model-00006-of-00006.safetensors",
    "model.layers.36.mlp.down_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.36.mlp.gate_up_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.36.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
    "model.layers.36.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.36.self_attn.qkv_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.37.input_layernorm.weight": "model-00006-of-00006.safetensors",
    "model.layers.37.mlp.down_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.37.mlp.gate_up_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.37.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
    "model.layers.37.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.37.self_attn.qkv_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.38.input_layernorm.weight": "model-00006-of-00006.safetensors",
    "model.layers.38.mlp.down_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.38.mlp.gate_up_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.38.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
    "model.layers.38.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.38.self_attn.qkv_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.39.input_layernorm.weight": "model-00006-of-00006.safetensors",
    "model.layers.39.mlp.down_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.39.mlp.gate_up_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.39.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
    "model.layers.39.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.39.self_attn.qkv_proj.weight": "model-00006-of-00006.safetensors",
    "model.layers.4.input_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.4.mlp.down_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.4.mlp.gate_up_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00006.safetensors",
    "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.4.self_attn.qkv_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.5.input_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.5.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.5.mlp.gate_up_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.5.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.5.self_attn.qkv_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.6.input_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.6.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.6.mlp.gate_up_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.6.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.6.self_attn.o_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.6.self_attn.qkv_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.7.input_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.7.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.7.mlp.gate_up_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.7.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.7.self_attn.o_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.7.self_attn.qkv_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.8.input_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.8.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.8.mlp.gate_up_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.8.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.8.self_attn.o_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.8.self_attn.qkv_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.9.input_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.9.mlp.down_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.9.mlp.gate_up_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.9.post_attention_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.9.self_attn.o_proj.weight": "model-00002-of-00006.safetensors",
    "model.layers.9.self_attn.qkv_proj.weight": "model-00002-of-00006.safetensors",
    "model.norm.weight": "model-00006-of-00006.safetensors"
  }
 }
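The `weight_map` above assigns each tensor name to one of six shard files, and a single layer can straddle two shards (layer 5 is split between shards 00001 and 00002). A minimal sketch of how such an index might be inspected follows. It assumes the index is a JSON object with a `weight_map` key, as the hunk above suggests; the helper names `group_by_shard` and `shards_for_layer` are hypothetical, and the inline sample is only a four-entry excerpt, not the full index.

```python
import json
from collections import defaultdict

# Four entries copied from the weight_map above; a real index holds many more.
INDEX_JSON = """
{
  "weight_map": {
    "model.layers.4.self_attn.qkv_proj.weight": "model-00001-of-00006.safetensors",
    "model.layers.5.input_layernorm.weight": "model-00002-of-00006.safetensors",
    "model.layers.5.mlp.gate_up_proj.weight": "model-00001-of-00006.safetensors",
    "model.norm.weight": "model-00006-of-00006.safetensors"
  }
}
"""


def group_by_shard(index):
    """Return {shard file name: sorted list of tensor names stored in it}."""
    shards = defaultdict(list)
    for tensor_name, shard_file in index["weight_map"].items():
        shards[shard_file].append(tensor_name)
    return {shard: sorted(names) for shard, names in shards.items()}


def shards_for_layer(index, layer):
    """Return the sorted shard files that hold any tensor of the given layer."""
    # The trailing dot keeps layer 5 from also matching layers 50-59.
    prefix = f"model.layers.{layer}."
    return sorted(
        {shard for name, shard in index["weight_map"].items() if name.startswith(prefix)}
    )


if __name__ == "__main__":
    index = json.loads(INDEX_JSON)
    for shard, names in sorted(group_by_shard(index).items()):
        print(shard, len(names))
    print(shards_for_layer(index, 5))
```

A loader that wants every tensor of layer 5 therefore has to open two shard files, which is why the lookup goes through the index rather than assuming one file per layer.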
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1,33 @@
 {
  "additional_special_tokens": [
    "<|im_end|>"
  ],
  "bos_token": {
    "content": "<|endoftext|>",
    "lstrip": true,
    "normalized": false,
    "rstrip": true,
    "single_word": false
  },
  "eos_token": {
    "content": "<|im_end|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<|dummy_85|>",
    "lstrip": true,
    "normalized": false,
    "rstrip": true,
    "single_word": false
  },
  "unk_token": {
    "content": "<|endoftext|>",
    "lstrip": true,
    "normalized": false,
    "rstrip": true,
    "single_word": false
  }
 }
--- a/tokenizer.json
+++ b/tokenizer.json
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:0c0835acb86f383b9741ed3681ae31827f02738ce071553f45b56cb37ac578fe
 size 7153991
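The three lines above are a Git LFS pointer, not the tokenizer itself: the repository stores only a spec version, a SHA-256 object id, and a byte size. As a rough sketch, such a pointer could be parsed as below. The function name `parse_lfs_pointer` is hypothetical, and the sketch assumes each line is a key, one space, then a value, which is what the hunk shows.

```python
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:0c0835acb86f383b9741ed3681ae31827f02738ce071553f45b56cb37ac578fe
size 7153991
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


if __name__ == "__main__":
    print(parse_lfs_pointer(POINTER_TEXT))
```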
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1,818 @@
 {
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "100256": {
      "content": "<|dummy_0|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100257": {
      "content": "<|endoftext|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100258": {
      "content": "<|fim_prefix|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100259": {
      "content": "<|fim_middle|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100260": {
      "content": "<|fim_suffix|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100261": {
      "content": "<|dummy_1|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100262": {
      "content": "<|dummy_2|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100263": {
      "content": "<|dummy_3|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100264": {
      "content": "<|im_start|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100265": {
      "content": "<|im_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100266": {
      "content": "<|im_sep|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100267": {
      "content": "<|dummy_4|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100268": {
      "content": "<|dummy_5|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100269": {
      "content": "<|dummy_6|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100270": {
      "content": "<|dummy_7|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100271": {
      "content": "<|dummy_8|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100272": {
      "content": "<|dummy_9|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100273": {
      "content": "<|dummy_10|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100274": {
      "content": "<|dummy_11|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100275": {
      "content": "<|dummy_12|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100276": {
      "content": "<|endofprompt|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100277": {
      "content": "<|dummy_13|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100278": {
      "content": "<|dummy_14|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100279": {
      "content": "<|dummy_15|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100280": {
      "content": "<|dummy_16|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100281": {
      "content": "<|dummy_17|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100282": {
      "content": "<|dummy_18|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100283": {
      "content": "<|dummy_19|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100284": {
      "content": "<|dummy_20|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100285": {
      "content": "<|dummy_21|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100286": {
      "content": "<|dummy_22|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100287": {
      "content": "<|dummy_23|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100288": {
      "content": "<|dummy_24|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100289": {
      "content": "<|dummy_25|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100290": {
      "content": "<|dummy_26|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100291": {
      "content": "<|dummy_27|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100292": {
      "content": "<|dummy_28|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100293": {
      "content": "<|dummy_29|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100294": {
      "content": "<|dummy_30|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100295": {
      "content": "<|dummy_31|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100296": {
      "content": "<|dummy_32|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100297": {
      "content": "<|dummy_33|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100298": {
      "content": "<|dummy_34|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100299": {
      "content": "<|dummy_35|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100300": {
      "content": "<|dummy_36|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100301": {
      "content": "<|dummy_37|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100302": {
      "content": "<|dummy_38|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100303": {
      "content": "<|dummy_39|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100304": {
      "content": "<|dummy_40|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100305": {
      "content": "<|dummy_41|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100306": {
      "content": "<|dummy_42|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100307": {
      "content": "<|dummy_43|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100308": {
      "content": "<|dummy_44|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100309": {
      "content": "<|dummy_45|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100310": {
      "content": "<|dummy_46|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100311": {
      "content": "<|dummy_47|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100312": {
      "content": "<|dummy_48|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100313": {
      "content": "<|dummy_49|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100314": {
      "content": "<|dummy_50|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100315": {
      "content": "<|dummy_51|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100316": {
      "content": "<|dummy_52|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100317": {
      "content": "<|dummy_53|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100318": {
      "content": "<|dummy_54|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100319": {
      "content": "<|dummy_55|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100320": {
      "content": "<|dummy_56|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100321": {
      "content": "<|dummy_57|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100322": {
      "content": "<|dummy_58|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100323": {
      "content": "<|dummy_59|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100324": {
      "content": "<|dummy_60|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100325": {
      "content": "<|dummy_61|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100326": {
      "content": "<|dummy_62|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100327": {
      "content": "<|dummy_63|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100328": {
      "content": "<|dummy_64|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100329": {
      "content": "<|dummy_65|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100330": {
      "content": "<|dummy_66|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100331": {
      "content": "<|dummy_67|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100332": {
      "content": "<|dummy_68|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100333": {
      "content": "<|dummy_69|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100334": {
      "content": "<|dummy_70|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100335": {
      "content": "<|dummy_71|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100336": {
      "content": "<|dummy_72|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100337": {
      "content": "<|dummy_73|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100338": {
      "content": "<|dummy_74|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100339": {
      "content": "<|dummy_75|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100340": {
      "content": "<|dummy_76|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100341": {
      "content": "<|dummy_77|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100342": {
      "content": "<|dummy_78|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100343": {
      "content": "<|dummy_79|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100344": {
      "content": "<|dummy_80|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100345": {
      "content": "<|dummy_81|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100346": {
      "content": "<|dummy_82|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100347": {
      "content": "<|dummy_83|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100348": {
      "content": "<|dummy_84|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100349": {
      "content": "<|dummy_85|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100350": {
      "content": "<|dummy_86|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100351": {
      "content": "<|dummy_87|>",
      "lstrip": true,
      "normalized": false,
      "rstrip": true,
      "single_word": false,
      "special": true
    },
    "100352": {
      "content": "<think>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "100353": {
      "content": "</think>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "100354": {
      "content": "<answer>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "100355": {
      "content": "</answer>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": false
    }
  },
  "additional_special_tokens": [
    "<|im_end|>"
  ],
  "bos_token": "<|endoftext|>",
  "chat_template": "{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{% set system_message = messages[0]['content'] %}{% else %}{% set loop_messages = messages %}{% endif %}{% if system_message is defined %}{{'<|im_start|>system<|im_sep|>' + system_message + '<|im_end|>'}}{% endif %}{% for message in loop_messages %}{% if message['role'] == 'user' %}{{'<|im_start|>user<|im_sep|>\\n' + message['content'] + '\\n<|im_end|>'}}{% endif %}{% if message['role'] == 'assistant' and message['content'] is not none %}{% set content = message['content'] %}{% if '</think>' in content %}{% set content = content.split('</think>')[-1] %}{% endif %}{{'<|im_start|>assistant<|im_sep|>' + content + '<|im_end|>'}}{% endif %}{% endfor %}{% if add_generation_prompt %}{{'<|im_start|>assistant<|im_sep|><think>'}}{% endif %}",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|im_end|>",
  "extra_special_tokens": {},
  "model_max_length": 16384,
  "pad_token": "<|dummy_85|>",
  "split_special_tokens": false,
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": "<|endoftext|>"
 }
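The `chat_template` above is a one-line Jinja template, so its behaviour is easy to misread. The sketch below re-expresses the same branching in plain Python to show what string it would produce. It is an illustrative re-implementation, not the code that the tokenizer library runs; the function name `render_chat` is hypothetical, and the guard for an empty message list is an addition of this sketch that the template itself does not have.

```python
def render_chat(messages, add_generation_prompt=False):
    """Mirror the branching of the chat_template in tokenizer_config.json."""
    parts = []
    # An optional leading system message is emitted once, without newlines.
    if messages and messages[0]["role"] == "system":
        parts.append(
            "<|im_start|>system<|im_sep|>" + messages[0]["content"] + "<|im_end|>"
        )
        loop_messages = messages[1:]
    else:
        loop_messages = messages

    for message in loop_messages:
        if message["role"] == "user":
            # User content is wrapped in newlines inside the turn markers.
            parts.append(
                "<|im_start|>user<|im_sep|>\n" + message["content"] + "\n<|im_end|>"
            )
        if message["role"] == "assistant" and message["content"] is not None:
            content = message["content"]
            # Earlier reasoning is dropped: only text after the last </think> is kept.
            if "</think>" in content:
                content = content.split("</think>")[-1]
            parts.append("<|im_start|>assistant<|im_sep|>" + content + "<|im_end|>")

    if add_generation_prompt:
        # The generation prompt already opens a <think> block for the model.
        parts.append("<|im_start|>assistant<|im_sep|><think>")
    return "".join(parts)


if __name__ == "__main__":
    history = [
        {"role": "system", "content": "S"},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "<think>x</think>A"},
        {"role": "user", "content": "Q"},
    ]
    print(render_chat(history, add_generation_prompt=True))
```

Two details stand out in this reading of the template: reasoning from previous assistant turns is stripped before it re-enters the prompt, and generation starts inside an open `<think>` tag, so the model's output begins with its reasoning rather than with the tag itself.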
--- a/vocab.json
+++ b/vocab.json
@@ -0,0 +1 @@
 {"framework": "pytorch", "task": "text-generation", "allow_remote": true}