Initialize project; model provided by the ModelHub XC community

Model: strykes/emberforge-3b-reasoner Source: Original Platform
2026-05-30 19:09:18 +08:00
commit 7c36fbd792
28 changed files with 5552 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,39 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
+gguf/Nanbeige4.1-3B-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+gguf/Nanbeige4.1-3B-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+gguf/Nanbeige4.1-3B-f16.gguf filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,76 @@
+---
+language:
+- en
+license: apache-2.0
+tags:
+- transformers
+- safetensors
+- gguf
+- peft
+- qlora
+- reasoning
+base_model:
+- Nanbeige/Nanbeige4.1-3B
+library_name: transformers
+pipeline_tag: text-generation
+---
+
+# EmberForge-3B-Reasoner
+
+A private fine-tune of Nanbeige4.1-3B for reasoning tasks, released by `strykes`.
+
+## Included Artifacts
+
+- Merged full model (Safetensors) at repo root for HF benchmarking
+- LoRA adapter in `adapter/`
+- GGUF in `gguf/`:
+  - `Nanbeige4.1-3B-Q5_K_M.gguf`
+  - `Nanbeige4.1-3B-Q4_K_M.gguf`
+  - `Nanbeige4.1-3B-f16.gguf`
+- Optional archive in `archives/`
+
+## Training Snapshot
+
+- Base: `Nanbeige/Nanbeige4.1-3B`
+- Method: QLoRA fine-tuning with Unsloth, with the adapter then merged into the base weights
+- Data: ~3.5k synthetic reasoning samples
+- Epochs: 2
+- Sequence length: 4096
+
+## Notes
+
+- Intended for research and benchmarking.
+- Validate outputs before critical use.
+
+## Benchmarks (2026-02-24)
+
+### Local lm-eval results (this finetune)
+
+| Task | Metric | Score |
+|---|---:|---:|
+| mmlu | acc,none | 59.98% |
+| gsm8k | exact_match,flexible-extract | 62.40% |
+| arc_challenge | acc_norm,none | 31.74% |
+| hellaswag | acc_norm,none | 56.07% |
+| winogrande | acc,none | 50.04% |
+| piqa | acc_norm,none | 63.22% |
+| boolq | acc,none | 74.37% |
+| truthfulqa_mc2 | acc,none | 45.34% |
+
+### Public references
+
+- Base model (`Nanbeige/Nanbeige4.1-3B`) author-published benchmarks are listed in:
+  - `benchmarks/lm-eval-2026-02-24/benchmark_comparison_public_2026-02-24.md`
+- Frontier references (Claude/GPT/Gemini) are included in the same comparison report.
+
+### Reproducibility artifacts
+
+- `benchmarks/lm-eval-2026-02-24/summary_v3.tsv`
+- `benchmarks/lm-eval-2026-02-24/results_2026-02-24T00-06-21.474293.json`
+- `benchmarks/lm-eval-2026-02-24/run_v3.log`
+- `benchmarks/lm-eval-2026-02-24/benchmark_comparison_public_2026-02-24.md`
+
+### Caveat
+
+Public model-card comparisons are not always apples-to-apples with lm-evaluation-harness settings (prompting, few-shot, decoding, and benchmark versions can differ).
+
--- a/adapter/README.md
+++ b/adapter/README.md
@@ -0,0 +1,210 @@
+---
+base_model: Nanbeige/Nanbeige4.1-3B
+library_name: peft
+pipeline_tag: text-generation
+tags:
+- base_model:adapter:Nanbeige/Nanbeige4.1-3B
+- lora
+- sft
+- transformers
+- trl
+- unsloth
+---
+
+# Model Card for Model ID
+
+<!-- Provide a quick summary of what the model is/does. -->
+
+
+
+## Model Details
+
+### Model Description
+
+<!-- Provide a longer summary of what this model is. -->
+
+
+
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+
+### Model Sources [optional]
+
+<!-- Provide the basic links for the model. -->
+
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+
+## Uses
+
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+### Direct Use
+
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+[More Information Needed]
+
+### Downstream Use [optional]
+
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+[More Information Needed]
+
+### Out-of-Scope Use
+
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+[More Information Needed]
+
+## Bias, Risks, and Limitations
+
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+[More Information Needed]
+
+### Recommendations
+
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+## How to Get Started with the Model
+
+Use the code below to get started with the model.
+
+[More Information Needed]
+
+## Training Details
+
+### Training Data
+
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+[More Information Needed]
+
+### Training Procedure
+
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+#### Preprocessing [optional]
+
+[More Information Needed]
+
+
+#### Training Hyperparameters
+
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+#### Speeds, Sizes, Times [optional]
+
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+[More Information Needed]
+
+## Evaluation
+
+<!-- This section describes the evaluation protocols and provides the results. -->
+
+### Testing Data, Factors & Metrics
+
+#### Testing Data
+
+<!-- This should link to a Dataset Card if possible. -->
+
+[More Information Needed]
+
+#### Factors
+
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+[More Information Needed]
+
+#### Metrics
+
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+[More Information Needed]
+
+### Results
+
+[More Information Needed]
+
+#### Summary
+
+
+
+## Model Examination [optional]
+
+<!-- Relevant interpretability work for the model goes here -->
+
+[More Information Needed]
+
+## Environmental Impact
+
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+
+## Technical Specifications [optional]
+
+### Model Architecture and Objective
+
+[More Information Needed]
+
+### Compute Infrastructure
+
+[More Information Needed]
+
+#### Hardware
+
+[More Information Needed]
+
+#### Software
+
+[More Information Needed]
+
+## Citation [optional]
+
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+**BibTeX:**
+
+[More Information Needed]
+
+**APA:**
+
+[More Information Needed]
+
+## Glossary [optional]
+
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+[More Information Needed]
+
+## More Information [optional]
+
+[More Information Needed]
+
+## Model Card Authors [optional]
+
+[More Information Needed]
+
+## Model Card Contact
+
+[More Information Needed]
+### Framework versions
+
+- PEFT 0.18.1
--- a/adapter/adapter_config.json
+++ b/adapter/adapter_config.json
@@ -0,0 +1,50 @@
+{
+  "alora_invocation_tokens": null,
+  "alpha_pattern": {},
+  "arrow_config": null,
+  "auto_mapping": {
+    "base_model_class": "LlamaForCausalLM",
+    "parent_library": "transformers.models.llama.modeling_llama",
+    "unsloth_fixed": true
+  },
+  "base_model_name_or_path": "Nanbeige/Nanbeige4.1-3B",
+  "bias": "none",
+  "corda_config": null,
+  "ensure_weight_tying": false,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 128,
+  "lora_bias": false,
+  "lora_dropout": 0,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "peft_version": "0.18.1",
+  "qalora_group_size": 16,
+  "r": 64,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "down_proj",
+    "up_proj",
+    "gate_proj",
+    "o_proj",
+    "k_proj",
+    "v_proj",
+    "q_proj"
+  ],
+  "target_parameters": null,
+  "task_type": "CAUSAL_LM",
+  "trainable_token_indices": null,
+  "use_dora": false,
+  "use_qalora": false,
+  "use_rslora": false
+}
--- a/adapter/adapter_model.safetensors
+++ b/adapter/adapter_model.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7983f9ec6827018eeffa27618229f4c6a1326ee107c8fbe2c268301afcb47e22
+size 455142376
--- a/adapter/added_tokens.json
+++ b/adapter/added_tokens.json
@@ -0,0 +1,9 @@
+{
+  "</think>": 166104,
+  "</tool_call>": 166106,
+  "<think>": 166103,
+  "<tool_call>": 166105,
+  "<|endoftext|>": 166102,
+  "<|im_end|>": 166101,
+  "<|im_start|>": 166100
+}
--- a/adapter/chat_template.jinja
+++ b/adapter/chat_template.jinja
@@ -0,0 +1,137 @@
+
+        {%- if tools %}
+            {{- '<|im_start|>system
+' }}
+            {%- if messages[0].role == 'system' %}
+                {{- messages[0].content + '
+
+' }}
+            {%- else %} 
+                {{- '你是一位工具函数调用专家，你会得到一个问题和一组可能的工具函数。根据问题，你需要进行一个或多个函数/工具调用以实现目的，请尽量尝试探索通过工具解决问题。
+如果没有一个函数可以使用，请直接使用自然语言回复用户。
+如果给定的问题缺少函数所需的参数，请使用自然语言进行提问，向用户询问必要信息。
+如果调用结果已经足够回答用户问题，请对历史结果进行总结，使用自然语言回复用户。' }} 
+            {%- endif %}
+            {{- "# Tools
+
+You may call one or more functions to assist with the user query.
+
+You are provided with function signatures within <tools></tools> XML tags:
+<tools>" }}
+            {%- for tool in tools %}
+                {{- "
+" }}
+                {{- tool | tojson }}
+            {%- endfor %}
+            {{- "
+</tools>
+
+For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
+<tool_call>
+{\"name\": <function-name>, \"arguments\": <args-json-object>}
+</tool_call><|im_end|>
+" }}
+        {%- else %}
+            {%- if messages[0].role == 'system' %}
+                {{- '<|im_start|>system
+' + messages[0].content + '<|im_end|>
+' }}
+            {%- else %} 
+                {{- '<|im_start|>system
+你是南北阁，一款由BOSS直聘自主研发并训练的专业大语言模型。<|im_end|>
+' }} 
+            {%- endif %}
+        {%- endif %}
+        {%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
+        {%- for message in messages[::-1] %}
+            {%- set index = (messages|length - 1) - loop.index0 %}
+            {%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
+                {%- set ns.multi_step_tool = false %}
+                {%- set ns.last_query_index = index %}
+            {%- endif %}
+        {%- endfor %}
+        {%- for message in messages %}
+            {%- if message.content is string %}
+                {%- set content = message.content %}
+            {%- else %}
+                {%- set content = '' %}
+            {%- endif %}
+            {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
+                {{- '<|im_start|>' + message.role + '
+' + content + '<|im_end|>' + '
+' }}
+            {%- elif message.role == "assistant" %}
+                {%- set reasoning_content = '' %}
+                {%- if message.reasoning_content is string %}
+                    {%- set reasoning_content = message.reasoning_content %}
+                {%- else %}
+                    {%- if '</think>' in content %}
+                        {%- set reasoning_content = content.split('</think>')[0].rstrip('
+').split('<think>')[-1].lstrip('
+') %}
+                        {%- set content = content.split('</think>')[-1].lstrip('
+') %}
+                    {%- endif %}
+                {%- endif %}
+                {%- if loop.index0 > ns.last_query_index or keep_all_think or (extra_body is defined and extra_body.keep_all_think) %}
+                    {%- if loop.last or (not loop.last and reasoning_content) %}
+                        {{- '<|im_start|>' + message.role + '
+<think>
+' + reasoning_content.strip('
+') + '
+</think>
+
+' + content.lstrip('
+') }}
+                    {%- else %}
+                        {{- '<|im_start|>' + message.role + '
+' + content }}
+                    {%- endif %}
+                {%- else %}
+                    {{- '<|im_start|>' + message.role + '
+' + content }}
+                {%- endif %}
+                {%- if message.tool_calls %}
+                    {%- for tool_call in message.tool_calls %}
+                        {%- if (loop.first and content) or (not loop.first) %}
+                            {{- '
+' }}
+                        {%- endif %}
+                        {%- if tool_call.function %}
+                            {%- set tool_call = tool_call.function %}
+                        {%- endif %}
+                        {{- '<tool_call>
+{"name": "' }}
+                        {{- tool_call.name }}
+                        {{- '", "arguments": ' }}
+                        {%- if tool_call.arguments is string %}
+                            {{- tool_call.arguments }}
+                        {%- else %}
+                            {{- tool_call.arguments | tojson }}
+                        {%- endif %}
+                        {{- '}
+</tool_call>' }}
+                    {%- endfor %}
+                {%- endif %}
+                {{- '<|im_end|>
+' }}
+            {%- elif message.role == "tool" %}
+                {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
+                    {{- '<|im_start|>user' }}
+                {%- endif %}
+                {{- '
+<tool_response>
+' }}
+                {{- content }}
+                {{- '
+</tool_response>' }}
+                {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+                    {{- '<|im_end|>
+' }}
+                {%- endif %}
+            {%- endif %}
+        {%- endfor %}
+        {%- if add_generation_prompt %}
+            {{- '<|im_start|>assistant
+' }}
+        {%- endif %}
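For assistant messages that carry no separate `reasoning_content` field, the template above recovers the reasoning by splitting the message text on `</think>` and `<think>`. A minimal Python sketch of that same string logic follows; the function name `split_reasoning` and the sample message are illustrative assumptions, not part of the repository.

```python
def split_reasoning(content: str) -> tuple[str, str]:
    """Mirror the template's split of an assistant message into
    (reasoning_content, visible_content). Hypothetical helper."""
    reasoning = ""
    if "</think>" in content:
        # Text before the first </think>, minus anything up to the last <think>.
        reasoning = (
            content.split("</think>")[0]
            .rstrip("\n")
            .split("<think>")[-1]
            .lstrip("\n")
        )
        # Text after the last </think> is the visible answer.
        content = content.split("</think>")[-1].lstrip("\n")
    return reasoning, content


# Illustrative message, not taken from the training data.
raw = "<think>\n2 + 2 is 4.\n</think>\n\nThe answer is 4."
reasoning, answer = split_reasoning(raw)
print(repr(reasoning))
print(repr(answer))
```

Messages without a `</think>` marker pass through unchanged with empty reasoning, which matches the template's default of `reasoning_content = ''`.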
--- a/adapter/special_tokens_map.json
+++ b/adapter/special_tokens_map.json
@@ -0,0 +1,33 @@
+{
+  "additional_special_tokens": [
+    "<|endoftext|>"
+  ],
+  "bos_token": {
+    "content": "<|im_start|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "<|im_end|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}
--- a/adapter/tokenizer.model
+++ b/adapter/tokenizer.model
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fb41d04798b714520a9b075727b0226538b7330254299062742c50ec8374bc36
+size 2782298
--- a/adapter/tokenizer_config.json
+++ b/adapter/tokenizer_config.json
@@ -0,0 +1,103 @@
+{
+  "add_bos_token": true,
+  "add_eos_token": false,
+  "add_prefix_space": true,
+  "added_tokens_decoder": {
+    "0": {
+      "content": "<unk>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "<s>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "</s>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "166100": {
+      "content": "<|im_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "166101": {
+      "content": "<|im_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "166102": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "166103": {
+      "content": "<think>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "166104": {
+      "content": "</think>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "166105": {
+      "content": "<tool_call>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "166106": {
+      "content": "</tool_call>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    }
+  },
+  "additional_special_tokens": [
+    "<|endoftext|>"
+  ],
+  "bos_token": "<|im_start|>",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|im_end|>",
+  "extra_special_tokens": {},
+  "legacy": false,
+  "model_max_length": 262144,
+  "pad_token": "<unk>",
+  "padding_side": "left",
+  "sp_model_kwargs": {},
+  "spaces_between_special_tokens": false,
+  "tokenizer_class": "LlamaTokenizer",
+  "unk_token": "<unk>",
+  "use_default_system_prompt": false
+}
--- a/added_tokens.json
+++ b/added_tokens.json
@@ -0,0 +1,9 @@
+{
+  "</think>": 166104,
+  "</tool_call>": 166106,
+  "<think>": 166103,
+  "<tool_call>": 166105,
+  "<|endoftext|>": 166102,
+  "<|im_end|>": 166101,
+  "<|im_start|>": 166100
+}
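As a quick sanity check on the token map above, the sketch below confirms that the added IDs form one contiguous block and sit below the `vocab_size` of 166144 declared in `config.json` from this same commit. The variable names are illustrative; the values are copied from the two files.

```python
# IDs copied from added_tokens.json in this commit.
ADDED_TOKENS = {
    "</think>": 166104,
    "</tool_call>": 166106,
    "<think>": 166103,
    "<tool_call>": 166105,
    "<|endoftext|>": 166102,
    "<|im_end|>": 166101,
    "<|im_start|>": 166100,
}
VOCAB_SIZE = 166144  # "vocab_size" in config.json

ids = sorted(ADDED_TOKENS.values())
# Contiguous means every integer between the lowest and highest ID is used.
is_contiguous = ids == list(range(ids[0], ids[-1] + 1))
# Every ID must index a valid row of the embedding matrix.
fits_vocab = ids[-1] < VOCAB_SIZE
# IDs above the highest added token that the embedding matrix still covers.
remaining_ids = VOCAB_SIZE - (ids[-1] + 1)

print(is_contiguous, fits_vocab, remaining_ids)
```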
--- a/benchmarks/lm-eval-2026-02-24/benchmark_comparison_public_2026-02-24.md
+++ b/benchmarks/lm-eval-2026-02-24/benchmark_comparison_public_2026-02-24.md
@@ -0,0 +1,70 @@
+# Emberforge 3B Benchmark Comparison (Public + Local)
+
+Generated: 2026-02-24
+
+## 1) Your Finetuned Model (local lm-eval run)
+Model: `strykes/emberforge-3b-reasoner`
+
+| Task | Metric | Score |
+|---|---:|---:|
+| mmlu | acc,none | 59.98% |
+| gsm8k | exact_match,flexible-extract | 62.40% |
+| arc_challenge | acc_norm,none | 31.74% |
+| hellaswag | acc_norm,none | 56.07% |
+| winogrande | acc,none | 50.04% |
+| piqa | acc_norm,none | 63.22% |
+| boolq | acc,none | 74.37% |
+| truthfulqa_mc2 | acc,none | 45.34% |
+
+## 2) Public Base Model (Nanbeige4.1-3B)
+Model: `Nanbeige/Nanbeige4.1-3B` (author-reported benchmarks)
+
+| Benchmark | Published Score |
+|---|---:|
+| Live-Code-Bench-V6 | 76.90% |
+| AIME 2026 I | 87.40% |
+| HMMT Nov | 77.92% |
+| GPQA | 83.80% |
+| HLE (Text-only) | 12.60% |
+| Arena-Hard-v2 | 73.20% |
+| BFCL-V4 | 56.50% |
+| Tau2-Bench | 48.57% |
+
+Note: Nanbeige's published benchmarks do not overlap with your lm-eval task set (`mmlu`, `gsm8k`, `arc_challenge`, etc.), so no apples-to-apples delta can be computed without rerunning identical tasks on the base model.
+
+## 3) Public Frontier Reference (Claude / GPT / Gemini) on overlapping classic tasks
+Source benchmark table: Anthropic Claude 3 model card (March 2024).
+
+| Benchmark | Your model | Claude 3 Opus | Claude 3 Sonnet | GPT-4 | Gemini 1.0 Ultra | Gemini 1.5 Pro |
+|---|---:|---:|---:|---:|---:|---:|
+| MMLU (5-shot) | 59.98% | 86.80% | 79.00% | 86.40% | 83.70% | 81.90% |
+| GSM8K | 62.40% | 95.00% | 92.30% | 92.00% | 94.40% | 91.70% |
+| ARC-Challenge (25-shot) | 31.74% | 96.40% | 93.20% | 96.30% | — | — |
+| HellaSwag (10-shot) | 56.07% | 95.40% | 89.00% | 95.30% | 87.80% | 92.50% |
+| WinoGrande (5-shot) | 50.04% | 88.50% | 75.10% | 87.50% | — | — |
+
+## 4) Latest Frontier Snapshot (2025-2026, non-overlapping tasks)
+Source benchmark table: Claude Opus 4.5 system card, Table 2.3.A.
+
+| Benchmark | Claude Opus 4.5 | Claude Sonnet 4.5 | Claude Opus 4.1 | Gemini 3 Pro | GPT-5.1 |
+|---|---:|---:|---:|---:|---:|
+| SWE-bench Verified | 80.9% | 77.2% | 74.5% | 76.2% | 76.3% |
+| Terminal-bench 2.0 | 59.3% | 50.0% | 46.5% | 54.2% | 47.6% |
+| ARC-AGI-2 (Verified) | 37.6% | 13.6% | — | 31.1% | 17.6% |
+| GPQA Diamond | 87.0% | 83.4% | 81.0% | 91.9% | 88.1% |
+| MMMU (validation) | 80.7% | 77.8% | 77.1% | — | 85.4% |
+| MMMLU | 90.8% | 89.1% | 89.5% | 91.8% | 91.0% |
+
+Note: These are newer references but still not directly comparable to your current lm-eval task set.
+
+## 5) Caveats
+- Your run uses `lm-evaluation-harness` with specific settings; public model-card numbers may use different prompts, few-shot counts, decoding, or evaluation code.
+- Frontier references in Section 3 are older than current 2026 generations but are official primary-source numbers on overlapping classic benchmarks.
+- Frontier references in Section 4 are current (2025-2026) but mostly on different benchmarks.
+
+## Sources
+- Local run artifact: `/workspace/evals/main_results_v3.json/strykes__emberforge-3b-reasoner/results_2026-02-24T00-06-21.474293.json`
+- Nanbeige model card: https://huggingface.co/Nanbeige/Nanbeige4.1-3B
+- Anthropic Claude 3 model card (benchmarks table): https://www-cdn.anthropic.com/c6a80a657af445f40e31afac050f3bf76d3b1404.pdf
+- Anthropic model cards index: https://www.anthropic.com/system-cards
+- Anthropic Claude Opus 4.5 system card: https://www-cdn.anthropic.com/bf10f64990cfda0ba858290be7b8cc6317685f47.pdf
--- a/benchmarks/lm-eval-2026-02-24/results_2026-02-24T00-06-21.474293.json
+++ b/benchmarks/lm-eval-2026-02-24/results_2026-02-24T00-06-21.474293.json
--- a/benchmarks/lm-eval-2026-02-24/run_v3.log
+++ b/benchmarks/lm-eval-2026-02-24/run_v3.log
--- a/benchmarks/lm-eval-2026-02-24/summary_v3.tsv
+++ b/benchmarks/lm-eval-2026-02-24/summary_v3.tsv
@@ -0,0 +1,70 @@
+task	metric	value
+arc_challenge	acc_norm,none	0.3174061433447099
+boolq	acc,none	0.7437308868501529
+gsm8k	exact_match,flexible-extract	0.6239575435936315
+hellaswag	acc_norm,none	0.560744871539534
+mmlu	acc,none	0.5997721122347244
+mmlu_abstract_algebra	acc,none	0.43
+mmlu_anatomy	acc,none	0.6074074074074074
+mmlu_astronomy	acc,none	0.6973684210526315
+mmlu_business_ethics	acc,none	0.62
+mmlu_clinical_knowledge	acc,none	0.6415094339622641
+mmlu_college_biology	acc,none	0.8263888888888888
+mmlu_college_chemistry	acc,none	0.53
+mmlu_college_computer_science	acc,none	0.54
+mmlu_college_mathematics	acc,none	0.5
+mmlu_college_medicine	acc,none	0.5953757225433526
+mmlu_college_physics	acc,none	0.5
+mmlu_computer_security	acc,none	0.68
+mmlu_conceptual_physics	acc,none	0.5872340425531914
+mmlu_econometrics	acc,none	0.35964912280701755
+mmlu_electrical_engineering	acc,none	0.6413793103448275
+mmlu_elementary_mathematics	acc,none	0.5317460317460317
+mmlu_formal_logic	acc,none	0.5
+mmlu_global_facts	acc,none	0.33
+mmlu_high_school_biology	acc,none	0.7548387096774194
+mmlu_high_school_chemistry	acc,none	0.6009852216748769
+mmlu_high_school_computer_science	acc,none	0.69
+mmlu_high_school_european_history	acc,none	0.7696969696969697
+mmlu_high_school_geography	acc,none	0.7272727272727273
+mmlu_high_school_government_and_politics	acc,none	0.7461139896373057
+mmlu_high_school_macroeconomics	acc,none	0.6435897435897436
+mmlu_high_school_mathematics	acc,none	0.45555555555555555
+mmlu_high_school_microeconomics	acc,none	0.7773109243697479
+mmlu_high_school_physics	acc,none	0.5165562913907285
+mmlu_high_school_psychology	acc,none	0.8
+mmlu_high_school_statistics	acc,none	0.5694444444444444
+mmlu_high_school_us_history	acc,none	0.7156862745098039
+mmlu_high_school_world_history	acc,none	0.7974683544303798
+mmlu_human_aging	acc,none	0.600896860986547
+mmlu_human_sexuality	acc,none	0.6946564885496184
+mmlu_humanities	acc,none	0.5300743889479277
+mmlu_international_law	acc,none	0.7851239669421488
+mmlu_jurisprudence	acc,none	0.7222222222222222
+mmlu_logical_fallacies	acc,none	0.6932515337423313
+mmlu_machine_learning	acc,none	0.42857142857142855
+mmlu_management	acc,none	0.6893203883495146
+mmlu_marketing	acc,none	0.8034188034188035
+mmlu_medical_genetics	acc,none	0.69
+mmlu_miscellaneous	acc,none	0.6717752234993615
+mmlu_moral_disputes	acc,none	0.5953757225433526
+mmlu_moral_scenarios	acc,none	0.2446927374301676
+mmlu_nutrition	acc,none	0.6764705882352942
+mmlu_other	acc,none	0.6269713550048278
+mmlu_philosophy	acc,none	0.6559485530546624
+mmlu_prehistory	acc,none	0.6265432098765432
+mmlu_professional_accounting	acc,none	0.4397163120567376
+mmlu_professional_law	acc,none	0.4745762711864407
+mmlu_professional_medicine	acc,none	0.6838235294117647
+mmlu_professional_psychology	acc,none	0.5915032679738562
+mmlu_public_relations	acc,none	0.6
+mmlu_security_studies	acc,none	0.7020408163265306
+mmlu_social_sciences	acc,none	0.6906077348066298
+mmlu_sociology	acc,none	0.7711442786069652
+mmlu_stem	acc,none	0.5883285759594037
+mmlu_us_foreign_policy	acc,none	0.78
+mmlu_virology	acc,none	0.45180722891566266
+mmlu_world_religions	acc,none	0.7192982456140351
+piqa	acc_norm,none	0.6322089227421109
+truthfulqa_mc2	acc,none	0.45340473177307805
+winogrande	acc,none	0.500394632991318
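The percentages in the README benchmark table appear to be these raw values multiplied by 100 and rounded to two decimals. A small sketch of that conversion is below, assuming the tab-separated layout shown above; the `to_percent` helper and the inline sample are illustrative, and the sample holds only a few rows copied from `summary_v3.tsv`.

```python
import csv
import io

# A few rows copied from summary_v3.tsv (tab-separated, with a header row).
SAMPLE_TSV = (
    "task\tmetric\tvalue\n"
    "arc_challenge\tacc_norm,none\t0.3174061433447099\n"
    "gsm8k\texact_match,flexible-extract\t0.6239575435936315\n"
    "mmlu\tacc,none\t0.5997721122347244\n"
    "winogrande\tacc,none\t0.500394632991318\n"
)


def to_percent(tsv_text: str) -> dict[str, str]:
    """Map each task name to its score formatted as a percentage string."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["task"]: f"{float(row['value']) * 100:.2f}%" for row in reader}


for task, score in to_percent(SAMPLE_TSV).items():
    print(f"{task}: {score}")
```

To run it against the full file, the same function can be given the contents of `summary_v3.tsv` read from disk.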
--- a/chat_template.jinja
+++ b/chat_template.jinja
@@ -0,0 +1,137 @@
+
+        {%- if tools %}
+            {{- '<|im_start|>system
+' }}
+            {%- if messages[0].role == 'system' %}
+                {{- messages[0].content + '
+
+' }}
+            {%- else %} 
+                {{- '你是一位工具函数调用专家，你会得到一个问题和一组可能的工具函数。根据问题，你需要进行一个或多个函数/工具调用以实现目的，请尽量尝试探索通过工具解决问题。
+如果没有一个函数可以使用，请直接使用自然语言回复用户。
+如果给定的问题缺少函数所需的参数，请使用自然语言进行提问，向用户询问必要信息。
+如果调用结果已经足够回答用户问题，请对历史结果进行总结，使用自然语言回复用户。' }} 
+            {%- endif %}
+            {{- "# Tools
+
+You may call one or more functions to assist with the user query.
+
+You are provided with function signatures within <tools></tools> XML tags:
+<tools>" }}
+            {%- for tool in tools %}
+                {{- "
+" }}
+                {{- tool | tojson }}
+            {%- endfor %}
+            {{- "
+</tools>
+
+For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
+<tool_call>
+{\"name\": <function-name>, \"arguments\": <args-json-object>}
+</tool_call><|im_end|>
+" }}
+        {%- else %}
+            {%- if messages[0].role == 'system' %}
+                {{- '<|im_start|>system
+' + messages[0].content + '<|im_end|>
+' }}
+            {%- else %} 
+                {{- '<|im_start|>system
+你是南北阁，一款由BOSS直聘自主研发并训练的专业大语言模型。<|im_end|>
+' }} 
+            {%- endif %}
+        {%- endif %}
+        {%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
+        {%- for message in messages[::-1] %}
+            {%- set index = (messages|length - 1) - loop.index0 %}
+            {%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
+                {%- set ns.multi_step_tool = false %}
+                {%- set ns.last_query_index = index %}
+            {%- endif %}
+        {%- endfor %}
+        {%- for message in messages %}
+            {%- if message.content is string %}
+                {%- set content = message.content %}
+            {%- else %}
+                {%- set content = '' %}
+            {%- endif %}
+            {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
+                {{- '<|im_start|>' + message.role + '
+' + content + '<|im_end|>' + '
+' }}
+            {%- elif message.role == "assistant" %}
+                {%- set reasoning_content = '' %}
+                {%- if message.reasoning_content is string %}
+                    {%- set reasoning_content = message.reasoning_content %}
+                {%- else %}
+                    {%- if '</think>' in content %}
+                        {%- set reasoning_content = content.split('</think>')[0].rstrip('
+').split('<think>')[-1].lstrip('
+') %}
+                        {%- set content = content.split('</think>')[-1].lstrip('
+') %}
+                    {%- endif %}
+                {%- endif %}
+                {%- if loop.index0 > ns.last_query_index or keep_all_think or (extra_body is defined and extra_body.keep_all_think) %}
+                    {%- if loop.last or (not loop.last and reasoning_content) %}
+                        {{- '<|im_start|>' + message.role + '
+<think>
+' + reasoning_content.strip('
+') + '
+</think>
+
+' + content.lstrip('
+') }}
+                    {%- else %}
+                        {{- '<|im_start|>' + message.role + '
+' + content }}
+                    {%- endif %}
+                {%- else %}
+                    {{- '<|im_start|>' + message.role + '
+' + content }}
+                {%- endif %}
+                {%- if message.tool_calls %}
+                    {%- for tool_call in message.tool_calls %}
+                        {%- if (loop.first and content) or (not loop.first) %}
+                            {{- '
+' }}
+                        {%- endif %}
+                        {%- if tool_call.function %}
+                            {%- set tool_call = tool_call.function %}
+                        {%- endif %}
+                        {{- '<tool_call>
+{"name": "' }}
+                        {{- tool_call.name }}
+                        {{- '", "arguments": ' }}
+                        {%- if tool_call.arguments is string %}
+                            {{- tool_call.arguments }}
+                        {%- else %}
+                            {{- tool_call.arguments | tojson }}
+                        {%- endif %}
+                        {{- '}
+</tool_call>' }}
+                    {%- endfor %}
+                {%- endif %}
+                {{- '<|im_end|>
+' }}
+            {%- elif message.role == "tool" %}
+                {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
+                    {{- '<|im_start|>user' }}
+                {%- endif %}
+                {{- '
+<tool_response>
+' }}
+                {{- content }}
+                {{- '
+</tool_response>' }}
+                {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+                    {{- '<|im_end|>
+' }}
+                {%- endif %}
+            {%- endif %}
+        {%- endfor %}
+        {%- if add_generation_prompt %}
+            {{- '<|im_start|>assistant
+' }}
+        {%- endif %}
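Note on the assistant branch of the template above: when a message carries no string `reasoning_content`, the template recovers it by splitting `content` on `</think>`, and it serialises each tool call as a JSON object wrapped in `<tool_call>` tags. The Python below is a minimal sketch of those two string operations only. The function names `split_reasoning` and `render_tool_call` are hypothetical and not part of this repository, and `json.dumps` is assumed to stand in for the template's `tojson` filter, whose escaping may differ.

```python
import json


def split_reasoning(content: str) -> tuple[str, str]:
    """Mirror the template's fallback: return (reasoning_content, content)."""
    if "</think>" not in content:
        # No closing tag: reasoning stays empty and content is left untouched.
        return "", content
    reasoning = (
        content.split("</think>")[0]
        .rstrip("\n")
        .split("<think>")[-1]
        .lstrip("\n")
    )
    answer = content.split("</think>")[-1].lstrip("\n")
    return reasoning, answer


def render_tool_call(tool_call: dict) -> str:
    """Mirror the template's <tool_call> serialisation for one call."""
    # The template unwraps an OpenAI-style {"function": {...}} envelope if present.
    call = tool_call.get("function") or tool_call
    arguments = call["arguments"]
    if not isinstance(arguments, str):
        # Assumption: json.dumps approximates the Jinja `tojson` filter here.
        arguments = json.dumps(arguments)
    return '<tool_call>\n{"name": "' + call["name"] + '", "arguments": ' + arguments + "}\n</tool_call>"


if __name__ == "__main__":
    print(split_reasoning("<think>\nadd 2 and 2\n</think>\n\n4"))
    print(render_tool_call({"function": {"name": "get_weather", "arguments": {"city": "Paris"}}}))
```

String arguments are passed through unchanged, as in the template, so a caller who already holds serialised JSON is not double-encoded.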
--- a/config.json
+++ b/config.json
@@ -0,0 +1,32 @@
+{
+  "architectures": [
+    "LlamaForCausalLM"
+  ],
+  "attention_bias": false,
+  "attention_dropout": 0.0,
+  "bos_token_id": 166100,
+  "dtype": "float16",
+  "embd_pdrop": 0.0,
+  "eos_token_id": 166101,
+  "head_dim": 128,
+  "hidden_act": "silu",
+  "hidden_size": 2560,
+  "initializer_range": 0.02,
+  "intermediate_size": 10496,
+  "max_position_embeddings": 262144,
+  "mlp_bias": false,
+  "model_type": "llama",
+  "num_attention_heads": 20,
+  "num_hidden_layers": 32,
+  "num_key_value_heads": 4,
+  "pad_token_id": 0,
+  "pretraining_tp": 1,
+  "resid_pdrop": 0.0,
+  "rms_norm_eps": 1e-05,
+  "rope_scaling": null,
+  "rope_theta": 70000000,
+  "tie_word_embeddings": false,
+  "transformers_version": "4.57.6",
+  "use_cache": true,
+  "vocab_size": 166144
+}
--- a/generation_config.json
+++ b/generation_config.json
@@ -0,0 +1,7 @@
+{
+  "_from_model_config": true,
+  "bos_token_id": 166100,
+  "eos_token_id": 166101,
+  "pad_token_id": 0,
+  "transformers_version": "4.57.6"
+}
--- a/gguf/Nanbeige4.1-3B-Q4_K_M.gguf
+++ b/gguf/Nanbeige4.1-3B-Q4_K_M.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4a5a2f9028a7ff9959b5cc08fc01228ff67b9c7d0ddaa41c086acd3c43e4210b
+size 2443112064
--- a/gguf/Nanbeige4.1-3B-Q5_K_M.gguf
+++ b/gguf/Nanbeige4.1-3B-Q5_K_M.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:171f542b60aac86574aec155af15d036e4ca4d8c44f74d42eab770d17af19339
+size 2825268864
--- a/gguf/Nanbeige4.1-3B-f16.gguf
+++ b/gguf/Nanbeige4.1-3B-f16.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:113fea20515ed173bda89873e8dc81a24839872c5ad4d06cbbb477afabe24006
+size 7871576704
--- a/model-00001-of-00002.safetensors
+++ b/model-00001-of-00002.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7ac64308cdbf331f061103bf29939acb3d8718f238f75903706de5ddae9fd16b
+size 4982284224
--- a/model-00002-of-00002.safetensors
+++ b/model-00002-of-00002.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:25ad3c5f1e8f149f0cf17555f2850072f0bbef27e4554f7cf4d26fc7931f3673
+size 2885023544
--- a/model.safetensors.index.json
+++ b/model.safetensors.index.json
@@ -0,0 +1,299 @@
+{
+  "metadata": {
+    "total_parameters": 3933637120,
+    "total_size": 7867274240
+  },
+  "weight_map": {
+    "lm_head.weight": "model-00002-of-00002.safetensors",
+    "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.10.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.11.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.12.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.13.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.14.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.15.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.16.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.17.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.18.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.19.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.19.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.19.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.19.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.19.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.19.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.19.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.19.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.19.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.20.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.20.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.20.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.20.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.20.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.20.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.20.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.20.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.20.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.21.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.21.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.21.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.21.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.21.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.21.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.21.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.21.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.21.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.22.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.22.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.22.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.22.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.22.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.22.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.22.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.22.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.22.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.23.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.24.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.25.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.26.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.26.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.26.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.26.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.26.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.26.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.26.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.26.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.26.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.27.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.27.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.27.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.27.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.27.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.27.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.27.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.27.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.27.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.28.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.28.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.28.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.28.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.28.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.28.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.28.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.28.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.28.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.29.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.29.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.29.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.29.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.29.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.29.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.29.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.29.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.29.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.3.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.3.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.3.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.30.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.30.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.30.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.30.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.30.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.30.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.30.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.30.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.30.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.31.input_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.31.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.31.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.31.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.31.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+    "model.layers.31.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.31.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.31.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.31.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+    "model.layers.4.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.4.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.4.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.8.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.input_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+    "model.layers.9.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+    "model.norm.weight": "model-00002-of-00002.safetensors"
+  }
+}
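Note on `total_parameters` and `total_size` in the index above: both figures can be cross-checked against `config.json` with a little arithmetic. The sketch below assumes the standard bias-free Llama layout implied by the `weight_map` names (`attention_bias` and `mlp_bias` are both false, and `tie_word_embeddings` is false) and 2 bytes per parameter for `float16`. The function name `count_parameters` is hypothetical. Reading the small difference between the shard sizes and `total_size` as safetensors header overhead is an assumption, not something the commit states.

```python
# Values copied from config.json in this commit.
hidden_size = 2560
intermediate_size = 10496
num_hidden_layers = 32
num_attention_heads = 20
num_key_value_heads = 4
head_dim = 128
vocab_size = 166144

# Sizes copied from the two safetensors LFS pointers and the index metadata.
shard_sizes = [4982284224, 2885023544]
index_total_parameters = 3933637120
index_total_size = 7867274240


def count_parameters() -> int:
    """Count weights for a bias-free Llama-style decoder with untied embeddings."""
    embed = vocab_size * hidden_size
    lm_head = vocab_size * hidden_size  # separate matrix: tie_word_embeddings is false
    q_proj = hidden_size * num_attention_heads * head_dim
    k_proj = hidden_size * num_key_value_heads * head_dim
    v_proj = hidden_size * num_key_value_heads * head_dim
    o_proj = num_attention_heads * head_dim * hidden_size
    mlp = 3 * hidden_size * intermediate_size  # gate_proj, up_proj, down_proj
    norms = 2 * hidden_size  # input_layernorm and post_attention_layernorm
    per_layer = q_proj + k_proj + v_proj + o_proj + mlp + norms
    final_norm = hidden_size  # model.norm.weight
    return embed + lm_head + num_hidden_layers * per_layer + final_norm


total = count_parameters()
size_bytes = total * 2  # float16 stores 2 bytes per parameter
overhead = sum(shard_sizes) - index_total_size

print(total == index_total_parameters)
print(size_bytes == index_total_size)
print(overhead)  # bytes in the shards beyond the raw tensor data
```

Under these assumptions the computed count matches `total_parameters` exactly, and the byte total matches `total_size`, which suggests every tensor is stored in `float16`.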
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1,33 @@
+{
+  "additional_special_tokens": [
+    "<|endoftext|>"
+  ],
+  "bos_token": {
+    "content": "<|im_start|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "<|im_end|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}
--- a/tokenizer.json
+++ b/tokenizer.json
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1d8f0326910136aca20831249220b38ce5299527647bc8c6b65404485c479740
+size 18451122
--- a/tokenizer.model
+++ b/tokenizer.model
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fb41d04798b714520a9b075727b0226538b7330254299062742c50ec8374bc36
+size 2782298
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1,102 @@
+{
+  "add_bos_token": true,
+  "add_eos_token": false,
+  "add_prefix_space": true,
+  "added_tokens_decoder": {
+    "0": {
+      "content": "<unk>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "<s>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "</s>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "166100": {
+      "content": "<|im_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "166101": {
+      "content": "<|im_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "166102": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "166103": {
+      "content": "<think>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "166104": {
+      "content": "</think>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "166105": {
+      "content": "<tool_call>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "166106": {
+      "content": "</tool_call>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    }
+  },
+  "additional_special_tokens": [
+    "<|endoftext|>"
+  ],
+  "bos_token": "<|im_start|>",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|im_end|>",
+  "extra_special_tokens": {},
+  "legacy": true,
+  "model_max_length": 1000000000000000019884624838656,
+  "pad_token": "<unk>",
+  "sp_model_kwargs": {},
+  "spaces_between_special_tokens": false,
+  "tokenizer_class": "LlamaTokenizer",
+  "unk_token": "<unk>",
+  "use_default_system_prompt": false
+}