Initialize project; model provided by the ModelHub XC community

Model: lamm-mit/Graph-Preflexor-8b_12292025 Source: Original Platform
2026-06-15 15:12:53 +08:00
commit a55156e9c3
19 changed files with 154660 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,37 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
+graph.png filter=lfs diff=lfs merge=lfs -text
--- a/Colab_graph_reasoning.ipynb
+++ b/Colab_graph_reasoning.ipynb
--- a/README.md
+++ b/README.md
@@ -0,0 +1,439 @@
+---
+library_name: transformers
+license: apache-2.0
+base_model:
+- Qwen/Qwen3-8B
+datasets:
+- lamm-mit/graph_reasoning_1K
+---
+
+# Model Summary and Training Method
+
+This model (`lamm-mit/Graph-Preflexor-8b_12292025`) was trained in two sequential stages to produce graph-native scientific reasoning with structured intermediate representations.
+
+### Training scripts
+
+Clone the Graph-GRPO repository:
+
+```
+git clone https://github.com/lamm-mit/graph-preflexor-grpo.git
+cd graph-preflexor-grpo
+```
+
+Training runs in two steps: ORPO first, then Graph-GRPO. ORPO phase:
+
+```bash
+python ./src/run_orpo_graph.py \
+  --base_model Qwen/Qwen3-8B \
+  --dataset lamm-mit/graph_reasoning_1K \
+  --output_dir ./orpo-graph_v40 \
+  --epochs 1 --lr 5e-5 --batch_size 2 \
+  --save_steps 100 --max_length 2048 \
+  --eval_steps 100 \
+  --push_to_hub \
+  --hub_model_id lamm-mit/orpo-graph \
+  --hf_token $HF_TOKEN
+```
+
+Test warm-start model:
+
+```bash
+python ./src/test_model.py --model ./orpo-graph
+```
+
+Graph-GRPO phase: 
+
+```bash
+python ./src/run_grpo_graph.py \
+  --base_model_dir lamm-mit/orpo-graph \
+  --dataset lamm-mit/graph_reasoning_1K \
+  --output_dir ./lamm-mit/Graph-Preflexor-8b_12292025 \
+  --judge_model grok-4-1-fast-non-reasoning \
+  --judge_api_key $GROK_API_KEY \
+  --judge_base_url https://api.x.ai/v1 \
+  --weight_correctness 0.30 \
+  --weight_format 0.15 \
+  --weight_graph_utility 0.25 \
+  --weight_graph_networkx 0.10 \
+  --weight_graph_diversity 0.10 \
+  --weight_graph_structure 0.10 \
+  --num_generations 8 --per_device_train_batch_size 1 --gradient_accumulation_steps 8 \
+  --learning_rate 5e-6 --epochs 3 \
+  --max_completion_length 3500 \
+  --push_to_hub \
+  --hub_model_id lamm-mit/Graph-Preflexor-8b_12292025 \
+  --hf_token $HF_TOKEN --use_vllm --vllm_gpu_memory_utilization 0.4
+```
+### How Graph-Preflexor Reasons
+
+The model produces a structured reasoning trace with explicit “sentinel” blocks.
+Each block has a distinct role: exploration → structure → validation → synthesis.
+
+```raw
+User Prompt
+   |
+   v
+<think>  (internal reasoning container; not meant as final answer)
+   |
+   +--> <brainstorm>
+   |       Purpose: generate hypotheses, mechanisms, candidate factors,
+   |                and possible causal stories (broad search; divergent).
+   |
+   +--> <graph>
+   |       Purpose: sketch the conceptual graph verbally (entities + relations).
+   |                Think of it as the draft blueprint.
+   |
+   +--> <graph_json>
+   |       Purpose: emit a machine-readable knowledge graph:
+   |                nodes = concepts; edges = relations (source, relation, target).
+   |                This is the canonical structured representation.
+   |
+   +--> <patterns>
+   |       Purpose: compress the graph into reusable motifs:
+   |                invariants, abstractions, multi-scale regularities,
+   |                analogies, and “design rules”.
+   |
+   +--> <synthesis>
+   |       Purpose: assemble the final narrative by reading from the graph:
+   |                coherent, ordered explanation and (optionally) next steps.
+   |
+</think>
+   |
+   v
+Final Answer (post-</think>, user-facing)
+   - concise, coherent prose derived from the graph + synthesis
+   - should remain consistent with the <graph_json> content
+```
+
+Sentinel blocks used by the model: 
+
+```raw
+<think> ... </think>
+  - Container for all internal work.
+  - May include intermediate calculations, choices, and planning.
+
+<brainstorm> ... </brainstorm>
+  - Rapid hypothesis generation.
+  - Lists candidate mechanisms, variables, constraints, tradeoffs.
+  - “Wide exploration” mode.
+
+<graph> ... </graph>
+  - Human-readable graph sketch.
+  - Names the concepts and how they connect (often as bullet edges).
+
+<graph_json> ... </graph_json>
+  - Machine-readable knowledge graph:
+      {
+        "nodes": [{"id": "ConceptA"}, ...],
+        "edges": [{"source": "A", "relation": "causes", "target": "B"}, ...]
+      }
+  - Intended to be parseable and reusable for downstream tooling.
+
+<patterns> ... </patterns>
+  - Extracts higher-level structure:
+    - causal motifs (feedforward/feedback loops)
+    - modularity / hierarchy (micro→meso→macro)
+    - bottlenecks, bridges, invariants
+    - “principles” that generalize beyond the single example
+
+<synthesis> ... </synthesis>
+  - Turns structure into explanation:
+    - ordered narrative aligned with the graph
+    - explicit causal chain(s)
+    - checks for coherence / missing links
+    - may propose experiments, predictions, or design implications
+```
+
+Why this format is useful:
+
+- Interpretability: reasoning is segmented by function (explore → formalize → abstract → explain).
+- Parsability: `<graph_json>` enables programmatic extraction and visualization of the graph.
+
+Detailed sampling code and a CLI are described below. For a quick start and an interactive demo (including graph visualization), use this Colab notebook:
+
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](
+  https://colab.research.google.com/#fileId=https://huggingface.co/lamm-mit/Graph-Preflexor-8b_12292025/blob/main/Colab_graph_reasoning.ipynb
+)
+
+![image](https://cdn-uploads.huggingface.co/production/uploads/623ce1c6b66fedf374859fe7/QzUqb2255ruwEdoIHo6PQ.png)
+
+### 1. Base Model: ORPO Graph Reasoning
+
+The starting point was an **ORPO-trained graph-native language model** based on `Qwen/Qwen3-8B`.
+
+In the first stage, ORPO (Odds Ratio Preference Optimization) was used to align the model toward:
+
+- Explicit structured reasoning using tagged sections (e.g. `<think>`, `<graph>`, `<graph_json>`, `<patterns>`, `<synthesis>`)
+- Emission of valid, machine-interpretable **graph-structured thinking** alongside natural language answers
+- Faithful separation between internal reasoning, graph construction, and final synthesis
+
+This stage established the model’s *representation language* and graph-centric reasoning style.
+
+### 2. GRPO Fine-Tuning with External Judging
+
+On top of the ORPO model, we applied **GRPO (Group Relative Policy Optimization)** using the dataset:
+
+- **Dataset:** `lamm-mit/graph_reasoning_1K`
+- **Prompts:** graph-based scientific and reasoning questions
+- **Gold answers:** reference solutions used for evaluation
+
+For each prompt, the model generated **multiple candidate completions** (`num_generations = 8`).  These were scored using an **external LLM judge** (`grok-4-1-fast-non-reasoning`) via a multi-component reward function.
+
+#### Reward Structure
+
+Each completion received a weighted reward composed of:
+
+- **Answer correctness (0.30)**  
+  How well the final answer matches the gold reference.
+
+- **Format compliance (0.15)**  
+  Presence and validity of required reasoning sections and parseable graph JSON.
+
+- **Graph utility (0.25)**  
+  Whether the emitted knowledge graph alone contains enough information to reconstruct the answer.
+
+- **Graph validity (NetworkX) (0.10)**  
+  Structural soundness of the graph (nodes, edges, connectivity).
+
+- **Graph diversity (0.10)**  
+  Semantic diversity of concepts expressed in nodes and relations.
+
+- **Graph structure quality (0.10)**  
+  Topological richness (depth, internal nodes, non-trivial connectivity).
+
+Rewards were normalized per batch and optimized using the GRPO loss.
+
+### 3. Resulting Capabilities
+
+The resulting model is optimized to:
+
+- Reason explicitly through structured, inspectable intermediate representations
+- Emit valid, analyzable knowledge graphs alongside answers
+- Encode scientific reasoning such that *graphs themselves* carry explanatory power
+- Balance correctness, interpretability, and structural richness
+
+This makes the model particularly suitable for **AI-for-science**, **graph-native reasoning**, and **knowledge discovery workflows** where transparency and structure matter as much as accuracy.
+
+# Sample Generation
+
+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
+
+token = "hf_..."  # Hugging Face access token (placeholder)
+
+# ------------------------------------------------------------------------------
+# Configuration
+# ------------------------------------------------------------------------------
+MODEL_NAME = "lamm-mit/Graph-Preflexor-8b_12292025"
+PROMPT = "Give me a short introduction to materiomics."
+MAX_NEW_TOKENS = 32_768
+THINK_END_TOKEN_ID = 151668  # </think>
+
+# ------------------------------------------------------------------------------
+# Model & Tokenizer Loading
+# ------------------------------------------------------------------------------
+tokenizer = AutoTokenizer.from_pretrained(
+    MODEL_NAME,
+    token=token,
+)
+model = AutoModelForCausalLM.from_pretrained(
+    MODEL_NAME,
+    torch_dtype="auto",
+    device_map="auto",
+    token=token,
+)
+model.eval()
+
+# ------------------------------------------------------------------------------
+# Prompt Construction
+# ------------------------------------------------------------------------------
+messages = [
+    {"role": "user", "content": PROMPT}
+]
+
+prompt_text = tokenizer.apply_chat_template(
+    messages,
+    tokenize=False,
+    add_generation_prompt=True,
+    enable_thinking=True,  # toggles chain-of-thought mode
+)
+
+model_inputs = tokenizer(
+    prompt_text,
+    return_tensors="pt",
+).to(model.device)
+
+# ------------------------------------------------------------------------------
+# Generation
+# ------------------------------------------------------------------------------
+gen_config = GenerationConfig(
+    max_new_tokens=MAX_NEW_TOKENS,
+    do_sample=True,  # sample instead of greedy decoding
+    temperature=0.2,
+)
+
+with torch.no_grad():
+    generated = model.generate(
+        **model_inputs,
+        generation_config=gen_config,
+    )
+
+# Slice off the prompt tokens
+output_ids = generated[0, model_inputs.input_ids.shape[1]:].tolist()
+
+# ------------------------------------------------------------------------------
+# Thinking / Content Parsing
+# ------------------------------------------------------------------------------
+def split_thinking(output_ids, tokenizer, think_end_id):
+    """
+    Split generated tokens into (thinking, final_content) based on </think>.
+    Falls back gracefully if no thinking block is present.
+    """
+    try:
+        split_idx = len(output_ids) - output_ids[::-1].index(think_end_id)
+    except ValueError:
+        split_idx = 0
+
+    thinking = tokenizer.decode(
+        output_ids[:split_idx],
+        skip_special_tokens=True,
+    ).strip()
+
+    content = tokenizer.decode(
+        output_ids[split_idx:],
+        skip_special_tokens=True,
+    ).strip()
+
+    return thinking, content
+
+
+thinking, content = split_thinking(
+    output_ids,
+    tokenizer,
+    THINK_END_TOKEN_ID,
+)
+
+# ------------------------------------------------------------------------------
+# Output
+# ------------------------------------------------------------------------------
+print("\n" + "=" * 80)
+print("THINKING")
+print("=" * 80)
+print(thinking or "[**no thinking content detected**]")
+
+print("\n" + "=" * 80)
+print("FINAL OUTPUT")
+print("=" * 80)
+print(content)
+```
+
+Sample output:
+
+```raw
+================================================================================
+THINKING
+================================================================================
+<think>
+<brainstorm>
+Materiomics is an emerging interdisciplinary field that bridges materials science with omics approaches like genomics, proteomics, and metabolomics. Key concepts: it uses high-throughput, data-driven methods to analyze materials at atomic/molecular scales, enabling predictive design and discovery. Relevant phenomena include self-assembly, phase transitions, defect engineering, and property-structure relationships. Hypotheses: materiomics can accelerate material innovation by mimicking biological systems' efficiency (e.g., protein folding). Interactions: data from omics informs computational models (e.g., machine learning for property prediction), while materials synthesis validates these models. Broader context: part of materials informatics, relevant to nanotechnology, sustainable materials, and AI-driven R&D. Key variables: material composition, processing conditions, scale (nano/micro), omics data quality, computational power.
+</brainstorm>
+
+<graph>
+Core entities: Materiomics, Materials Science, Omics (Genomics/Proteomics/Metabolomics), High-Throughput Analysis, Predictive Modeling, Material Properties, Self-Assembly, Phase Transitions, Defect Engineering, Computational Models, Data-Driven Design, Biological Inspiration.
+Relationships: Materiomics integrates Materials Science with Omics → enables High-Throughput Analysis → feeds Predictive Modeling → predicts Material Properties. Self-Assembly, Phase Transitions, Defect Engineering → influence Material Properties. Computational Models → refine Predictive Modeling. Biological Inspiration → guides Data-Driven Design in Materiomics. High-Throughput Analysis → validates Computational Models.
+</graph>
+
+<graph_json>
+{
+  "nodes": [
+    {"id": "Materiomics"},
+    {"id": "MaterialsScience"},
+    {"id": "Omics"},
+    {"id": "HighThroughput"},
+    {"id": "PredictiveModeling"},
+    {"id": "MaterialProperties"},
+    {"id": "SelfAssembly"},
+    {"id": "PhaseTransitions"},
+    {"id": "DefectEngineering"},
+    {"id": "ComputationalModels"},
+    {"id": "DataDrivenDesign"},
+    {"id": "BiologicalInspiration"}
+  ],
+  "edges": [
+    {"source": "Materiomics", "relation": "integrates", "target": "MaterialsScience"},
+    {"source": "Materiomics", "relation": "integrates", "target": "Omics"},
+    {"source": "Omics", "relation": "enables", "target": "HighThroughput"},
+    {"source": "HighThroughput", "relation": "feeds", "target": "PredictiveModeling"},
+    {"source": "PredictiveModeling", "relation": "predicts", "target": "MaterialProperties"},
+    {"source": "SelfAssembly", "relation": "influences", "target": "MaterialProperties"},
+    {"source": "PhaseTransitions", "relation": "influences", "target": "MaterialProperties"},
+    {"source": "DefectEngineering", "relation": "influences", "target": "MaterialProperties"},
+    {"source": "ComputationalModels", "relation": "refines", "target": "PredictiveModeling"},
+    {"source": "HighThroughput", "relation": "validates", "target": "ComputationalModels"},
+    {"source": "BiologicalInspiration", "relation": "guides", "target": "DataDrivenDesign"},
+    {"source": "DataDrivenDesign", "relation": "applies", "target": "Materiomics"}
+  ]
+}
+</graph_json>
+
+<patterns>
+Materiomics: MaterialsScience + Omics → HighThroughput → PredictiveModeling → MaterialProperties (predictive pipeline).
+BiologicalInspiration → DataDrivenDesign → Materiomics (inspiration loop).
+</patterns>
+
+<synthesis>
+The graph reveals materiomics as a predictive pipeline integrating materials science with omics data, validated by high-throughput analysis and computational models to design materials with desired properties. Key insight: this data-driven approach, inspired by biological systems, accelerates discovery by quantifying structure-property relationships at atomic scales, enabling efficient self-assembly, phase control, and defect engineering. Coherent picture: materiomics transforms materials innovation from trial-and-error to informed, scalable design, leveraging omics for high-dimensional data and AI for modeling.
+</synthesis>
+</think>
+
+================================================================================
+FINAL OUTPUT
+================================================================================
+Materiomics is an interdisciplinary field that merges materials science with omics methodologies—such as genomics, proteomics, and metabolomics—to systematically analyze, design, and predict the properties of materials at atomic and molecular scales. At its core, materiomics leverages high-throughput experimental techniques and advanced computational models to generate vast datasets on material composition, structure, processing conditions, and resulting properties. These data are then used to build predictive models that can forecast material behavior under various stimuli, enabling the rational design of novel materials with tailored functionalities. Key phenomena underpinning materiomics include self-assembly processes where molecules spontaneously form ordered structures, phase transitions that dictate stability and transformation under thermal or mechanical stress, and defect engineering that manipulates imperfections to enhance properties like strength or conductivity. By drawing inspiration from biological systems—where complex materials like proteins and cell membranes emerge from simple building blocks—materiomics adopts data-driven, systems-level approaches to accelerate discovery. This field is pivotal in advancing nanotechnology, sustainable materials, and AI-driven R&D, offering a scalable framework to move beyond traditional trial-and-error methods, thereby revolutionizing industries from electronics to energy storage.
+```
+
+![image](https://cdn-uploads.huggingface.co/production/uploads/623ce1c6b66fedf374859fe7/iTrRAeKbE1GNQeA6Bugu5.png)
+
+# Sample Generation CLI
+
+```bash
+python graph_reasoning.py \
+    --model lamm-mit/Graph-Preflexor-8b_12292025 \
+    --prompt "Explain dragline silk toughness."
+```
+
+# References and Citation
+
+The training approach builds on the ideas presented in the papers referenced below.
+
+```bibtex
+@article{Buehler2025PRefLexOR,
+  author       = {Buehler, Markus J.},
+  title        = {PRefLexOR: preference-based recursive language modeling for exploratory optimization of reasoning and agentic thinking},
+  journal      = {npj Artificial Intelligence},
+  volume       = {1},
+  number       = {4},
+  year         = {2025},
+  publisher    = {Springer Nature},
+  doi          = {10.1038/s44387-025-00003-z},
+  url          = {https://doi.org/10.1038/s44387-025-00003-z},
+  issn         = {2731-990X},
+  received     = {2024-11-01},
+  accepted     = {2025-03-22},
+  published    = {2025-05-14},
+  keywords     = {Complex networks, Computational biology and bioinformatics}
+}
+
+@article{Buehler2025GraphPRefLexOR,
+  author       = {Buehler, Markus J.},
+  title        = {In Situ Graph Reasoning and Knowledge Expansion Using Graph-PRefLexOR},
+  journal      = {Advanced Intelligent Discovery},
+  year         = {2025},
+  publisher    = {Wiley},
+  doi          = {10.1002/aidi.202500006},
+  url          = {https://doi.org/10.1002/aidi.202500006},
+  note         = {Research Article, Open Access},
+  published    = {2025-06-09}
+}
+```
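The reward structure listed in the README above can be illustrated with a small worked example. The sketch below is not taken from the training repository: it assumes each component score lies in [0, 1], combines the scores with the README's weights, and then centers and scales the rewards across the completions of one prompt. The function names, the example scores, and the exact normalization are hypothetical.

```python
from statistics import mean, pstdev

# Weights copied from the README's reward structure; they sum to 1.0.
WEIGHTS = {
    "correctness": 0.30,
    "format": 0.15,
    "graph_utility": 0.25,
    "graph_networkx": 0.10,
    "graph_diversity": 0.10,
    "graph_structure": 0.10,
}


def combined_reward(scores):
    """Weighted sum of per-component scores (assumed to lie in [0, 1])."""
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())


def normalize(rewards, eps=1e-6):
    """Center and scale rewards across one set of completions (assumed scheme)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical scores for two completions of the same prompt.
good = dict.fromkeys(WEIGHTS, 1.0)
weak = dict(good, correctness=0.0, graph_utility=0.2)

rewards = [combined_reward(good), combined_reward(weak)]
advantages = normalize(rewards)
print(rewards, advantages)
```

Because the weights sum to 1.0, a completion that scores 1.0 on every component receives a combined reward of 1.0, which keeps the scale easy to interpret.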
--- a/added_tokens.json
+++ b/added_tokens.json
@@ -0,0 +1,28 @@
+{
+  "</think>": 151668,
+  "</tool_call>": 151658,
+  "</tool_response>": 151666,
+  "<think>": 151667,
+  "<tool_call>": 151657,
+  "<tool_response>": 151665,
+  "<|box_end|>": 151649,
+  "<|box_start|>": 151648,
+  "<|endoftext|>": 151643,
+  "<|file_sep|>": 151664,
+  "<|fim_middle|>": 151660,
+  "<|fim_pad|>": 151662,
+  "<|fim_prefix|>": 151659,
+  "<|fim_suffix|>": 151661,
+  "<|im_end|>": 151645,
+  "<|im_start|>": 151644,
+  "<|image_pad|>": 151655,
+  "<|object_ref_end|>": 151647,
+  "<|object_ref_start|>": 151646,
+  "<|quad_end|>": 151651,
+  "<|quad_start|>": 151650,
+  "<|repo_name|>": 151663,
+  "<|video_pad|>": 151656,
+  "<|vision_end|>": 151653,
+  "<|vision_pad|>": 151654,
+  "<|vision_start|>": 151652
+}
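In the token table above, only `<think>` and `</think>` (ids 151667 and 151668) relate to the reasoning trace; the other sentinel tags from the README, such as `<graph_json>`, do not appear in it. One plausible consequence is that those sections are best recovered from the decoded text rather than by token id. The following is a minimal sketch under that assumption; `extract_sections` and the sample string are hypothetical and use only the standard library.

```python
import json
import re

# Section tags named in the README; <think> is handled separately by token id.
SECTION_TAGS = ("brainstorm", "graph", "graph_json", "patterns", "synthesis")


def extract_sections(text):
    """Return {tag: inner_text} for the first occurrence of each sentinel block."""
    sections = {}
    for tag in SECTION_TAGS:
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
        if match:
            sections[tag] = match.group(1).strip()
    return sections


# Hypothetical, shortened model output.
sample = (
    "<think>\n<brainstorm>silk, beta-sheets</brainstorm>\n"
    "<graph>Silk -> Toughness</graph>\n"
    '<graph_json>{"nodes": [{"id": "Silk"}, {"id": "Toughness"}], '
    '"edges": [{"source": "Silk", "relation": "provides", "target": "Toughness"}]}'
    "</graph_json>\n</think>\nFinal answer."
)

sections = extract_sections(sample)
graph = json.loads(sections["graph_json"])
print(sorted(sections), graph["edges"][0])
```

The pattern for `<graph>` requires the closing `>` immediately after the tag name, so it does not accidentally match `<graph_json>`.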
--- a/chat_template.jinja
+++ b/chat_template.jinja
@@ -0,0 +1,89 @@
+{%- if tools %}
+    {{- '<|im_start|>system\n' }}
+    {%- if messages[0].role == 'system' %}
+        {{- messages[0].content + '\n\n' }}
+    {%- endif %}
+    {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
+    {%- for tool in tools %}
+        {{- "\n" }}
+        {{- tool | tojson }}
+    {%- endfor %}
+    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
+{%- else %}
+    {%- if messages[0].role == 'system' %}
+        {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
+    {%- endif %}
+{%- endif %}
+{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
+{%- for message in messages[::-1] %}
+    {%- set index = (messages|length - 1) - loop.index0 %}
+    {%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
+        {%- set ns.multi_step_tool = false %}
+        {%- set ns.last_query_index = index %}
+    {%- endif %}
+{%- endfor %}
+{%- for message in messages %}
+    {%- if message.content is string %}
+        {%- set content = message.content %}
+    {%- else %}
+        {%- set content = '' %}
+    {%- endif %}
+    {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
+        {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
+    {%- elif message.role == "assistant" %}
+        {%- set reasoning_content = '' %}
+        {%- if message.reasoning_content is string %}
+            {%- set reasoning_content = message.reasoning_content %}
+        {%- else %}
+            {%- if '</think>' in content %}
+                {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
+                {%- set content = content.split('</think>')[-1].lstrip('\n') %}
+            {%- endif %}
+        {%- endif %}
+        {%- if loop.index0 > ns.last_query_index %}
+            {%- if loop.last or (not loop.last and reasoning_content) %}
+                {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
+            {%- else %}
+                {{- '<|im_start|>' + message.role + '\n' + content }}
+            {%- endif %}
+        {%- else %}
+            {{- '<|im_start|>' + message.role + '\n' + content }}
+        {%- endif %}
+        {%- if message.tool_calls %}
+            {%- for tool_call in message.tool_calls %}
+                {%- if (loop.first and content) or (not loop.first) %}
+                    {{- '\n' }}
+                {%- endif %}
+                {%- if tool_call.function %}
+                    {%- set tool_call = tool_call.function %}
+                {%- endif %}
+                {{- '<tool_call>\n{"name": "' }}
+                {{- tool_call.name }}
+                {{- '", "arguments": ' }}
+                {%- if tool_call.arguments is string %}
+                    {{- tool_call.arguments }}
+                {%- else %}
+                    {{- tool_call.arguments | tojson }}
+                {%- endif %}
+                {{- '}\n</tool_call>' }}
+            {%- endfor %}
+        {%- endif %}
+        {{- '<|im_end|>\n' }}
+    {%- elif message.role == "tool" %}
+        {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
+            {{- '<|im_start|>user' }}
+        {%- endif %}
+        {{- '\n<tool_response>\n' }}
+        {{- content }}
+        {{- '\n</tool_response>' }}
+        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+            {{- '<|im_end|>\n' }}
+        {%- endif %}
+    {%- endif %}
+{%- endfor %}
+{%- if add_generation_prompt %}
+    {{- '<|im_start|>assistant\n' }}
+    {%- if enable_thinking is defined and enable_thinking is false %}
+        {{- '<think>\n\n</think>\n\n' }}
+    {%- endif %}
+{%- endif %}
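The template above separates an assistant message into reasoning and visible content when no explicit `reasoning_content` is supplied. The same string operations can be mirrored in Python to check how a given message would be split. This is an illustrative sketch, not part of the repository; `split_reasoning` is a hypothetical helper name.

```python
def split_reasoning(content):
    """Mirror the template's fallback split on '</think>'."""
    reasoning = ""
    if "</think>" in content:
        # Text before the first </think>, with the opening <think> tag removed.
        reasoning = (
            content.split("</think>")[0].rstrip("\n").split("<think>")[-1].lstrip("\n")
        )
        # Text after the last </think> is the user-facing answer.
        content = content.split("</think>")[-1].lstrip("\n")
    return reasoning, content


reasoning, answer = split_reasoning("<think>\nstep 1\n</think>\n\nThe answer.")
plain_reasoning, plain_answer = split_reasoning("No thinking here.")
print(repr(reasoning), repr(answer))
```

Messages without a `</think>` marker pass through unchanged with empty reasoning, which matches the template's behavior of leaving `reasoning_content` as an empty string.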
--- a/config.json
+++ b/config.json
@@ -0,0 +1,68 @@
+{
+  "architectures": [
+    "Qwen3ForCausalLM"
+  ],
+  "attention_bias": false,
+  "attention_dropout": 0.0,
+  "bos_token_id": 151643,
+  "dtype": "bfloat16",
+  "eos_token_id": 151645,
+  "head_dim": 128,
+  "hidden_act": "silu",
+  "hidden_size": 4096,
+  "initializer_range": 0.02,
+  "intermediate_size": 12288,
+  "layer_types": [
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention"
+  ],
+  "max_position_embeddings": 40960,
+  "max_window_layers": 36,
+  "model_type": "qwen3",
+  "num_attention_heads": 32,
+  "num_hidden_layers": 36,
+  "num_key_value_heads": 8,
+  "rms_norm_eps": 1e-06,
+  "rope_scaling": null,
+  "rope_theta": 1000000,
+  "sliding_window": null,
+  "tie_word_embeddings": false,
+  "transformers_version": "4.57.3",
+  "use_cache": true,
+  "use_sliding_window": false,
+  "vocab_size": 151936
+}
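The attention settings in this config allow a rough estimate of key/value cache memory. The sketch below uses the common approximation of 2 (keys and values) x layers x KV heads x head dimension x bytes per value, with 2 bytes for `bfloat16`. It ignores framework overhead and batch size, so treat the numbers as a back-of-the-envelope estimate rather than a measured figure.

```python
# Values copied from config.json above.
CONFIG = {
    "num_hidden_layers": 36,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "head_dim": 128,
    "max_position_embeddings": 40960,
}
BYTES_PER_VALUE = 2  # bfloat16


def kv_cache_bytes_per_token(cfg, bytes_per_value=BYTES_PER_VALUE):
    """Approximate KV-cache bytes stored per token for one sequence."""
    return (
        2  # one tensor for keys, one for values
        * cfg["num_hidden_layers"]
        * cfg["num_key_value_heads"]
        * cfg["head_dim"]
        * bytes_per_value
    )


per_token = kv_cache_bytes_per_token(CONFIG)
full_context_gib = per_token * CONFIG["max_position_embeddings"] / 2**30
queries_per_kv_head = CONFIG["num_attention_heads"] // CONFIG["num_key_value_heads"]
print(per_token, full_context_gib, queries_per_kv_head)
```

With 8 key/value heads shared by 32 query heads, the cache is a quarter of what it would be if every query head kept its own keys and values.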
--- a/generation_config.json
+++ b/generation_config.json
@@ -0,0 +1,13 @@
+{
+  "bos_token_id": 151643,
+  "do_sample": true,
+  "eos_token_id": [
+    151645,
+    151643
+  ],
+  "pad_token_id": 151643,
+  "temperature": 0.6,
+  "top_k": 20,
+  "top_p": 0.95,
+  "transformers_version": "4.57.3"
+}
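The defaults above combine temperature scaling with top-k and top-p (nucleus) filtering. The following pure-Python sketch shows conceptually how these three settings narrow a next-token distribution. It is a simplified illustration, not the Transformers implementation; `filter_distribution` and the example logits are hypothetical.

```python
import math


def filter_distribution(logits, temperature=0.6, top_k=20, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k, and top-p."""
    # Temperature below 1 sharpens the distribution.
    scaled = [x / temperature for x in logits]

    # Top-k: keep the k highest-scoring token indices.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    kept = order[:top_k]

    # Softmax over the kept tokens (shifted by the max for numerical stability).
    peak = max(scaled[i] for i in kept)
    exps = {i: math.exp(scaled[i] - peak) for i in kept}
    total = sum(exps.values())
    probs = {i: e / total for i, e in exps.items()}

    # Top-p: smallest prefix whose cumulative probability reaches top_p.
    nucleus, cumulative = [], 0.0
    for i in kept:
        nucleus.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens.
    norm = sum(probs[i] for i in nucleus)
    return {i: probs[i] / norm for i in nucleus}


# Hypothetical logits for a four-token vocabulary.
dist = filter_distribution([2.0, 1.0, 0.0, -5.0])
print(dist)
```

Note that the README's sample script passes `temperature=0.2` explicitly, which is lower than the 0.6 default stored in this file.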
--- a/graph.png
+++ b/graph.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:22536f238ae89a9e213ecbcc701c17bc8f7eb07806b47ecf163a14970928e984
+size 612575
--- a/graph_reasoning.py
+++ b/graph_reasoning.py
@@ -0,0 +1,588 @@
+#!/usr/bin/env python3
+"""
+graph_reasoning.py
+
+CLI runner for Graph-PRefLexOR-style models:
+- Load a user-specified HF model
+- Accept a user prompt (arg or stdin)
+- Generate with Hugging Face Transformers
+- Save prompt, rendered prompt, thinking/content/full output, and graph artifacts
+- Extract <graph_json>...</graph_json>, parse JSON, build NetworkX DiGraph
+- Render graph to PNG + SVG (Graphviz dot if available, else spring layout)
+- Robust fail-safe crash handling + atomic writes
+
+Example:
+  python graph_reasoning.py \
+    --model lamm-mit/Graph-Preflexor-8b_12292025 \
+    --prompt "Explain dragline silk toughness."
+
+Stdin prompt:
+  echo "Your prompt here" | python graph_reasoning.py --model ... --prompt -
+
+Notes:
+- If the model uses a different thinking end token, pass --think-end-token-id
+- If the model doesn't support enable_thinking in apply_chat_template, we fall back safely.
+"""
+
+import os
+import re
+import sys
+import json
+import math
+import time
+import argparse
+import logging
+from datetime import datetime
+from typing import Optional, Tuple, Any, Dict
+
+import torch
+import networkx as nx
+import matplotlib.pyplot as plt
+from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
+
+
+# ==============================================================================
+# Constants / defaults
+# ==============================================================================
+
+GRAPH_JSON_OPEN = "<graph_json>"
+GRAPH_JSON_CLOSE = "</graph_json>"
+
+
+# ==============================================================================
+# Helpers: filesystem + parsing
+# ==============================================================================
+
+def atomic_write_text(path: str, text: str) -> None:
+    """Write text atomically to avoid partial files on crash."""
+    tmp = path + ".tmp"
+    with open(tmp, "w", encoding="utf-8") as f:
+        f.write(text)
+    os.replace(tmp, path)
+
+
+def atomic_write_bytes(path: str, data: bytes) -> None:
+    """Atomic binary write."""
+    tmp = path + ".tmp"
+    with open(tmp, "wb") as f:
+        f.write(data)
+    os.replace(tmp, path)
+
+
+def safe_json_loads(s: str) -> Optional[Any]:
+    """Best-effort JSON parsing."""
+    try:
+        return json.loads(s)
+    except Exception:
+        return None
+
+
+def now_run_id() -> str:
+    return datetime.now().strftime("%Y%m%d_%H%M%S")
+
+
+def resolve_prompt(prompt_arg: str) -> str:
+    """
+    Resolve prompt from:
+      - literal string
+      - '-' meaning read stdin fully
+      - '@path' meaning read prompt from file
+    """
+    if prompt_arg == "-":
+        return sys.stdin.read().strip()
+    if prompt_arg.startswith("@"):
+        path = prompt_arg[1:]
+        with open(path, "r", encoding="utf-8") as f:
+            return f.read().strip()
+    return prompt_arg
+
+
+def split_thinking_by_token_id(
+    output_ids: list,
+    tokenizer,
+    think_end_id: Optional[int],
+) -> Tuple[str, str]:
+    """
+    Split generated token ids into (thinking, final_content) based on think_end_id.
+    If think_end_id is None or not found, returns ("", decoded_all) as a safe fallback.
+    """
+    if think_end_id is None:
+        return "", tokenizer.decode(output_ids, skip_special_tokens=True).strip()
+
+    try:
+        # Find first occurrence of think_end_id
+        idx = output_ids.index(think_end_id) + 1
+    except ValueError:
+        idx = 0
+
+    thinking = tokenizer.decode(output_ids[:idx], skip_special_tokens=True).strip()
+    content = tokenizer.decode(output_ids[idx:], skip_special_tokens=True).strip()
+    return thinking, content
+
+
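+def extract_graph_json_block(text: str) -> Tuple[Optional[str], Optional[dict]]:
+    """
+    Extract the first <graph_json>...</graph_json> block.
+
+    Returns:
+      - (raw_json_text, parsed_obj) if the block parses to a JSON object
+      - (raw_inner_text, None) if a block is found but cannot be parsed
+      - (None, None) if no block is found
+
+    Fail-safe recovery:
+      - try parsing the inner content as-is
+      - else parse the span from the first '{' to the last '}' inside the block
+    """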
+    m = re.search(
+        rf"{re.escape(GRAPH_JSON_OPEN)}(.*?){re.escape(GRAPH_JSON_CLOSE)}",
+        text,
+        flags=re.DOTALL,
+    )
+    if not m:
+        return None, None
+
+    inner = m.group(1).strip()
+
+    obj = safe_json_loads(inner)
+    if obj is not None and isinstance(obj, dict):
+        return inner, obj
+
+    i1 = inner.find("{")
+    i2 = inner.rfind("}")
+    if i1 != -1 and i2 != -1 and i2 > i1:
+        candidate = inner[i1 : i2 + 1].strip()
+        obj2 = safe_json_loads(candidate)
+        if obj2 is not None and isinstance(obj2, dict):
+            return candidate, obj2
+
+    return inner, None
+
+
+# ==============================================================================
+# Graph utilities
+# ==============================================================================
+
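+def build_nx_graph(graph_obj: Dict[str, Any]) -> nx.DiGraph:
+    """
+    Build a NetworkX DiGraph from JSON:
+      graph_obj["nodes"] = [{"id": "...", ...}, ...]
+      graph_obj["edges"] = [{"source":"...", "target":"...", "relation":"...", ...}, ...]
+
+    Non-dict entries, nodes with a missing or empty "id", and edges with a missing or
+    empty "source" or "target" are skipped. Edge endpoints not listed in "nodes" are
+    added as attribute-less nodes.
+    """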
+    G = nx.DiGraph()
+
+    nodes = graph_obj.get("nodes", []) or []
+    edges = graph_obj.get("edges", []) or []
+
+    for n in nodes:
+        if not isinstance(n, dict):
+            continue
+        nid = n.get("id")
+        if nid:
+            attrs = {k: v for k, v in n.items() if k != "id"}
+            G.add_node(nid, **attrs)
+
+    for e in edges:
+        if not isinstance(e, dict):
+            continue
+        src = e.get("source")
+        tgt = e.get("target")
+        if not (src and tgt):
+            continue
+        rel = e.get("relation", "")
+        attrs = {k: v for k, v in e.items() if k not in ("source", "target")}
+        attrs["relation"] = rel
+
+        if src not in G:
+            G.add_node(src)
+        if tgt not in G:
+            G.add_node(tgt)
+        G.add_edge(src, tgt, **attrs)
+
+    return G
+
+
+def layout_graph(G: nx.DiGraph):
+    """
+    Return (pos, layout_name).
+    Prefer Graphviz 'dot' layout (via pydot) if available; else a seeded spring layout.
+    """
+    try:
+        from networkx.drawing.nx_pydot import graphviz_layout
+        pos = graphviz_layout(G, prog="dot")
+        return pos, "graphviz(dot)"
+    except Exception:
+        pos = nx.spring_layout(G, seed=7, k=0.9)
+        return pos, "spring_layout"
+
+
+def visualize_and_save_graph(G: nx.DiGraph, out_dir: str, title: str, log: logging.Logger):
+    """
+    Render and save PNG + SVG with edge relation labels.
+    Returns (png_path, svg_path), or (None, None) if the graph has no nodes.
+    Fail-safe: saves a minimal plot if the full rendering fails.
+    """
+    png_path = os.path.join(out_dir, "graph.png")
+    svg_path = os.path.join(out_dir, "graph.svg")
+
+    if G.number_of_nodes() == 0:
+        log.warning("Graph has 0 nodes; skipping visualization.")
+        return None, None
+
+    pos, layout_used = layout_graph(G)
+    log.info(f"Graph layout: {layout_used} | nodes={G.number_of_nodes()} edges={G.number_of_edges()}")
+
+    n = G.number_of_nodes()
+    fig_w = min(22, max(12, 0.9 * math.sqrt(n) * 8))
+    fig_h = min(12, max(7, 0.6 * math.sqrt(n) * 6))
+
+    plt.figure(figsize=(fig_w, fig_h))
+    try:
+        nx.draw_networkx_nodes(G, pos, node_size=2200, linewidths=1.2)
+        nx.draw_networkx_edges(G, pos, arrows=True, arrowstyle="-|>", arrowsize=18, width=1.6)
+        nx.draw_networkx_labels(G, pos, font_size=10)
+
+        edge_labels = {(u, v): (d.get("relation") or "") for u, v, d in G.edges(data=True)}
+        nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels, font_size=9, rotate=False)
+
+        plt.title(f"{title} ({layout_used})")
+        plt.axis("off")
+        plt.tight_layout()
+        plt.savefig(png_path, dpi=300, bbox_inches="tight")
+        plt.savefig(svg_path, bbox_inches="tight")
+        plt.close()
+        return png_path, svg_path
+
+    except Exception as e:
+        log.exception(f"Visualization failed (attempting minimal save): {e}")
+        plt.close()  # discard the partially drawn figure instead of leaking it
+        plt.figure(figsize=(12, 7))
+        nx.draw(G, with_labels=True)
+        plt.title(f"{title} (minimal)")
+        plt.axis("off")
+        plt.tight_layout()
+        plt.savefig(png_path, dpi=200, bbox_inches="tight")
+        plt.savefig(svg_path, bbox_inches="tight")
+        plt.close()
+        return png_path, svg_path
+
+
+# ==============================================================================
+# Tokenizer / prompt template compatibility
+# ==============================================================================
+
+def render_chat_prompt(tokenizer, user_prompt: str, enable_thinking: bool, log: logging.Logger) -> str:
+    """
+    Render prompt using chat template when available.
+    - Tries enable_thinking=True if requested.
+    - Falls back to enable_thinking=False.
+    - Falls back to a minimal plain prompt if apply_chat_template fails.
+    """
+    messages = [{"role": "user", "content": user_prompt}]
+
+    if hasattr(tokenizer, "apply_chat_template"):
+        # Try with enable_thinking if requested
+        if enable_thinking:
+            try:
+                return tokenizer.apply_chat_template(
+                    messages,
+                    tokenize=False,
+                    add_generation_prompt=True,
+                    enable_thinking=True,
+                )
+            except TypeError as e:
+                # Some tokenizers don't accept enable_thinking kwarg
+                log.warning(f"Tokenizer chat template does not support enable_thinking kwarg: {e}")
+            except Exception as e:
+                log.warning(f"apply_chat_template(enable_thinking=True) failed; falling back: {e}")
+
+        # Try without enable_thinking
+        try:
+            return tokenizer.apply_chat_template(
+                messages,
+                tokenize=False,
+                add_generation_prompt=True,
+            )
+        except Exception as e:
+            log.warning(f"apply_chat_template failed; falling back to plain prompt: {e}")
+
+    # Plain prompt fallback
+    return user_prompt.strip()
+
+
+# ==============================================================================
+# Main
+# ==============================================================================
+
+def parse_args() -> argparse.Namespace:
+    p = argparse.ArgumentParser(
+        description="CLI Graph Reasoning Runner (Graph-PRefLexOR style): generate, extract <graph_json>, visualize.",
+        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
+    )
+
+    # Model/token/auth
+    p.add_argument("--model", required=True, help="Hugging Face model name or local path")
+    p.add_argument("--hf-token", default=None, help="HF token (or set HF_TOKEN env var)")
+    p.add_argument("--revision", default=None, help="Model revision (branch/tag/commit)")
+
+    # Prompt
+    p.add_argument(
+        "--prompt",
+        required=True,
+        help="Prompt text, or '-' for stdin, or '@path' to read from file",
+    )
+    p.add_argument(
+        "--enable-thinking",
+        action="store_true",
+        help="Attempt to enable thinking via tokenizer.apply_chat_template(enable_thinking=True)",
+    )
+
+    # Generation
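+    p.add_argument("--max-new-tokens", type=int, default=32768, help="Maximum number of tokens to generate")
+    p.add_argument("--temperature", type=float, default=0.2, help="Sampling temperature (only meaningful with --do-sample)")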
+    p.add_argument("--do-sample", action="store_true", help="Enable sampling")
+    p.add_argument("--top-p", type=float, default=None, help="Optional top_p")
+    p.add_argument("--top-k", type=int, default=None, help="Optional top_k")
+    p.add_argument("--repetition-penalty", type=float, default=None, help="Optional repetition penalty")
+
+    # Thinking split
+    p.add_argument(
+        "--think-end-token-id",
+        type=int,
+        default=None,
+        help="Token id marking end of thinking (e.g., 151668). If unset, no splitting occurs.",
+    )
+
+    # Output
+    p.add_argument("--out-dir", default=None, help="Output directory (default: ./run_<timestamp>)")
+    p.add_argument("--run-id", default=None, help="Optional custom run id (default: timestamp)")
+    p.add_argument("--print-thinking", action="store_true", help="Also print the thinking section to stdout")
+    p.add_argument("--no-print", action="store_true", help="Do not print model output to stdout")
+
+    # Performance/device
+    p.add_argument("--dtype", default="auto", choices=["auto", "float16", "bfloat16", "float32"], help="torch_dtype")
+    p.add_argument("--device-map", default="auto", help="Transformers device_map (e.g., auto, cuda:0, cpu)")
+    p.add_argument("--attn-impl", default=None, help="Optional attn_implementation (e.g., flash_attention_2)")
+
+    return p.parse_args()
+
+
+def setup_outdir(run_id: str, out_dir_arg: Optional[str]) -> str:
+    if out_dir_arg:
+        out_dir = os.path.abspath(out_dir_arg)
+    else:
+        out_dir = os.path.abspath(f"./run_{run_id}")
+    os.makedirs(out_dir, exist_ok=True)
+    return out_dir
+
+
+def setup_logger(out_dir: str) -> logging.Logger:
+    log_path = os.path.join(out_dir, "run.log")
+    logger = logging.getLogger("graph_reasoning")
+    logger.setLevel(logging.INFO)
+    logger.handlers = []  # avoid duplicate handlers in repeated runs
+
+    fmt = logging.Formatter("%(asctime)s | %(levelname)s | %(message)s")
+    fh = logging.FileHandler(log_path)
+    fh.setFormatter(fmt)
+    sh = logging.StreamHandler(sys.stdout)
+    sh.setFormatter(fmt)
+
+    logger.addHandler(fh)
+    logger.addHandler(sh)
+    return logger
+
+
+def torch_dtype_from_arg(dtype: str):
+    if dtype == "auto":
+        return "auto"
+    if dtype == "float16":
+        return torch.float16
+    if dtype == "bfloat16":
+        return torch.bfloat16
+    if dtype == "float32":
+        return torch.float32
+    return "auto"
+
+
+def main() -> int:
+    args = parse_args()
+
+    run_id = args.run_id or now_run_id()
+    out_dir = setup_outdir(run_id, args.out_dir)
+    log = setup_logger(out_dir)
+
+    hf_token = args.hf_token or os.environ.get("HF_TOKEN") or os.environ.get("HUGGINGFACE_TOKEN")
+
+    # Persist run metadata early
+    meta = {
+        "run_id": run_id,
+        "timestamp": datetime.now().isoformat(),
+        "model": args.model,
+        "revision": args.revision,
+        "max_new_tokens": args.max_new_tokens,
+        "temperature": args.temperature,
+        "do_sample": bool(args.do_sample),
+        "top_p": args.top_p,
+        "top_k": args.top_k,
+        "repetition_penalty": args.repetition_penalty,
+        "think_end_token_id": args.think_end_token_id,
+        "enable_thinking": bool(args.enable_thinking),
+        "dtype": args.dtype,
+        "device_map": args.device_map,
+        "attn_impl": args.attn_impl,
+        "python": sys.version,
+        "torch": getattr(torch, "__version__", None),
+    }
+    atomic_write_text(os.path.join(out_dir, "run_meta.json"), json.dumps(meta, indent=2))
+
+    # Resolve prompt
+    prompt = resolve_prompt(args.prompt)
+    if not prompt:
+        log.error("Prompt is empty.")
+        return 2
+
+    atomic_write_text(os.path.join(out_dir, "prompt.txt"), prompt)
+
+    log.info(f"Output dir: {out_dir}")
+    log.info(f"Model: {args.model}")
+    if args.revision:
+        log.info(f"Revision: {args.revision}")
+    log.info("Loading tokenizer/model...")
+
+    # Load tokenizer/model
+    tok_kwargs = {"token": hf_token} if hf_token else {}
+    if args.revision:
+        tok_kwargs["revision"] = args.revision
+
+    tokenizer = AutoTokenizer.from_pretrained(args.model, **tok_kwargs)
+
+    model_kwargs = {
+        "device_map": args.device_map,
+        "token": hf_token if hf_token else None,
+    }
+    if args.revision:
+        model_kwargs["revision"] = args.revision
+
+    # torch_dtype_from_arg returns either "auto" or a torch dtype; both are valid values here
+    model_kwargs["torch_dtype"] = torch_dtype_from_arg(args.dtype)
+
+    if args.attn_impl:
+        model_kwargs["attn_implementation"] = args.attn_impl
+
+    model = AutoModelForCausalLM.from_pretrained(args.model, **model_kwargs)
+    model.eval()
+
+    # Render chat prompt
+    rendered = render_chat_prompt(tokenizer, prompt, enable_thinking=args.enable_thinking, log=log)
+    atomic_write_text(os.path.join(out_dir, "prompt_rendered.txt"), rendered)
+
+    # Tokenize
+    model_inputs = tokenizer(rendered, return_tensors="pt")
+
+    # Move inputs to model device where possible
+    try:
+        model_inputs = {k: v.to(model.device) for k, v in model_inputs.items()}
+    except Exception:
+        # In some device_map setups, model.device may not be meaningful; leave as-is.
+        pass
+
+    # Generation config
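+    gen_cfg_kwargs = dict(
+        max_new_tokens=args.max_new_tokens,
+        do_sample=bool(args.do_sample),
+    )
+    if args.do_sample:
+        # temperature only affects sampling; with greedy decoding it merely triggers a warning
+        gen_cfg_kwargs["temperature"] = float(args.temperature)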
+    if args.top_p is not None:
+        gen_cfg_kwargs["top_p"] = float(args.top_p)
+    if args.top_k is not None:
+        gen_cfg_kwargs["top_k"] = int(args.top_k)
+    if args.repetition_penalty is not None:
+        gen_cfg_kwargs["repetition_penalty"] = float(args.repetition_penalty)
+
+    gen_config = GenerationConfig(**gen_cfg_kwargs)
+
+    log.info("Generating...")
+    t0 = time.time()
+    with torch.no_grad():
+        generated = model.generate(**model_inputs, generation_config=gen_config)
+    t1 = time.time()
+    log.info(f"Generation done in {t1 - t0:.2f}s")
+
+    # Slice off prompt tokens to get only generated continuation
+    input_len = model_inputs["input_ids"].shape[1]
+    output_ids = generated[0, input_len:].tolist()
+
+    thinking, content = split_thinking_by_token_id(output_ids, tokenizer, args.think_end_token_id)
+
+    # Persist outputs (always)
+    atomic_write_text(os.path.join(out_dir, "thinking.txt"), thinking or "")
+    atomic_write_text(os.path.join(out_dir, "content.txt"), content or "")
+    atomic_write_text(os.path.join(out_dir, "full_output.txt"), (thinking + "\n\n" + content).strip())
+
+    # Print
+    if not args.no_print:
+        if args.print_thinking and thinking:
+            sys.stdout.write("\n" + "=" * 80 + "\nTHINKING\n" + "=" * 80 + "\n")
+            sys.stdout.write(thinking + "\n")
+        sys.stdout.write("\n" + "=" * 80 + "\nFINAL OUTPUT\n" + "=" * 80 + "\n")
+        sys.stdout.write(content + "\n")
+        sys.stdout.flush()
+
+    # Extract graph json
+    raw_block, graph_obj = extract_graph_json_block((thinking or "") + "\n" + (content or ""))
+
+    if raw_block is None:
+        log.warning("No <graph_json>...</graph_json> block found in output.")
+        atomic_write_text(os.path.join(out_dir, "graph_status.txt"), "not_found")
+        return 0
+
+    atomic_write_text(os.path.join(out_dir, "graph_json_raw.txt"), raw_block)
+
+    if graph_obj is None:
+        log.warning("Found <graph_json> block, but JSON parsing failed. Saved raw block for inspection.")
+        atomic_write_text(os.path.join(out_dir, "graph_status.txt"), "found_but_parse_failed")
+        return 0
+
+    atomic_write_text(os.path.join(out_dir, "graph.json"), json.dumps(graph_obj, indent=2, ensure_ascii=False))
+    atomic_write_text(os.path.join(out_dir, "graph_status.txt"), "parsed_ok")
+
+    # Build & visualize graph
+    G = build_nx_graph(graph_obj)
+    atomic_write_text(
+        os.path.join(out_dir, "graph_stats.json"),
+        json.dumps(
+            {"nodes": G.number_of_nodes(), "edges": G.number_of_edges()},
+            indent=2,
+        ),
+    )
+
+    png_path, svg_path = visualize_and_save_graph(G, out_dir, title="Graph Reasoning Output Graph", log=log)
+    if png_path and svg_path:
+        log.info(f"Saved graph: {png_path}")
+        log.info(f"Saved graph: {svg_path}")
+
+    return 0
+
+
+if __name__ == "__main__":
+    # Hard fail-safe: best-effort CRASH marker if an exception escapes main()
+    try:
+        rc = main()
+        raise SystemExit(rc)
+    except SystemExit:
+        raise
+    except Exception as e:
+        # main()'s out_dir is not visible here, so the target directory is guessed
+        try:
+            # Heuristic: use the lexicographically latest run_* directory in cwd, else cwd itself.
+            # A custom --out-dir is NOT consulted, so the marker may land outside the actual run directory.
+            # (We do not attempt to re-parse args here to avoid cascading failures.)
+            candidates = []
+            for name in os.listdir("."):
+                if name.startswith("run_") and os.path.isdir(name):
+                    candidates.append(name)
+            candidates.sort(reverse=True)
+            fallback_dir = os.path.abspath(candidates[0]) if candidates else os.path.abspath("./")
+            atomic_write_text(os.path.join(fallback_dir, "CRASH.txt"), repr(e))
+        except Exception:
+            pass
+        raise
--- a/merges.txt
+++ b/merges.txt
--- a/model-00001-of-00004.safetensors
+++ b/model-00001-of-00004.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:23394487aff9fa69558a6ee37df9494f3d239690678f0b58b3eead55f3fab284
+size 4902257696
--- a/model-00002-of-00004.safetensors
+++ b/model-00002-of-00004.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:40aff3e148194698d8333cb0c823c3b5a577368090ef7c275204996e0090ad94
+size 4915960368
--- a/model-00003-of-00004.safetensors
+++ b/model-00003-of-00004.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:376359c567f73634c75a6b24b5f3375c56c35c40532ab97ee175c8d6d5c80dfc
+size 4983068496
--- a/model-00004-of-00004.safetensors
+++ b/model-00004-of-00004.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6f54fc8d407650f4627a02e270e22f22cac0d909ce98786e8ce010b0d0d4ab40
+size 1580230264
--- a/model.safetensors.index.json
+++ b/model.safetensors.index.json
@@ -0,0 +1,407 @@
+{
+  "metadata": {
+    "total_parameters": 8190735360,
+    "total_size": 16381470720
+  },
+  "weight_map": {
+    "lm_head.weight": "model-00004-of-00004.safetensors",
+    "model.embed_tokens.weight": "model-00001-of-00004.safetensors",
+    "model.layers.0.input_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.0.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.0.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.0.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.1.input_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.1.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.1.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.1.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.1.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.10.input_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.10.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.10.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.10.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.10.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.10.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.10.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.10.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.10.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.10.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.10.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.11.input_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.11.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.11.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.11.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.11.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.11.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.11.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.11.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.11.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.11.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.11.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.12.input_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.12.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.12.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.12.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.12.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.13.input_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.13.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.13.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.13.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.13.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.13.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.14.input_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.14.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.14.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.14.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.14.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.14.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.14.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.14.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.15.input_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.15.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.15.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.15.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.15.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.15.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.15.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.15.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.15.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.15.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.16.input_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.16.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.16.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.16.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.16.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.16.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.16.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.16.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.16.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.16.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.16.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.17.input_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.17.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.17.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.17.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.17.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.17.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.17.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.17.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.17.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.17.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.17.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.18.input_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.18.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.18.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.18.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.18.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.18.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.18.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.18.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.18.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.18.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.18.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.19.input_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.19.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.19.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.19.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.19.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.19.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.19.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.19.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.19.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.19.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.19.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.2.input_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.2.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.2.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.2.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.2.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.20.input_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.20.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.20.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.20.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.20.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.20.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.20.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.20.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.20.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.20.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.20.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.21.input_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.21.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.21.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.21.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.21.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.21.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.21.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.21.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.21.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.21.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.21.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.22.input_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.22.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.22.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.22.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.22.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.22.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.22.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.22.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.22.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.22.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.22.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.23.input_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.23.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.23.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.23.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.23.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.23.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.23.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.23.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.23.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.23.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.23.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.24.input_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.24.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.24.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.24.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.24.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.24.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.24.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.24.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.24.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.24.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.24.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.25.input_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.25.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.25.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.25.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.25.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.25.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.25.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.25.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.25.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.25.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.25.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.26.input_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.26.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.26.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.26.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.26.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.26.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.26.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.26.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.26.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.26.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.26.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.27.input_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.27.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.27.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.27.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.27.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.27.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.27.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.27.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.27.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.27.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.27.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.28.input_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.28.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.28.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.28.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.28.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.28.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.28.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.28.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.28.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.28.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.28.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.29.input_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.29.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.29.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.29.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.29.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.29.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.29.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.29.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.29.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.29.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.29.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.3.input_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.3.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.3.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.3.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.3.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.30.input_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.30.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.30.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.30.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.30.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.30.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.30.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.30.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.30.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.30.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.30.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.31.input_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.31.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.31.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.31.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.31.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.31.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.31.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.31.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.31.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.31.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.31.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.32.input_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.32.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.32.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.32.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.32.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.32.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.32.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.32.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.32.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.32.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.32.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.33.input_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.33.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.33.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.33.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.33.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.33.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.33.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.33.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.33.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.33.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.33.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.34.input_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.34.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.34.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.34.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.34.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.34.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.34.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.34.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.34.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
+    "model.layers.34.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.34.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.35.input_layernorm.weight": "model-00004-of-00004.safetensors",
+    "model.layers.35.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
+    "model.layers.35.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
+    "model.layers.35.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
+    "model.layers.35.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
+    "model.layers.35.self_attn.k_norm.weight": "model-00004-of-00004.safetensors",
+    "model.layers.35.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.35.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
+    "model.layers.35.self_attn.q_norm.weight": "model-00004-of-00004.safetensors",
+    "model.layers.35.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.35.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
+    "model.layers.4.input_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.4.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.4.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.4.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.4.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.5.input_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.5.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.5.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.5.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.5.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.6.input_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.6.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.6.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.6.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.6.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.7.input_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.7.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.7.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.7.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.7.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.7.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.7.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.7.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.8.input_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.8.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.8.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.8.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.8.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.8.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.8.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.8.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.8.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.8.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.8.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.9.input_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.9.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.9.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.9.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
+    "model.layers.9.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
+    "model.layers.9.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.9.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.9.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.9.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
+    "model.layers.9.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
+    "model.layers.9.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
+    "model.norm.weight": "model-00004-of-00004.safetensors"
+  }
+}
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1,31 @@
+{
+  "additional_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "eos_token": {
+    "content": "<|im_end|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}
--- a/tokenizer.json
+++ b/tokenizer.json
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1,239 @@
+{
+  "add_bos_token": false,
+  "add_prefix_space": false,
+  "added_tokens_decoder": {
+    "151643": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151644": {
+      "content": "<|im_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151645": {
+      "content": "<|im_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151646": {
+      "content": "<|object_ref_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151647": {
+      "content": "<|object_ref_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151648": {
+      "content": "<|box_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151649": {
+      "content": "<|box_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151650": {
+      "content": "<|quad_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151651": {
+      "content": "<|quad_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151652": {
+      "content": "<|vision_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151653": {
+      "content": "<|vision_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151654": {
+      "content": "<|vision_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151655": {
+      "content": "<|image_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151656": {
+      "content": "<|video_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151657": {
+      "content": "<tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151658": {
+      "content": "</tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151659": {
+      "content": "<|fim_prefix|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151660": {
+      "content": "<|fim_middle|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151661": {
+      "content": "<|fim_suffix|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151662": {
+      "content": "<|fim_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151663": {
+      "content": "<|repo_name|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151664": {
+      "content": "<|file_sep|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151665": {
+      "content": "<tool_response>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151666": {
+      "content": "</tool_response>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151667": {
+      "content": "<think>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151668": {
+      "content": "</think>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    }
+  },
+  "additional_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "bos_token": null,
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|im_end|>",
+  "errors": "replace",
+  "extra_special_tokens": {},
+  "model_max_length": 131072,
+  "pad_token": "<|endoftext|>",
+  "split_special_tokens": false,
+  "tokenizer_class": "Qwen2Tokenizer",
+  "unk_token": null
+}
--- a/vocab.json
+++ b/vocab.json