Initialize project; model provided by the ModelHub XC community

Model: reaperdoesntknow/Symiotic-14B
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-04-20 15:21:08 +08:00
commit be418c02d8
25 changed files with 152471 additions and 0 deletions

36
.gitattributes vendored Normal file

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text

147
README.md Normal file

@@ -0,0 +1,147 @@
---
license: afl-3.0
datasets:
- 0xZee/dataset-CoT-Advanced-Calculus-268
language:
- en
base_model:
- Qwen/Qwen3-14B
pipeline_tag: text-generation
library_name: transformers
tags:
- qwen3
- symbiotic
- symbioticai
- llm
- Symbols
- convergentintel
---
# SymbioticLM-14B
**Model Type**: Hybrid SymbolicTransformer with Persistent Memory
**Base Model**: Qwen3-14B
**Framework**: PyTorch + Hugging Face Transformers
**Purpose**: Full-scale cognitive reasoning model with self-organizing memory and generative symbolic evolution
---
## Overview
SymbioticLM-14B is a 17.8-billion-parameter symbolic-transformer hybrid model that tightly couples high-capacity neural representation with structured symbolic cognition. Designed to match or exceed the performance of top-tier LLMs in symbolic domains, it supports persistent memory, entropic recall, multi-stage symbolic routing, and self-organizing knowledge structures.
This model is ideal for advanced reasoning agents, research assistants, and symbolic math/code generation systems.
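A minimal usage sketch, assuming only what the shipped configs state (a `Qwen3ForCausalLM` checkpoint loadable through the standard Transformers API, bf16 inference, and the sampling defaults from `generation_config.json`); the symbolic memory components described below are not exercised by this plain path:
```python
# Minimal loading/generation sketch; standard Transformers API only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "reaperdoesntknow/Symiotic-14B"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Sketch a proof that e is irrational."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Sampling values mirror generation_config.json: temperature 0.6, top_p 0.95, top_k 20.
out = model.generate(
    inputs, max_new_tokens=512, do_sample=True, temperature=0.6, top_p=0.95, top_k=20
)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```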
---
## Architecture Highlights
- **Backbone**: Qwen3-14B transformer with rotary embeddings + FlashAttention
- **Symbolic Dim**: 8192
- **Symbolic Modules**:
- ThoughtDynamicsLNN (multi-head LSTM attention)
- LiquidThoughtProcessor
- CrystallineProcessor (DNAConv GNN)
- HelicalDNAProcessor (linear helical encoding)
- **Memory**: 4096 symbolic states in FP32, retrieved using entropy + contextual similarity (see the retrieval sketch after this list)
- **Dream Mode**: Background symbolic simulation for open-ended cognition
- **Router**: Intent classifier + entropy gating for processor path selection
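How the memory path might look in code: the sketch below is illustrative, not the repo's actual API; only the bank geometry (4096 FP32 states, symbolic dim 8192) and the entropy-plus-similarity retrieval rule come from this card, while every function name and the gating threshold are assumptions.
```python
# Hypothetical entropy-gated retrieval over the symbolic memory bank.
# Shapes follow the card (4096 states x 8192 dims, FP32); names are illustrative.
import torch
import torch.nn.functional as F

def retrieve(memory: torch.Tensor, query: torch.Tensor, k: int = 8) -> torch.Tensor:
    """memory: (4096, 8192) FP32 bank; query: (8192,) pooled hidden state."""
    sim = F.cosine_similarity(memory, query.unsqueeze(0), dim=-1)  # contextual similarity
    probs = F.softmax(sim, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-9).log()).sum()         # recall entropy
    # Gate: near-uniform similarity (high entropy) means no memory clearly
    # matches, so skip retrieval instead of injecting noise.
    if entropy > 0.9 * torch.log(torch.tensor(float(memory.size(0)))):
        return torch.zeros_like(query)
    top = sim.topk(k)
    weights = F.softmax(top.values, dim=-1)
    return (weights.unsqueeze(-1) * memory[top.indices]).sum(dim=0)
```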
---
## Files Included
| File | Description |
|--------------------------|----------------------------------------------------------|
| `model.bin` | Transformer weights (LFS) |
| `model.safetensors` | Sharded safetensors weights for safe, fast loading |
| `memory.pt` | Bank of 4096 symbolic memory vectors |
| `config.json` | Model and architectural metadata |
| `generation_config.json` | Top-p, temperature, decoding settings |
| `tokenizer.json` | Full tokenizer with symbolic tag support |
| `added_tokens.json` | Tags like `<D_LIM>`, `<PROOF>`, `<BY_MEASURE>`, etc. |
| `special_tokens_map.json`| Special token mapping for tokenizer |
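The sharded safetensors checkpoint ships with a weight-map index (`model.safetensors.index.json`, committed below); a few lines are enough to see how tensors are distributed across the 13 shards:
```python
# Inspect the committed weight map: total byte size and per-shard tensor counts.
import json
from collections import Counter

with open("model.safetensors.index.json") as fh:
    index = json.load(fh)

print(f'total_size: {index["metadata"]["total_size"]:,} bytes')
for shard, n in sorted(Counter(index["weight_map"].values()).items()):
    print(f"{shard}: {n} tensors")
```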
---
## Intended Uses
- Multi-step conversational agents with true memory
- Long-form symbolic theorem generation and proof planning
- Scientific dialogue, symbolic simulations, math/code synthesis
- Reasoning in fuzzy, discontinuous, or non-smooth problem domains
---
## Limitations
- Memory requires curation and seeding for maximum utility
- Symbolic cognition is not instruction-tuned for general QA
- FlashAttention and symbolic modules increase VRAM usage during generation (a rough weight-memory estimate follows this list)
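A rough estimate follows from the committed index alone; the FP32-on-disk assumption is implied by the shard byte counts for a ~14.8B-parameter Qwen3 backbone:
```python
# Back-of-envelope weight-memory arithmetic from model.safetensors.index.json.
total_size = 59_073_228_800               # bytes, from the index metadata
params = total_size // 4                  # assumes FP32 shards: 4 bytes/param
print(f"{params / 1e9:.2f}B parameters")  # ~14.77B
print(f"~{params * 2 / 2**30:.1f} GiB for bf16 weights alone")  # ~27.5 GiB
# KV cache, activations, and the FP32 symbolic memory bank come on top.
```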
---
## Citations
Please cite "SymbioticLM" when using symbolic memory components in research or applications.
---
## Convergent Intelligence Portfolio
*Part of the [Symbiotic AI Series](https://huggingface.co/reaperdoesntknow) by [Convergent Intelligence LLC: Research Division](https://huggingface.co/reaperdoesntknow)*
## Mathematical Foundations: Discrepancy Calculus (DISC)
SymbioticLM's persistent memory and symbolic evolution connect to Discrepancy Calculus through **self-generating completeness** (Ch. 3 of the DISC monograph) and **symbolic-root domains**. The discrepancy operator:
$$Df(x) = \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon} \int_x^{x+\varepsilon} \frac{|f(t) - f(x)|}{|t - x|}\, dt$$
quantifies local mismatch between integration and differentiation. In the symbolic-transformer context, $D$ measures the gap between what the symbolic system encodes (discrete structure) and what the transformer integrates (continuous representation). The self-generating completeness theorem establishes that completeness emerges dynamically via energy computation on symbolic-root domains — the mathematical foundation for why symbolic-neural hybrids can produce structure that neither component generates alone.
The **discrepancy energy** $E_{\text{disc}}[f] = \frac{1}{2}\int w(x)(Df(x))^2 d\mu(x)$ provides a natural stability criterion for the memory consolidation process: memory states with bounded discrepancy energy are stable; those with divergent energy indicate structural transitions requiring reorganization.
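For intuition, a crude numerical reading of $D$ and $E_{\text{disc}}$ (the one-sided grid, uniform weight, and test functions are illustrative assumptions, not part of the theory):
```python
# Discretized discrepancy operator and energy; smooth functions stay bounded,
# while a jump makes the local integrand ~ 1/|t - x| and the energy blow up.
import numpy as np

def D(f, x, eps=1e-3, n=200):
    t = np.linspace(x + eps / n, x + eps, n)          # avoid t == x
    return np.trapz(np.abs(f(t) - f(x)) / np.abs(t - x), t) / eps

def E_disc(f, xs, w=None):
    w = np.ones_like(xs) if w is None else w
    Df = np.array([D(f, x) for x in xs])
    return 0.5 * np.trapz(w * Df**2, xs)

xs = np.linspace(-0.5, 0.5, 101)
print(E_disc(lambda x: x**2, xs))  # smooth: D ~ |f'|, bounded energy
print(E_disc(np.sign, xs))         # jump at 0: divergent energy, a structural transition
```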
Full theory: *"On the Formal Analysis of Discrepancy Calculus"* (Colca, 2026; Convergent Intelligence LLC: Research Division).
## Related Models
| Model | Downloads | Format |
|-------|-----------|--------|
| [Symbiotic-1B](https://huggingface.co/reaperdoesntknow/Symbiotic-1B) | 4 | HF |
| [Symbiotic-8B](https://huggingface.co/reaperdoesntknow/Symbiotic-8B) | 4 | HF |
| [Symbiotic-Beta](https://huggingface.co/reaperdoesntknow/Symbiotic-Beta) | 3 | HF |
### Top Models from Our Lab
| Model | Downloads |
|-------|-----------|
| [Qwen3-1.7B-Thinking-Distil](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Thinking-Distil) | 501 |
| [LFM2.5-1.2B-Distilled-SFT](https://huggingface.co/reaperdoesntknow/LFM2.5-1.2B-Distilled-SFT) | 342 |
| [Qwen3-1.7B-Coder-Distilled-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT) | 302 |
| [Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT-GGUF](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT-GGUF) | 203 |
| [Qwen3-1.7B-Coder-Distilled-SFT-GGUF](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT-GGUF) | 194 |
**Total portfolio: 49 models, 22,598 downloads**
*Last updated: 2026-03-28 12:57 UTC*
<!-- CIX-CROSSLINK-START -->
---
## From the Convergent Intelligence Portfolio
**[DistilQwen Collection](https://huggingface.co/collections/reaperdoesntknow/distilqwen-69bf40ec669117e3f069ef1c)** — Our only BF16 series. Proof-weighted distillation from Qwen3-30B-A3B → 1.7B and 0.6B on H100. Three teacher variants (Instruct, Thinking, Coder), nine models, 2,788 combined downloads. The rest of the portfolio proves structure beats scale on CPU. This collection shows what happens when you give the methodology real hardware.
Top model: [Qwen3-1.7B-Coder-Distilled-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT) — 508 downloads
Full methodology: [Structure Over Scale (DOI: 10.57967/hf/8165)](https://doi.org/10.57967/hf/8165)
*Convergent Intelligence LLC: Research Division*
<!-- CIX-CROSSLINK-END -->

28
added_tokens.json Normal file

@@ -0,0 +1,28 @@
{
"</think>": 151668,
"</tool_call>": 151658,
"</tool_response>": 151666,
"<think>": 151667,
"<tool_call>": 151657,
"<tool_response>": 151665,
"<|box_end|>": 151649,
"<|box_start|>": 151648,
"<|endoftext|>": 151643,
"<|file_sep|>": 151664,
"<|fim_middle|>": 151660,
"<|fim_pad|>": 151662,
"<|fim_prefix|>": 151659,
"<|fim_suffix|>": 151661,
"<|im_end|>": 151645,
"<|im_start|>": 151644,
"<|image_pad|>": 151655,
"<|object_ref_end|>": 151647,
"<|object_ref_start|>": 151646,
"<|quad_end|>": 151651,
"<|quad_start|>": 151650,
"<|repo_name|>": 151663,
"<|video_pad|>": 151656,
"<|vision_end|>": 151653,
"<|vision_pad|>": 151654,
"<|vision_start|>": 151652
}

31
config.json Normal file

@@ -0,0 +1,31 @@
{
"_attn_implementation_autoset": true,
"architectures": [
"Qwen3ForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 5120,
"initializer_range": 0.02,
"intermediate_size": 17408,
"max_position_embeddings": 40960,
"max_window_layers": 40,
"model_type": "qwen3",
"num_attention_heads": 40,
"num_hidden_layers": 40,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.51.3",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}

74
generation_config.json Normal file

@@ -0,0 +1,74 @@
{
"max_length": 20,
"max_new_tokens": null,
"min_length": 0,
"min_new_tokens": null,
"early_stopping": false,
"max_time": null,
"stop_strings": null,
"do_sample": true,
"num_beams": 1,
"num_beam_groups": 1,
"penalty_alpha": null,
"dola_layers": null,
"use_cache": true,
"cache_implementation": null,
"cache_config": null,
"return_legacy_cache": null,
"prefill_chunk_size": null,
"temperature": 0.6,
"top_k": 20,
"top_p": 0.95,
"min_p": null,
"typical_p": 1.0,
"epsilon_cutoff": 0.0,
"eta_cutoff": 0.0,
"diversity_penalty": 0.0,
"repetition_penalty": 1.0,
"encoder_repetition_penalty": 1.0,
"length_penalty": 1.0,
"no_repeat_ngram_size": 0,
"bad_words_ids": null,
"force_words_ids": null,
"renormalize_logits": false,
"constraints": null,
"forced_bos_token_id": null,
"forced_eos_token_id": null,
"remove_invalid_values": false,
"exponential_decay_length_penalty": null,
"suppress_tokens": null,
"begin_suppress_tokens": null,
"forced_decoder_ids": null,
"sequence_bias": null,
"token_healing": false,
"guidance_scale": null,
"low_memory": null,
"watermarking_config": null,
"num_return_sequences": 1,
"output_attentions": false,
"output_hidden_states": false,
"output_scores": false,
"output_logits": null,
"return_dict_in_generate": false,
"pad_token_id": 151643,
"bos_token_id": 151643,
"eos_token_id": [
151645,
151643
],
"encoder_no_repeat_ngram_size": 0,
"decoder_start_token_id": null,
"is_assistant": false,
"num_assistant_tokens": 20,
"num_assistant_tokens_schedule": "constant",
"assistant_confidence_threshold": 0.4,
"prompt_lookup_num_tokens": null,
"max_matching_ngram_size": null,
"assistant_early_exit": null,
"assistant_lookbehind": 10,
"target_lookbehind": 10,
"disable_compile": false,
"generation_kwargs": {},
"_from_model_config": false,
"transformers_version": "4.51.3"
}

151388
merges.txt Normal file

File diff suppressed because it is too large

3
model-00001-of-00013.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:da4e028119ef8bc718cedc7ec554e548ac00d1b1db5c45a660530bf39b5af98e
size 4684558368

3
model-00002-of-00013.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bc7aa23c6a75cef984b3c8a71f57772e90836d1693bc364beab4f794e333d680
size 4676778920

3
model-00003-of-00013.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ad3c47b50dd243618c6ca9fca3ef5d4e096c026c4956a506ee0a22ccc438a2dc
size 4928480056

3
model-00004-of-00013.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7411117e02260a269005757dfa8b6d0f80d8e33ed7f393a27bd568d03187af73
size 4928480080

3
model-00005-of-00013.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8a2d3b7f54b24ccf24ede0fbad390c796be26e6cc3e336513d8efd52b32b304c
size 4676778952

3
model-00006-of-00013.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c498722354638879b074c746816747ea280564b4a28978c158d0248b745b650a
size 4928480096

3
model-00007-of-00013.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6d40d78446eedebe75e9c9da4081ee0e3e86ac05e67bac32c2b345293e85ba2d
size 4928480096

3
model-00008-of-00013.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:df03d235247c6eaf96364d096ae49869af0b5cab9c5b00ce7843564dafdcd4b0
size 4676778952

3
model-00009-of-00013.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4becc4f0b1cd604a798f2d02d917d5f16fafa4c204ac13cf464d092e42047c84
size 4928480096

3
model-00010-of-00013.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:73830f584e55f1820c9e9cee3190c8bc92fd41e14dea4eb717f7947124d13c8e
size 4928480096

3
model-00011-of-00013.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c2f032a543c44f64d73e98d08396dfa651e616b980cda719fe1958e071ad08b0
size 4676778952

3
model-00012-of-00013.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3f8f74af26752a43bf2508f87bb495d73356fc052d43193afccdf9e581ed3ea7
size 2999075752

3
model-00013-of-00013.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:facba960024e8c4baad69193baddfaa7616e9e0cd9d6b539a4e84c8c6d182343
size 3111649408

3
model.bin Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:31be40d2534f26f6bf225a10a4f1824e553f05eb062f985fd53a7d59da531ec0
size 15582153002

450
model.safetensors.index.json Normal file

@@ -0,0 +1,450 @@
{
"metadata": {
"total_size": 59073228800
},
"weight_map": {
"lm_head.weight": "model-00013-of-00013.safetensors",
"model.embed_tokens.weight": "model-00001-of-00013.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00013.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00013.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00013.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00013.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00013.safetensors",
"model.layers.0.self_attn.k_norm.weight": "model-00001-of-00013.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00013.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00013.safetensors",
"model.layers.0.self_attn.q_norm.weight": "model-00001-of-00013.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00013.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00013.safetensors",
"model.layers.1.input_layernorm.weight": "model-00002-of-00013.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00002-of-00013.safetensors",
"model.layers.1.self_attn.k_norm.weight": "model-00001-of-00013.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00013.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00013.safetensors",
"model.layers.1.self_attn.q_norm.weight": "model-00001-of-00013.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00013.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00013.safetensors",
"model.layers.10.input_layernorm.weight": "model-00004-of-00013.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00004-of-00013.safetensors",
"model.layers.10.self_attn.k_norm.weight": "model-00004-of-00013.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.10.self_attn.q_norm.weight": "model-00004-of-00013.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.11.input_layernorm.weight": "model-00004-of-00013.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00004-of-00013.safetensors",
"model.layers.11.self_attn.k_norm.weight": "model-00004-of-00013.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.11.self_attn.q_norm.weight": "model-00004-of-00013.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.12.input_layernorm.weight": "model-00005-of-00013.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00005-of-00013.safetensors",
"model.layers.12.self_attn.k_norm.weight": "model-00004-of-00013.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.12.self_attn.q_norm.weight": "model-00004-of-00013.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.13.input_layernorm.weight": "model-00005-of-00013.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00005-of-00013.safetensors",
"model.layers.13.self_attn.k_norm.weight": "model-00005-of-00013.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.13.self_attn.q_norm.weight": "model-00005-of-00013.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.14.input_layernorm.weight": "model-00005-of-00013.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00005-of-00013.safetensors",
"model.layers.14.self_attn.k_norm.weight": "model-00005-of-00013.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.14.self_attn.q_norm.weight": "model-00005-of-00013.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.15.input_layernorm.weight": "model-00006-of-00013.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00006-of-00013.safetensors",
"model.layers.15.self_attn.k_norm.weight": "model-00005-of-00013.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.15.self_attn.q_norm.weight": "model-00005-of-00013.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00005-of-00013.safetensors",
"model.layers.16.input_layernorm.weight": "model-00006-of-00013.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00006-of-00013.safetensors",
"model.layers.16.self_attn.k_norm.weight": "model-00006-of-00013.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.16.self_attn.q_norm.weight": "model-00006-of-00013.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.17.input_layernorm.weight": "model-00006-of-00013.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00006-of-00013.safetensors",
"model.layers.17.self_attn.k_norm.weight": "model-00006-of-00013.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.17.self_attn.q_norm.weight": "model-00006-of-00013.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.18.input_layernorm.weight": "model-00006-of-00013.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00006-of-00013.safetensors",
"model.layers.18.self_attn.k_norm.weight": "model-00006-of-00013.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.18.self_attn.q_norm.weight": "model-00006-of-00013.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.19.input_layernorm.weight": "model-00007-of-00013.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00007-of-00013.safetensors",
"model.layers.19.self_attn.k_norm.weight": "model-00006-of-00013.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.19.self_attn.q_norm.weight": "model-00006-of-00013.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00006-of-00013.safetensors",
"model.layers.2.input_layernorm.weight": "model-00002-of-00013.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00002-of-00013.safetensors",
"model.layers.2.self_attn.k_norm.weight": "model-00002-of-00013.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.2.self_attn.q_norm.weight": "model-00002-of-00013.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.20.input_layernorm.weight": "model-00007-of-00013.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00007-of-00013.safetensors",
"model.layers.20.self_attn.k_norm.weight": "model-00007-of-00013.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.20.self_attn.q_norm.weight": "model-00007-of-00013.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.21.input_layernorm.weight": "model-00007-of-00013.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00007-of-00013.safetensors",
"model.layers.21.self_attn.k_norm.weight": "model-00007-of-00013.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.21.self_attn.q_norm.weight": "model-00007-of-00013.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.22.input_layernorm.weight": "model-00007-of-00013.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00007-of-00013.safetensors",
"model.layers.22.self_attn.k_norm.weight": "model-00007-of-00013.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.22.self_attn.q_norm.weight": "model-00007-of-00013.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.23.input_layernorm.weight": "model-00008-of-00013.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00008-of-00013.safetensors",
"model.layers.23.self_attn.k_norm.weight": "model-00007-of-00013.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.23.self_attn.q_norm.weight": "model-00007-of-00013.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00007-of-00013.safetensors",
"model.layers.24.input_layernorm.weight": "model-00008-of-00013.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00008-of-00013.safetensors",
"model.layers.24.self_attn.k_norm.weight": "model-00008-of-00013.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.24.self_attn.q_norm.weight": "model-00008-of-00013.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.25.input_layernorm.weight": "model-00008-of-00013.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00008-of-00013.safetensors",
"model.layers.25.self_attn.k_norm.weight": "model-00008-of-00013.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.25.self_attn.q_norm.weight": "model-00008-of-00013.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.26.input_layernorm.weight": "model-00009-of-00013.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00009-of-00013.safetensors",
"model.layers.26.self_attn.k_norm.weight": "model-00008-of-00013.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.26.self_attn.q_norm.weight": "model-00008-of-00013.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00008-of-00013.safetensors",
"model.layers.27.input_layernorm.weight": "model-00009-of-00013.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00009-of-00013.safetensors",
"model.layers.27.self_attn.k_norm.weight": "model-00009-of-00013.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.27.self_attn.q_norm.weight": "model-00009-of-00013.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.28.input_layernorm.weight": "model-00009-of-00013.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00009-of-00013.safetensors",
"model.layers.28.self_attn.k_norm.weight": "model-00009-of-00013.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.28.self_attn.q_norm.weight": "model-00009-of-00013.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.29.input_layernorm.weight": "model-00009-of-00013.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00009-of-00013.safetensors",
"model.layers.29.self_attn.k_norm.weight": "model-00009-of-00013.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.29.self_attn.q_norm.weight": "model-00009-of-00013.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.3.input_layernorm.weight": "model-00002-of-00013.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00002-of-00013.safetensors",
"model.layers.3.self_attn.k_norm.weight": "model-00002-of-00013.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.3.self_attn.q_norm.weight": "model-00002-of-00013.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.30.input_layernorm.weight": "model-00010-of-00013.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00010-of-00013.safetensors",
"model.layers.30.self_attn.k_norm.weight": "model-00009-of-00013.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.30.self_attn.q_norm.weight": "model-00009-of-00013.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00009-of-00013.safetensors",
"model.layers.31.input_layernorm.weight": "model-00010-of-00013.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00010-of-00013.safetensors",
"model.layers.31.self_attn.k_norm.weight": "model-00010-of-00013.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.31.self_attn.q_norm.weight": "model-00010-of-00013.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.32.input_layernorm.weight": "model-00010-of-00013.safetensors",
"model.layers.32.mlp.down_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.32.mlp.gate_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.32.mlp.up_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.32.post_attention_layernorm.weight": "model-00010-of-00013.safetensors",
"model.layers.32.self_attn.k_norm.weight": "model-00010-of-00013.safetensors",
"model.layers.32.self_attn.k_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.32.self_attn.o_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.32.self_attn.q_norm.weight": "model-00010-of-00013.safetensors",
"model.layers.32.self_attn.q_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.32.self_attn.v_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.33.input_layernorm.weight": "model-00010-of-00013.safetensors",
"model.layers.33.mlp.down_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.33.mlp.gate_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.33.mlp.up_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.33.post_attention_layernorm.weight": "model-00010-of-00013.safetensors",
"model.layers.33.self_attn.k_norm.weight": "model-00010-of-00013.safetensors",
"model.layers.33.self_attn.k_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.33.self_attn.o_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.33.self_attn.q_norm.weight": "model-00010-of-00013.safetensors",
"model.layers.33.self_attn.q_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.33.self_attn.v_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.34.input_layernorm.weight": "model-00011-of-00013.safetensors",
"model.layers.34.mlp.down_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.34.mlp.gate_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.34.mlp.up_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.34.post_attention_layernorm.weight": "model-00011-of-00013.safetensors",
"model.layers.34.self_attn.k_norm.weight": "model-00010-of-00013.safetensors",
"model.layers.34.self_attn.k_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.34.self_attn.o_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.34.self_attn.q_norm.weight": "model-00010-of-00013.safetensors",
"model.layers.34.self_attn.q_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.34.self_attn.v_proj.weight": "model-00010-of-00013.safetensors",
"model.layers.35.input_layernorm.weight": "model-00011-of-00013.safetensors",
"model.layers.35.mlp.down_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.35.mlp.gate_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.35.mlp.up_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.35.post_attention_layernorm.weight": "model-00011-of-00013.safetensors",
"model.layers.35.self_attn.k_norm.weight": "model-00011-of-00013.safetensors",
"model.layers.35.self_attn.k_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.35.self_attn.o_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.35.self_attn.q_norm.weight": "model-00011-of-00013.safetensors",
"model.layers.35.self_attn.q_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.35.self_attn.v_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.36.input_layernorm.weight": "model-00011-of-00013.safetensors",
"model.layers.36.mlp.down_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.36.mlp.gate_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.36.mlp.up_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.36.post_attention_layernorm.weight": "model-00011-of-00013.safetensors",
"model.layers.36.self_attn.k_norm.weight": "model-00011-of-00013.safetensors",
"model.layers.36.self_attn.k_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.36.self_attn.o_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.36.self_attn.q_norm.weight": "model-00011-of-00013.safetensors",
"model.layers.36.self_attn.q_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.36.self_attn.v_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.37.input_layernorm.weight": "model-00012-of-00013.safetensors",
"model.layers.37.mlp.down_proj.weight": "model-00012-of-00013.safetensors",
"model.layers.37.mlp.gate_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.37.mlp.up_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.37.post_attention_layernorm.weight": "model-00012-of-00013.safetensors",
"model.layers.37.self_attn.k_norm.weight": "model-00011-of-00013.safetensors",
"model.layers.37.self_attn.k_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.37.self_attn.o_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.37.self_attn.q_norm.weight": "model-00011-of-00013.safetensors",
"model.layers.37.self_attn.q_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.37.self_attn.v_proj.weight": "model-00011-of-00013.safetensors",
"model.layers.38.input_layernorm.weight": "model-00012-of-00013.safetensors",
"model.layers.38.mlp.down_proj.weight": "model-00012-of-00013.safetensors",
"model.layers.38.mlp.gate_proj.weight": "model-00012-of-00013.safetensors",
"model.layers.38.mlp.up_proj.weight": "model-00012-of-00013.safetensors",
"model.layers.38.post_attention_layernorm.weight": "model-00012-of-00013.safetensors",
"model.layers.38.self_attn.k_norm.weight": "model-00012-of-00013.safetensors",
"model.layers.38.self_attn.k_proj.weight": "model-00012-of-00013.safetensors",
"model.layers.38.self_attn.o_proj.weight": "model-00012-of-00013.safetensors",
"model.layers.38.self_attn.q_norm.weight": "model-00012-of-00013.safetensors",
"model.layers.38.self_attn.q_proj.weight": "model-00012-of-00013.safetensors",
"model.layers.38.self_attn.v_proj.weight": "model-00012-of-00013.safetensors",
"model.layers.39.input_layernorm.weight": "model-00012-of-00013.safetensors",
"model.layers.39.mlp.down_proj.weight": "model-00012-of-00013.safetensors",
"model.layers.39.mlp.gate_proj.weight": "model-00012-of-00013.safetensors",
"model.layers.39.mlp.up_proj.weight": "model-00012-of-00013.safetensors",
"model.layers.39.post_attention_layernorm.weight": "model-00012-of-00013.safetensors",
"model.layers.39.self_attn.k_norm.weight": "model-00012-of-00013.safetensors",
"model.layers.39.self_attn.k_proj.weight": "model-00012-of-00013.safetensors",
"model.layers.39.self_attn.o_proj.weight": "model-00012-of-00013.safetensors",
"model.layers.39.self_attn.q_norm.weight": "model-00012-of-00013.safetensors",
"model.layers.39.self_attn.q_proj.weight": "model-00012-of-00013.safetensors",
"model.layers.39.self_attn.v_proj.weight": "model-00012-of-00013.safetensors",
"model.layers.4.input_layernorm.weight": "model-00003-of-00013.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00003-of-00013.safetensors",
"model.layers.4.self_attn.k_norm.weight": "model-00002-of-00013.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.4.self_attn.q_norm.weight": "model-00002-of-00013.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00002-of-00013.safetensors",
"model.layers.5.input_layernorm.weight": "model-00003-of-00013.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00003-of-00013.safetensors",
"model.layers.5.self_attn.k_norm.weight": "model-00003-of-00013.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.5.self_attn.q_norm.weight": "model-00003-of-00013.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.6.input_layernorm.weight": "model-00003-of-00013.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00003-of-00013.safetensors",
"model.layers.6.self_attn.k_norm.weight": "model-00003-of-00013.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.6.self_attn.q_norm.weight": "model-00003-of-00013.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.7.input_layernorm.weight": "model-00003-of-00013.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00003-of-00013.safetensors",
"model.layers.7.self_attn.k_norm.weight": "model-00003-of-00013.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.7.self_attn.q_norm.weight": "model-00003-of-00013.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.8.input_layernorm.weight": "model-00004-of-00013.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00004-of-00013.safetensors",
"model.layers.8.self_attn.k_norm.weight": "model-00003-of-00013.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.8.self_attn.q_norm.weight": "model-00003-of-00013.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00003-of-00013.safetensors",
"model.layers.9.input_layernorm.weight": "model-00004-of-00013.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00004-of-00013.safetensors",
"model.layers.9.self_attn.k_norm.weight": "model-00004-of-00013.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.9.self_attn.q_norm.weight": "model-00004-of-00013.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00004-of-00013.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00004-of-00013.safetensors",
"model.norm.weight": "model-00012-of-00013.safetensors"
}
}

31
special_tokens_map.json Normal file

@@ -0,0 +1,31 @@
{
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"eos_token": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

3
tokenizer.json Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:aeb13307a71acd8fe81861d94ad54ab689df773318809eed3cbe794b4492dae4
size 11422654

240
tokenizer_config.json Normal file

@@ -0,0 +1,240 @@
{
"add_bos_token": false,
"add_prefix_space": false,
"added_tokens_decoder": {
"151643": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151644": {
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151645": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151646": {
"content": "<|object_ref_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151647": {
"content": "<|object_ref_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151648": {
"content": "<|box_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151649": {
"content": "<|box_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151650": {
"content": "<|quad_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151651": {
"content": "<|quad_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151652": {
"content": "<|vision_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151653": {
"content": "<|vision_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151654": {
"content": "<|vision_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151655": {
"content": "<|image_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151656": {
"content": "<|video_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151657": {
"content": "<tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151658": {
"content": "</tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151659": {
"content": "<|fim_prefix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151660": {
"content": "<|fim_middle|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151661": {
"content": "<|fim_suffix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151662": {
"content": "<|fim_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151663": {
"content": "<|repo_name|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151664": {
"content": "<|file_sep|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151665": {
"content": "<tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151666": {
"content": "</tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151667": {
"content": "<think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151668": {
"content": "</think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
}
},
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"bos_token": null,
"chat_template": "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0].role == 'system' %}\n {{- messages[0].content + '\\n\\n' }}\n {%- endif %}\n {{- \"# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0].role == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0].content + '<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}\n{%- for message in messages[::-1] %}\n {%- set index = (messages|length - 1) - loop.index0 %}\n {%- if ns.multi_step_tool and message.role == \"user\" and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}\n {%- set ns.multi_step_tool = false %}\n {%- set ns.last_query_index = index %}\n {%- endif %}\n{%- endfor %}\n{%- for message in messages %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) %}\n {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {%- set content = message.content %}\n {%- set reasoning_content = '' %}\n {%- if message.reasoning_content is defined and message.reasoning_content is not none %}\n {%- set reasoning_content = message.reasoning_content %}\n {%- else %}\n {%- if '</think>' in message.content %}\n {%- set content = message.content.split('</think>')[-1].lstrip('\\n') %}\n {%- set reasoning_content = message.content.split('</think>')[0].rstrip('\\n').split('<think>')[-1].lstrip('\\n') %}\n {%- endif %}\n {%- endif %}\n {%- if loop.index0 > ns.last_query_index %}\n {%- if loop.last or (not loop.last and reasoning_content) %}\n {{- '<|im_start|>' + message.role + '\\n<think>\\n' + reasoning_content.strip('\\n') + '\\n</think>\\n\\n' + content.lstrip('\\n') }}\n {%- else %}\n {{- '<|im_start|>' + message.role + '\\n' + content }}\n {%- endif %}\n {%- else %}\n {{- '<|im_start|>' + message.role + '\\n' + content }}\n {%- endif %}\n {%- if message.tool_calls %}\n {%- for tool_call in message.tool_calls %}\n {%- if (loop.first and content) or (not loop.first) %}\n {{- '\\n' }}\n {%- endif %}\n {%- if tool_call.function %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {%- if tool_call.arguments is string %}\n {{- tool_call.arguments }}\n {%- else %}\n {{- tool_call.arguments | tojson }}\n {%- endif %}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {%- endif %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if loop.first or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- message.content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n {%- if enable_thinking is defined and 
enable_thinking is false %}\n {{- '<think>\\n\\n</think>\\n\\n' }}\n {%- endif %}\n{%- endif %}",
"clean_up_tokenization_spaces": false,
"eos_token": "<|im_end|>",
"errors": "replace",
"extra_special_tokens": {},
"model_max_length": 131072,
"pad_token": "<|endoftext|>",
"split_special_tokens": false,
"tokenizer_class": "Qwen2Tokenizer",
"unk_token": null
}

1
vocab.json Normal file

File diff suppressed because one or more lines are too long