初始化项目,由ModelHub XC社区提供模型

Model: BennyDaBall/Qwen3-4b-Z-Image-Engineer-V4
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-06-17 22:40:26 +08:00
commit 6a286e5da8
24 changed files with 152533 additions and 0 deletions

46
.gitattributes vendored Normal file
View File

@@ -0,0 +1,46 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
Qwen3-4b-Z-Image-Engineer-V4-F16.gguf filter=lfs diff=lfs merge=lfs -text
Qwen3-4b-Z-Image-Engineer-V4-Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
Qwen3-4b-Z-Image-Engineer-V4-Q3_K_L.gguf filter=lfs diff=lfs merge=lfs -text
Qwen3-4b-Z-Image-Engineer-V4-Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
Qwen3-4b-Z-Image-Engineer-V4-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
Qwen3-4b-Z-Image-Engineer-V4-Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
Qwen3-4b-Z-Image-Engineer-V4-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
Qwen3-4b-Z-Image-Engineer-V4-Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
Qwen3-4b-Z-Image-Engineer-V4-Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
Qwen3-4b-Z-Image-Engineer-V4-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
safetensors/tokenizer.json filter=lfs diff=lfs merge=lfs -text

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:65478d4f3c0a14da58c5a07bc5ca13c7c40518b627fbdeb8fdee787405ac4014
size 8051284800

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:81890ffcf7a650efb0bceaf94c5a2d8d8204611e238cca78a71084c758a97837
size 1669499200

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3d4fe3d15adc3c60ceab3fbb04e2cfd8976aad79bb5e9b50892818e684772a4a
size 2239785280

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f7895545f58fdf1e1a508914be0f48481e953e2178b8b5211cb46d282262401f
size 2075617600

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:02e05a8e6b1dc87bfe8ea5d6ccf2dada4de0d9e0f856184e6f09c0f03f854cb1
size 2497280320

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6addf2607a0f0e2978898b928c67d062f1c5518b2f44aa01e9d6e2961afa2d60
size 2383309120

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ea735e3cbe3a9f9e2083f17e43fa2ba573e56b7c9768627bce2b60ee7eb8044d
size 2889513280

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d0d7b140332e3fef3da7032254f50837c0164069c674734e24ad6fba4551586e
size 2823711040

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:41451b3d451112f3d3cbae8fb5f3ab7c6ba5a38292a8d69896883262a18635f1
size 3306260800

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:be7b7285f6b80daef5b15affbe96d6626c308ef53dae878568b36664099c71d0
size 4280404800

173
README.md Normal file
View File

@@ -0,0 +1,173 @@
---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- text-generation
- prompt-engineering
- image-generation
- qwen3
- gguf
pipeline_tag: text-generation
---
# 🚀 Z-Engineer V4 (4B)
UPDATE: 6/6/26 **Z-Image-Engineer V6 is live.**
V4 still rocks as the original Qwen3 prompt-engineer build. The new V6 release is the native Z-Image Turbo text-encoder build, rebuilt with SMART DoRA and trained locally for ComfyUI drop-in use plus LM Studio prompt enhancement.
- **Main V6 repo (HF safetensors):** [Z-Image-Engineer-V6](https://huggingface.co/BennyDaBall/Z-Image-Engineer-V6)
- **V6 GGUF quants:** [Z-Image-Engineer-V6-GGUF](https://huggingface.co/BennyDaBall/Z-Image-Engineer-V6-GGUF)
GGUFs live in their own repo for clean downloads. If you want the newest Z-Image Engineer, go V6.
The **Z-Engineer** returns — now with a PhD in "not being mid."
This is Z-Engineer V4, the culmination of extensive research into what makes an AI prompt engineer actually *good* at its job. Built on the Qwen 3 architecture and trained using a novel **SMART Training** methodology, this 4B parameter model doesn't just describe scenes—it understands the craft of visual storytelling down to the lens flare.
---
## 🧠 What is this?
Z-Engineer V4 is a fully fine-tuned (not LoRA, we went *all in*) version of the text encoder from [Tongyi-MAI/Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo). It's been specifically trained to understand the nuances of AI Image Generation workflows.
**It excels at:**
- **Expanding Concepts**: Turn "sad robot in rain" into a cinematic fever dream with chromatic aberration, shallow depth of field, and a melancholic color grade that would make Blade Runner jealous.
- **Technical Precision**: It knows the difference between an 85mm portrait lens and a 24mm wide—and will use them appropriately. Lighting? Rembrandt, split, volumetric fog? It's got opinions.
- **Stylistic Consistency**: It writes with a creative voice, not that robotic "hyperrealistic, 8k, trending on artstation" energy.
---
## 🔑 Key Use Cases
-**Prompt Enhancement**: A low-VRAM powerhouse for turning your braindead 3AM ideas into detailed visual narratives.
- 🔌 **Z-Image Turbo Encoder**: Fully backwards compatible as a drop-in CLIP text encoder for Z-Image Turbo workflows producing varied and unique results from the same seed.
- 🛡️ **Local & Private**: Runs entirely on your machine. No API fees, no data logging, no corporate overlords judging your prompts.
-**Hybrid Power**: Use it to expand a prompt, then use the model itself as the encoder for generation. It's turtles all the way down.
---
## 🧬 What's New in V4: SMART Training
This version introduces **SMART Training** (Smart Mode with Adaptive Regularization Topologer)—a custom training methodology that goes beyond standard cross-entropy optimization.
**The secret sauce?** Four auxiliary regularizers that operate on hidden states, logits, and weight matrices:
| Regularizer | What It Does | Why It Matters |
|-------------|--------------|----------------|
| **Entropic** | Prevents mode collapse, encourages diversity | No more repetitive "cinematic, 8k, masterpiece" loops |
| **Holographic** | Enforces depth-wise information compression | Clean feature hierarchy from surface to abstract |
| **Topological** | Encourages coherent latent trajectories | Prompts flow logically instead of word salad |
| **Manifold** | Stabilizes weight distributions | Rock-solid training dynamics |
**The result?** A model that generalizes better, outputs more varied responses, and doesn't collapse into repetitive patterns even after 55,000 training examples.
---
## 📉 Key Improvements Over V2.5
- **Full Fine-Tune**: V2.5 was a merged LoRA. V4 is a *full parameter fine-tune*—every single weight has been updated.
- **Bigger Dataset**: Trained on **55,000 examples** (vs 34,678 for V2.5)—60% more data.
- **SMART Regularization**: Novel training methodology that actively prevents the failure modes that plagued earlier versions.
- **Longer Training**: 7,500+ optimizer steps with extensive validation checkpointing.
- **Loss Reduction**: 55% decrease in validation loss (2.80 → 1.27) compared to baseline.
---
## 🔌 ComfyUI Integration (Recommended)
I have a custom node for seamless integration with ComfyUI:
- **Features**: Optimized for local OpenAI API compatible backends (LM Studio, Ollama, etc.)
- **Get it here**: [ComfyUI-Z-Engineer](https://github.com/BennyDaBall930/ComfyUI-Z-Engineer)
---
## 📝 Recommended System Prompt
For best results, use this system prompt:
> Interpret the user seed as production intent, then build a definitive 200-250 word single-paragraph image prompt that preserves every explicit constraint while intelligently expanding missing details. First infer the core subject, action, setting, and emotional tone; treat these as non-negotiable anchors. Then enhance with precise visual staging (explicit foreground, midground, background), clear visual hierarchy and eye path, physically plausible lighting (source, direction, softness, color temperature), and optical strategy (if lens/aperture are provided, preserve exactly; if absent, choose fitting lens and aperture and imply their depth-of-field effect). Integrate organic, manufactured, and environmental textures with realistic material behavior, add motion/atmospheric cues only when they support the scene, and apply a coherent color grade consistent with mood and environment. Keep the prose vivid but controlled: no contradictions, no overstuffing, no generic filler. Do not mention camera body brands. Output one polished paragraph only, no bullets, no line breaks, no meta commentary.
---
## 💻 Training Facts
I believe in open science. Here's exactly how this was built:
**Hardware:**
- Trained locally on an AMD Strix Halo system (Ryzen AI Max+ 395, 128GB Unified RAM)
- AMD Radeon 8060S Graphics (ROCm/HIP)
**Dataset:**
- Size: **55,000** high-quality examples
- **25,000 Vision-Grounded Samples**: Real professional photographs transcribed into the training format using [Qwen3-VL-30B-A3B](https://huggingface.co/Qwen/Qwen3-VL-30B-A3B-Instruct)—teaching the model what *actually good* cinematography looks like
- **30,000 Synthetic Samples**: Generated prompt enhancement pairs for diverse concept coverage
- Content: Curated mix teaching the model to extrapolate seed concepts into cinematic prompts grounded in real photographic technique
**Training Configuration:**
| Parameter | Value |
|-----------|-------|
| Method | Full Fine-Tune (not LoRA) |
| Base Model | [Qwen3-4b-Z-Image-Turbo-AbliteratedV1](https://huggingface.co/BennyDaBall/Qwen3-4b-Z-Image-Turbo-AbliteratedV1) |
| Optimizer Steps | 7,500+ |
| Batch Size | 2 × 8 accumulation = 16 effective |
| Learning Rate | 1e-5 (cosine decay with 5% warmup) |
| Precision | BFloat16 |
| Sequence Length | 640 tokens |
| Total Training Time | ~90 hours |
---
## 📦 GGUF & Quantization
I provide a full suite of GGUF quantizations for use with `llama.cpp`, Ollama, and LM Studio:
| Quantization | Size | Notes |
|--------------|------|-------|
| F16 | 8.0 GB | Full precision, maximum quality |
| Q8_0 | 4.3 GB | Near-lossless, recommended for most users |
| Q6_K | 3.3 GB | Great balance of quality and size |
| Q5_K_M | 2.9 GB | Good quality, smaller footprint |
| Q5_K_S | 2.8 GB | Slightly smaller Q5 variant |
| Q4_K_M | 2.5 GB | Solid 4-bit, good for VRAM-limited setups |
| Q4_K_S | 2.4 GB | Smaller 4-bit variant |
| Q3_K_L | 2.2 GB | Lower quality 3-bit, for the desperate |
| Q3_K_M | 2.1 GB | Medium 3-bit |
| Q2_K | 1.7 GB | Emergency-only tier. But it exists! |
---
## 🎯 Quick Start
**With Ollama:**
```bash
ollama run BennyDaBall/Qwen3-4b-Z-Image-Engineer-V4
```
**With LM Studio:**
1. Download the GGUF of your choice
2. Load it in LM Studio
3. Use the ComfyUI node or chat directly
---
## ⚠️ Disclaimer
This model generates text for image prompts. While I have filtered the dataset to the best of my ability, users should exercise their own judgment. I am not responsible for the content you generate.
Also, if you use this to generate prompts for images that get you in trouble, that's a *you* problem. The model is just vibing.
---
## 🙏 Acknowledgements
- **Qwen Team** for the excellent base architecture
- **Tongyi-MAI** for Z-Image-Turbo
- The **open source AI community** for making this kind of work possible
- My electricity bill, which now classifies me as a small industrial facility
---
*Built with ❤️ and way too much GPU time by BennyDaBall*

View File

@@ -0,0 +1,28 @@
{
"</think>": 151668,
"</tool_call>": 151658,
"</tool_response>": 151666,
"<think>": 151667,
"<tool_call>": 151657,
"<tool_response>": 151665,
"<|box_end|>": 151649,
"<|box_start|>": 151648,
"<|endoftext|>": 151643,
"<|file_sep|>": 151664,
"<|fim_middle|>": 151660,
"<|fim_pad|>": 151662,
"<|fim_prefix|>": 151659,
"<|fim_suffix|>": 151661,
"<|im_end|>": 151645,
"<|im_start|>": 151644,
"<|image_pad|>": 151655,
"<|object_ref_end|>": 151647,
"<|object_ref_start|>": 151646,
"<|quad_end|>": 151651,
"<|quad_start|>": 151650,
"<|repo_name|>": 151663,
"<|video_pad|>": 151656,
"<|vision_end|>": 151653,
"<|vision_pad|>": 151654,
"<|vision_start|>": 151652
}

View File

@@ -0,0 +1,92 @@
{%- if enable_thinking is not defined %}
{%- set enable_thinking = false %}
{%- endif %}
{%- if tools %}
{{- '<|im_start|>system\n' }}
{%- if messages[0].role == 'system' %}
{{- messages[0].content + '\n\n' }}
{%- endif %}
{{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
{%- for tool in tools %}
{{- "\n" }}
{{- tool | tojson }}
{%- endfor %}
{{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
{%- if messages[0].role == 'system' %}
{{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
{%- set index = (messages|length - 1) - loop.index0 %}
{%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
{%- set ns.multi_step_tool = false %}
{%- set ns.last_query_index = index %}
{%- endif %}
{%- endfor %}
{%- for message in messages %}
{%- if message.content is string %}
{%- set content = message.content %}
{%- else %}
{%- set content = '' %}
{%- endif %}
{%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
{{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
{%- elif message.role == "assistant" %}
{%- set reasoning_content = '' %}
{%- if message.reasoning_content is string %}
{%- set reasoning_content = message.reasoning_content %}
{%- else %}
{%- if '</think>' in content %}
{%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
{%- set content = content.split('</think>')[-1].lstrip('\n') %}
{%- endif %}
{%- endif %}
{%- if loop.index0 > ns.last_query_index %}
{%- if loop.last or (not loop.last and reasoning_content) %}
{{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- if message.tool_calls %}
{%- for tool_call in message.tool_calls %}
{%- if (loop.first and content) or (not loop.first) %}
{{- '\n' }}
{%- endif %}
{%- if tool_call.function %}
{%- set tool_call = tool_call.function %}
{%- endif %}
{{- '<tool_call>\n{"name": "' }}
{{- tool_call.name }}
{{- '", "arguments": ' }}
{%- if tool_call.arguments is string %}
{{- tool_call.arguments }}
{%- else %}
{{- tool_call.arguments | tojson }}
{%- endif %}
{{- '}\n</tool_call>' }}
{%- endfor %}
{%- endif %}
{{- '<|im_end|>\n' }}
{%- elif message.role == "tool" %}
{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
{{- '<|im_start|>user' }}
{%- endif %}
{{- '\n<tool_response>\n' }}
{{- content }}
{{- '\n</tool_response>' }}
{%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- if enable_thinking is defined and enable_thinking is false %}
{{- '<think>\n\n</think>\n\n' }}
{%- endif %}
{%- endif %}

72
safetensors/config.json Normal file
View File

@@ -0,0 +1,72 @@
{
"architectures": [
"Qwen3ForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 151643,
"dtype": "bfloat16",
"eos_token_id": 151645,
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 2560,
"initializer_range": 0.02,
"intermediate_size": 9728,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 40960,
"max_window_layers": 36,
"model_type": "qwen3",
"num_attention_heads": 32,
"num_hidden_layers": 36,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-06,
"rope_parameters": {
"rope_theta": 1000000,
"rope_type": "default"
},
"rope_scaling": null,
"rope_theta": 10000.0,
"sliding_window": null,
"tie_word_embeddings": true,
"transformers_version": "4.57.6",
"use_cache": false,
"use_sliding_window": false,
"vocab_size": 151936
}

View File

@@ -0,0 +1,13 @@
{
"bos_token_id": 151643,
"do_sample": true,
"eos_token_id": [
151645,
151643
],
"pad_token_id": 151643,
"temperature": 0.6,
"top_k": 20,
"top_p": 0.95,
"transformers_version": "4.57.6"
}

151388
safetensors/merges.txt Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c26813b99799e07c2cd031e0787b43478fb9a7570594d9b3c9ec9543eb19bf40
size 4967215360

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2b003d30f5095929df9b67f3d663e066481bc386a9b57de5e5cae136bb74e096
size 3077766632

View File

@@ -0,0 +1,406 @@
{
"metadata": {
"total_parameters": 4022468096,
"total_size": 8044936192
},
"weight_map": {
"model.embed_tokens.weight": "model-00001-of-00002.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.0.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.0.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.1.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.1.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.1.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.10.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.10.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.10.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.11.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.11.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.11.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.12.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.12.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.12.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.13.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.13.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.13.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.14.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.14.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.14.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.15.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.15.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.15.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.16.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.16.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.16.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.17.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.17.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.17.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.18.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.18.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.18.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.19.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.19.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.19.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.2.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.2.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.2.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.20.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.20.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.20.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.21.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.21.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.21.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.22.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.22.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.22.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.23.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.23.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.23.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.24.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.24.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.24.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.25.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.25.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.25.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.26.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.26.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.26.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.27.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.27.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.27.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.28.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.28.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.28.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.29.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.29.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.29.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.3.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.3.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.30.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.30.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.30.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.31.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.31.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.31.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.32.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.32.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.32.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.32.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.32.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.32.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.32.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.32.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.32.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.32.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.32.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.33.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.33.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.33.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.33.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.33.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.33.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.33.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.33.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.33.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.33.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.33.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.34.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.34.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.34.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.34.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.34.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.34.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.34.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.34.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.34.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.34.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.34.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.35.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.35.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.35.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.35.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.35.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.35.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.35.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.35.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.35.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
"model.layers.35.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.35.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.4.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.4.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.5.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.6.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.7.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.8.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.9.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.norm.weight": "model-00002-of-00002.safetensors"
}
}

View File

@@ -0,0 +1,31 @@
{
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"eos_token": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:79cb3c783570f1b8fe73b9ed530ae50cae9ce4b6344c0b5edefc50478847eaa4
size 11422817

View File

@@ -0,0 +1,244 @@
{
"add_bos_token": false,
"add_prefix_space": false,
"added_tokens_decoder": {
"151643": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151644": {
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151645": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151646": {
"content": "<|object_ref_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151647": {
"content": "<|object_ref_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151648": {
"content": "<|box_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151649": {
"content": "<|box_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151650": {
"content": "<|quad_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151651": {
"content": "<|quad_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151652": {
"content": "<|vision_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151653": {
"content": "<|vision_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151654": {
"content": "<|vision_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151655": {
"content": "<|image_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151656": {
"content": "<|video_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151657": {
"content": "<tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151658": {
"content": "</tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151659": {
"content": "<|fim_prefix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151660": {
"content": "<|fim_middle|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151661": {
"content": "<|fim_suffix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151662": {
"content": "<|fim_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151663": {
"content": "<|repo_name|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151664": {
"content": "<|file_sep|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151665": {
"content": "<tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151666": {
"content": "</tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151667": {
"content": "<think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151668": {
"content": "</think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
}
},
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"bos_token": null,
"clean_up_tokenization_spaces": false,
"eos_token": "<|im_end|>",
"errors": "replace",
"extra_special_tokens": {},
"fix_mistral_regex": true,
"max_length": null,
"model_max_length": 131072,
"pad_to_multiple_of": null,
"pad_token": "<|endoftext|>",
"pad_token_type_id": 0,
"padding_side": "left",
"split_special_tokens": false,
"tokenizer_class": "Qwen2Tokenizer",
"unk_token": null
}

1
safetensors/vocab.json Normal file

File diff suppressed because one or more lines are too long