Initialize project; model provided by the ModelHub XC community

Model: srs6901/Vikras-MixP Source: Original Platform
2026-04-12 18:44:00 +08:00
commit 7a7da18513
86 changed files with 1091656 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,44 @@
 *.7z filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.bz2 filter=lfs diff=lfs merge=lfs -text
 *.ckpt filter=lfs diff=lfs merge=lfs -text
 *.ftz filter=lfs diff=lfs merge=lfs -text
 *.gz filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.joblib filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.model filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 *.npy filter=lfs diff=lfs merge=lfs -text
 *.npz filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.parquet filter=lfs diff=lfs merge=lfs -text
 *.pb filter=lfs diff=lfs merge=lfs -text
 *.pickle filter=lfs diff=lfs merge=lfs -text
 *.pkl filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
 *.rar filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.xz filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 Vikra-MixP_4.9b_S.gguf filter=lfs diff=lfs merge=lfs -text
 Vikra-MixedP-MXFP4.gguf filter=lfs diff=lfs merge=lfs -text
 Vikra-HCT-YeAM-LLaGemma-1B-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
 Vikra-HCT-YeAM-LLaGemma-1B/tokenizer.json filter=lfs diff=lfs merge=lfs -text
 Vikra-HCT-YeAM-PhiMma-1B-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
 Vikra-HCT-YeAM-PhiMma-1B/tokenizer.json filter=lfs diff=lfs merge=lfs -text
 Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B-Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
 Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B_Q8_K.gguf filter=lfs diff=lfs merge=lfs -text
 Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/tokenizer.json filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,463 @@
 ---
 library_name: transformers
 tags:
 - quantized
 - custom
 - nonlinear
 - mixed-precision
 - merged
 - MoK
 language:
 - ru
 - en
 metrics:
 - perplexity
 pipeline_tag: text-generation
 ---
 # Vikras — Experimental Family of Language Models (EN)
 ## Table of Contents
 - [Project overview](#project-overview)
 - [Current Release: HCT/YeAM](#current-release-hctyeam)
 - [HCT (architecture) / YeAM (implementation invariant)](#hct-architecture--yeam-implementation-invariant)
 - [Previous Release: Vikra MixedPrc (MixP_4.9b_S)](#previous-release-vikra-mixedprc-mixp_49b_s)
 - [MixP_4.9b_S: details](#mixp_49b_s-details)
 - [Roadmap](#roadmap)
 - [Usage](#usage)
 - [Closing](#closing)
 ---
 ## Project overview
 **Vikra** is an experimental family of language models exploring how:
 - representation geometry
 - quantization
 - hybrid merges
 affect transformer numerical dynamics.
 The **Vikras** project is not tied to a single base model or architecture.
 It is a family of models unified by one idea: numerical invariance of the experiment.
 - **Vikra_%** — a specific model
 - **Vikras** — the experimental family
 - **S / M / L** — aggressiveness and bit allocation variants
 - **MixP / FullP / HCT** — quantization / merge invariants
 ---
 ## Current Release: HCT/YeAM
 ### Releases
 - **Vikra-HCT-YeAM-PhiMma-1B**
  - HF: https://huggingface.co/srs6901/Vikras-MixP/tree/main/Vikra-HCT-YeAM-PhiMma-1B
  - GGUF: https://huggingface.co/srs6901/Vikras-MixP/blob/main/Vikra-HCT-YeAM-PhiMma-1B-Q8_0.gguf
 - **Vikra-HCT-YeAM-LLaGemma-1B**
  - HF: https://huggingface.co/srs6901/Vikras-MixP/tree/main/Vikra-HCT-YeAM-LLaGemma-1B
  - GGUF: https://huggingface.co/srs6901/Vikras-MixP/blob/main/Vikra-HCT-YeAM-LLaGemma-1B-Q8_0.gguf
 - **Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B**
  - HF: https://huggingface.co/srs6901/Vikras-MixP/tree/main/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B
  - GGUF: https://huggingface.co/srs6901/Vikras-MixP/blob/main/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B_Q8_K.gguf
 - **Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B**
  - HF: https://huggingface.co/srs6901/Vikras-MixP/tree/main/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B
  - GGUF: https://huggingface.co/srs6901/Vikras-MixP/blob/main/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B-Q6_K.gguf
 ---
 ## HCT (architecture) / YeAM (implementation invariant)
 **HCT** is an architectural invariant.
 The name stands for **Heterogeneous Compatibility Transfer**: a practical way to assemble compatible checkpoints and derived releases when moving across base models or model families.
 **YeAM (Yet Another Merge)** is an implementation invariant of HCT and a standalone HF→HF merge scheme: it is not “just another SLERP/DARE/TIES” and not a cosmetic variant of averaging.
 YeAM produces a standard HF output (safetensors + index) and supports:
 - direct weight-to-weight merging
 - targeted knowledge injection into a chosen model (knowledge distillation mode), aligned across multiple sources
 - an additional Attention-layer merge as a separate technique on top of YeAM
 - merging smaller models into larger ones (scale-up merge) while keeping a compatible HF format
 YeAM operates in a **real 4D formulation**: updates are encoded geometrically and aligned via ray intersections in parameter space. This produces controlled merges that preserve structure instead of collapsing into naive averaging.
 ---
 ## Previous Release: Vikra MixedPrc (MixP_4.9b_S)
 ### Short Description
 12.25B Mistral-based language model  
 Hybrid mixed-precision merged GGUF quantization  
 Experimental anisotropic quantization regime
 Full merge version (non-quantized):
 https://huggingface.co/srs6901/Vikras-MixP/tree/main/Vikra-FullP
 GGUF quant:
 https://huggingface.co/srs6901/Vikras-MixP/blob/main/Vikra-MixP_4.9b_S.gguf
 ---
 ## MixP_4.9b_S: details
 ### Architecture (for the MixP release)
 | Parameter | Value |
 |---|---|
 | Architecture | Mistral-based |
 | Params | ~12.25B |
 | Layers | 40 |
 | Hidden size | 5120 |
 | FFN size | 14336 |
 | Heads | 32 (8 KV heads, GQA) |
 | Context | 1,024,000 |
 | Vocab | 131,072 (Tekken BPE) |
 | RoPE theta | 1,000,000 |
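The ~12.25B figure can be cross-checked from the shapes in the table. The sketch below assumes a standard Mistral-style layout (q/k/v/o projections, a gated MLP with three matrices, two RMSNorm vectors per layer, untied embeddings) and takes `head_dim = 128` from `Vikra-FullP/config.json`; the function name is illustrative.

```python
def mistral_param_count(vocab, hidden, ffn, layers, heads, kv_heads, head_dim):
    """Count the weights of a Mistral-style decoder with untied embeddings."""
    attn = hidden * heads * head_dim           # q_proj
    attn += 2 * hidden * kv_heads * head_dim   # k_proj and v_proj (GQA)
    attn += heads * head_dim * hidden          # o_proj
    mlp = 3 * hidden * ffn                     # gate_proj, up_proj, down_proj
    norms = 2 * hidden                         # input + post-attention RMSNorm
    per_layer = attn + mlp + norms
    embeddings = 2 * vocab * hidden            # embed_tokens + lm_head
    return embeddings + layers * per_layer + hidden  # + final norm vector


total = mistral_param_count(131072, 5120, 14336, 40, 32, 8, 128)
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
print(f"{2 * total:,} bytes at 2 bytes per BF16 weight")
```

Under these assumptions the BF16 byte count equals the `total_size` value recorded in `Vikra-FullP/model.safetensors.index.json` (24495564800).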
 ### MixP_4.9b_S — Quantization Scheme
 A hybrid mixed-precision scheme with per-tensor type allocation.
 | Tensor group | Quant type | BPW |
 |---|---|---|
 | token_embd, output | BF16 | 16 |
 | attn_norm, ffn_norm, output_norm | F32 | 32 |
 | attn_q | Q4_K | 4.5 |
 | attn_k | Q5_K | 5.5 |
 | attn_v | Q3_K | 3.44 |
 | attn_output | Q4_K | 4.5 |
 | ffn_gate | Q3_K | 3.44 |
 | ffn_up | Q5_K | 5.5 |
 | ffn_down | Q5_K / Q6_K | 5.5–6.56 |
 Totals:
 - Quantized layers only: ~4.89 BPW
 - Full model average: ~6.11 BPW
 - File size: ~8.71 GB
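The totals can be reproduced from the table and the architecture shapes. The sketch below is a back-of-the-envelope check, not the author's tooling: it assumes `head_dim = 128` (from `Vikra-FullP/config.json`), uses the rounded BPW values from the table, and treats `ffn_down` as either all Q5_K or all Q6_K to bracket the quantized-only figure.

```python
# Shapes from the architecture table; head_dim = 128 is taken from config.json.
HIDDEN, FFN, LAYERS, VOCAB = 5120, 14336, 40, 131072
Q_DIM, KV_DIM = 32 * 128, 8 * 128

# Per-layer weight counts of the quantized tensor groups.
counts = {
    "attn_q": HIDDEN * Q_DIM,
    "attn_k": HIDDEN * KV_DIM,
    "attn_v": HIDDEN * KV_DIM,
    "attn_output": Q_DIM * HIDDEN,
    "ffn_gate": HIDDEN * FFN,
    "ffn_up": HIDDEN * FFN,
    "ffn_down": HIDDEN * FFN,
}
# BPW values copied from the table (ffn_down is handled separately).
bpw = {"attn_q": 4.5, "attn_k": 5.5, "attn_v": 3.44, "attn_output": 4.5,
       "ffn_gate": 3.44, "ffn_up": 5.5}


def quantized_bpw(ffn_down_bpw):
    """Weighted BPW over the quantized tensors for a given ffn_down type."""
    bits = sum(counts[k] * bpw[k] for k in bpw)
    bits += counts["ffn_down"] * ffn_down_bpw
    return bits / sum(counts.values())


low, high = quantized_bpw(5.5), quantized_bpw(6.56)

# Full-model average, using the stated ~4.89 BPW for the quantized tensors.
quant_params = LAYERS * sum(counts.values())
embed_params = 2 * VOCAB * HIDDEN            # token_embd + output, BF16
norm_params = LAYERS * 2 * HIDDEN + HIDDEN   # all norm vectors, F32
total_bits = quant_params * 4.89 + embed_params * 16 + norm_params * 32
avg_bpw = total_bits / (quant_params + embed_params + norm_params)
size_gib = total_bits / 8 / 2**30

print(f"quantized-only BPW between {low:.2f} and {high:.2f}")
print(f"full-model average {avg_bpw:.2f} BPW, about {size_gib:.2f} GiB of weights")
```

Under these assumptions the stated ~4.89 BPW falls between the two bounds, which would be consistent with `ffn_down` using Q6_K in roughly half of the layers. The ~8.71 figure is reproduced when the size is counted in GiB (2^30 bytes); GGUF metadata and the tokenizer add a small overhead that is not modeled here.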
 ### Core idea of MixP
 MixP is not “compress everything equally”.
 It is **anisotropic quantization of information channels**:
 - Q/K remain in higher precision
 - V and gate are intentionally quantized down to Q3_K
 - norms and the output layer remain in higher precision
 This redistribution changes the numerical dynamics of the model:
 - increased structural sparsification
 - shifts in hidden norm distribution
 - changes in logit entropy
 - regime sensitivity
 This is not a new architecture.
 It is a modification of the numerical geometry of an existing one.
 ### Observed effects
 - preservation of top-1 predictions on simple tasks
 - increased entropy without collapse of maximum probability
 - expansion of hidden norms on complex tasks
 - regime bifurcation: simple tasks are approximately invariant, complex tasks are sensitive
 These effects are interpreted as a geometric shift of representations rather than a universal quality improvement.
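The first two effects are statements about the next-token distribution. As a toy illustration only (the logit vectors below are made up and are not measurements from Vikra), entropy can rise while the top-1 token stays the same:

```python
import math


def softmax(logits):
    m = max(logits)                        # subtract the max for stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def top1(probs):
    """Index of the most probable token."""
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical logits for one context, before and after quantization.
reference = softmax([4.0, 1.0, 0.0, -1.0])
quantized = softmax([4.0, 2.0, 1.5, 1.0])

print("top-1:", top1(reference), top1(quantized))
print("entropy:", entropy(reference), entropy(quantized))
```

Comparing these two quantities per token between the baseline and the quantized model is one simple way to check the claims on real data.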
 ### math_subattention (working hypothesis)
 In experiments, we observe an effect informally referred to as:
 “math_subattention”
 This describes:
 - reduced contribution of small V components
 - dominance of stronger residual directions
 - increased inertia from the previous token's state
 - reduced frequency of small logit switching
 This is not an architectural claim.
 It is a working hypothesis about the dynamics that emerge under Q3_K symmetric quantization.
 The term is used descriptively.
 ### Perplexity
 Measured on wikitext-2-raw-test (full):
 | Model | Precision | PPL |
 |---|---|---|
 | Vikra MixP_4.9b_S | 6.11 BPW | 5.50 ± 0.03 |
 | Baseline BF16 | Full | 6.02 ± 0.03 |
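For reference, perplexity is the exponential of the mean negative log-likelihood per token. The sketch below shows the metric itself, not the evaluation pipeline used for the table; the log-probabilities are toy values.

```python
import math


def perplexity(token_logprobs):
    """exp of the average negative natural-log probability per token."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)


# Toy example: a model that assigns probability 0.5 to every token,
# which corresponds to a perplexity of about 2.
toy = [math.log(0.5)] * 4
print(perplexity(toy))
```

Lower values mean the model assigns higher probability to the reference text.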
 ---
 ## Roadmap
 Planned subfamilies:
 - MixP — Mixed Precision
 - FullP — Full Precision variants
 - HCT — multi-merge experiments
 - S / M / L — different bit allocation regimes
 All models in the family are called **Vikra**.
 The repository is **Vikras**.
 ---
 ## Usage
 ```bash
 llama-cli -m Vikra-MixP_4.9b_S.gguf -ngl 99 -c 4096
 ```
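The `-c 4096` flag caps the context far below the 1,024,000 positions listed in the config. One practical reason is KV-cache memory, which grows linearly with context length. A rough estimate, assuming a 16-bit cache (2 bytes per element) and `head_dim = 128` from `Vikra-FullP/config.json`:

```python
def kv_cache_bytes(ctx, layers=40, kv_heads=8, head_dim=128, bytes_per_elem=2):
    """Keys plus values for every layer and KV head, for one sequence."""
    return 2 * layers * kv_heads * head_dim * bytes_per_elem * ctx


for ctx in (4096, 1_024_000):
    print(f"ctx={ctx:>9,}: {kv_cache_bytes(ctx) / 2**30:.3f} GiB")
```

The estimate covers the cache only; model weights and compute buffers come on top of it.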
 ```bash
 llama-server -m Vikra-MixP_4.9b_S.gguf -ngl 99 -c 4096
 ```
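Once `llama-server` is running, it can be queried over HTTP. The sketch below assumes the server listens on its default address `http://127.0.0.1:8080` and exposes the OpenAI-compatible `/v1/chat/completions` route; the helper names and the prompt are illustrative.

```python
import json
import urllib.request

SERVER = "http://127.0.0.1:8080"  # assumed default llama-server address


def build_chat_request(prompt, max_tokens=128):
    """Build a POST request for the OpenAI-compatible chat endpoint."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{SERVER}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the request and return the text of the first completion."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Requires a running server:
# print(ask("Summarize what mixed-precision quantization is."))
```

If the server was started with a different host or port, adjust `SERVER` accordingly.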
 ---
 ## Closing
 Vikras is a research project.
 It explores how transformer behavior changes when we:
 - compress
 - merge
 - alter numerical geometry
 If you are interested in hidden space dynamics / regime sensitivity / anisotropic quantization — welcome.
--- a/Vikra-FullP/config.json
+++ b/Vikra-FullP/config.json
@@ -0,0 +1,27 @@
 {
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "dtype": "bfloat16",
  "eos_token_id": 2,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 5120,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 1024000,
  "model_type": "mistral",
  "num_attention_heads": 32,
  "num_hidden_layers": 40,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-05,
  "rope_theta": 1000000.0,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "transformers_version": "4.57.3",
  "use_cache": true,
  "vocab_size": 131072,
  "_name_or_path": "Vikra MixedPrc"
 }
--- a/Vikra-FullP/model-00001-of-00005.safetensors
+++ b/Vikra-FullP/model-00001-of-00005.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:bad11922e79fff3687ad25e68c8746fa60e12c7b228d6810664bd51e4e732109
 size 4865489336
--- a/Vikra-FullP/model-00002-of-00005.safetensors
+++ b/Vikra-FullP/model-00002-of-00005.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:8db601b1ed1d171d4bbccf02a451a85bbd46e2b4b68240cb6df58ecbf9376c0a
 size 4907529456
--- a/Vikra-FullP/model-00003-of-00005.safetensors
+++ b/Vikra-FullP/model-00003-of-00005.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:1033a9661f0c78949f17034f7c3d526438b5eabd53fd6bc66c60e6d3be6b33e6
 size 4907529464
--- a/Vikra-FullP/model-00004-of-00005.safetensors
+++ b/Vikra-FullP/model-00004-of-00005.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:fb676dad5f501c2ff1bb9d2c953426ce6c63d4f68b5778ef6363442b24b0c044
 size 4907529456
--- a/Vikra-FullP/model-00005-of-00005.safetensors
+++ b/Vikra-FullP/model-00005-of-00005.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:7972467102cca43624144be259da7ed0c470261f8d59d40783166b5180c77f06
 size 4907529392
--- a/Vikra-FullP/model.safetensors.index.json
+++ b/Vikra-FullP/model.safetensors.index.json
@@ -0,0 +1,371 @@
 {
  "metadata": {
    "total_size": 24495564800,
    "mergekit_version": "0.1.4"
  },
  "weight_map": {
    "lm_head.weight": "model-00001-of-00005.safetensors",
    "model.embed_tokens.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.input_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.10.input_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.10.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.10.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.10.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.10.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.10.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.10.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.10.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.10.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.11.input_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.11.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.11.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.11.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.11.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.11.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.11.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.11.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.11.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.12.input_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.12.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.17.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.17.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.17.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.17.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.17.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.17.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.17.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.17.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.17.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.18.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.18.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.18.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.18.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.18.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.18.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.18.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.18.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.18.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.19.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.19.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.19.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.19.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.19.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.19.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.19.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.19.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.19.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.2.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.2.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.2.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.2.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.2.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.2.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.2.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.2.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.2.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.20.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.20.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.20.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.20.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.20.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.20.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.20.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.20.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.20.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.27.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.27.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.27.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.27.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.27.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.27.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.27.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.27.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.27.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.28.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.28.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.28.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.28.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.28.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.28.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.28.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.28.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.28.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.29.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.29.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.3.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.3.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.3.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.3.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.3.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.3.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.3.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.3.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.3.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.36.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.36.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.36.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.36.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.36.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.36.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.36.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.36.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.36.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.37.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.37.mlp.down_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.37.mlp.gate_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.37.mlp.up_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.37.post_attention_layernorm.weight": "model-00005-of-00005.safetensors",
    "model.layers.37.self_attn.k_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.37.self_attn.o_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.37.self_attn.q_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.37.self_attn.v_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.38.input_layernorm.weight": "model-00005-of-00005.safetensors",
    "model.layers.38.mlp.down_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.38.mlp.gate_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.38.mlp.up_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.38.post_attention_layernorm.weight": "model-00005-of-00005.safetensors",
    "model.layers.38.self_attn.k_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.38.self_attn.o_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.38.self_attn.q_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.38.self_attn.v_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.39.input_layernorm.weight": "model-00005-of-00005.safetensors",
    "model.layers.39.mlp.down_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.39.mlp.gate_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.39.mlp.up_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.39.post_attention_layernorm.weight": "model-00005-of-00005.safetensors",
    "model.layers.39.self_attn.k_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.39.self_attn.o_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.39.self_attn.q_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.39.self_attn.v_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.4.input_layernorm.weight": "model-00005-of-00005.safetensors",
    "model.layers.4.mlp.down_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.4.mlp.gate_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.4.mlp.up_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.4.post_attention_layernorm.weight": "model-00005-of-00005.safetensors",
    "model.layers.4.self_attn.k_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.4.self_attn.o_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.4.self_attn.q_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.4.self_attn.v_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.5.input_layernorm.weight": "model-00005-of-00005.safetensors",
    "model.layers.5.mlp.down_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.5.mlp.gate_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.5.mlp.up_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.5.post_attention_layernorm.weight": "model-00005-of-00005.safetensors",
    "model.layers.5.self_attn.k_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.5.self_attn.o_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.5.self_attn.q_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.5.self_attn.v_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.6.input_layernorm.weight": "model-00005-of-00005.safetensors",
    "model.layers.6.mlp.down_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.6.mlp.gate_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.6.mlp.up_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.6.post_attention_layernorm.weight": "model-00005-of-00005.safetensors",
    "model.layers.6.self_attn.k_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.6.self_attn.o_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.6.self_attn.q_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.6.self_attn.v_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.7.input_layernorm.weight": "model-00005-of-00005.safetensors",
    "model.layers.7.mlp.down_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.7.mlp.gate_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.7.mlp.up_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.7.post_attention_layernorm.weight": "model-00005-of-00005.safetensors",
    "model.layers.7.self_attn.k_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.7.self_attn.o_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.7.self_attn.q_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.7.self_attn.v_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.8.input_layernorm.weight": "model-00005-of-00005.safetensors",
    "model.layers.8.mlp.down_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.8.mlp.gate_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.8.mlp.up_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.8.post_attention_layernorm.weight": "model-00005-of-00005.safetensors",
    "model.layers.8.self_attn.k_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.8.self_attn.o_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.8.self_attn.q_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.8.self_attn.v_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.9.input_layernorm.weight": "model-00005-of-00005.safetensors",
    "model.layers.9.mlp.down_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.9.mlp.gate_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.9.mlp.up_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.9.post_attention_layernorm.weight": "model-00005-of-00005.safetensors",
    "model.layers.9.self_attn.k_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.9.self_attn.o_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.9.self_attn.q_proj.weight": "model-00005-of-00005.safetensors",
    "model.layers.9.self_attn.v_proj.weight": "model-00005-of-00005.safetensors",
    "model.norm.weight": "model-00005-of-00005.safetensors"
  }
 }
--- a/Vikra-FullP/special_tokens_map.json
+++ b/Vikra-FullP/special_tokens_map.json
@@ -0,0 +1,23 @@
 {
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
 }
--- a/Vikra-FullP/tokenizer.json
+++ b/Vikra-FullP/tokenizer.json
--- a/Vikra-FullP/tokenizer_config.json
+++ b/Vikra-FullP/tokenizer_config.json
--- a/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/LICENSE
+++ b/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/LICENSE
@@ -0,0 +1,202 @@
                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/
   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
   1. Definitions.
      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.
      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.
      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.
      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.
      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.
      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.
      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).
      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.
      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."
      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.
   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.
   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.
   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:
      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and
      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and
      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and
      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.
      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.
   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.
   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.
   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.
   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.
   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.
   END OF TERMS AND CONDITIONS
   APPENDIX: How to apply the Apache License to your work.
      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!)  The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.
   Copyright 2024 Alibaba Cloud
   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at
       http://www.apache.org/licenses/LICENSE-2.0
   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
--- a/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/README.md
+++ b/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/README.md
@@ -0,0 +1,69 @@
 ---
 license: apache-2.0
 library_name: transformers
 pipeline_tag: text-generation
 base_model: Qwen/Qwen3-1.7B-Base
 ---
 # Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B
 HCT architecture release. YeAM (Yet Another Merge) implementation invariant.
 ## What it is
 A compact 1.7B-class checkpoint produced via HCT-compatible merging.
 The checkpoint is published in standard Hugging Face format (safetensors + index).
 ## YeAM summary
 YeAM performs a controlled merge in a real 4D geometric formulation with ray-intersection alignment in parameter space.
 It also supports targeted knowledge injection (distillation-style) into a chosen model while remaining HF-compatible.
 ## Notes for this checkpoint
 Compared to other YeAM/HCT merges, this checkpoint additionally applies a targeted merge on Attention projection weights.
 Observed behavior tends to include characteristic Llama-like traits:
 - More Llama-style conversation patterns.
 - More consistent formatting.
 - Stronger RLHF-like refusal/priority behaviors.
 - Reasoning / chain-of-thought style output in the model's full native format is expected to work.
 Most Qwen3 behavior should, in theory, remain. However, because knowledge and logic are injected from the Llama side, some Qwen-specific properties may be partially degraded or inconsistent.
 Repetition / looping:
 - There is no universally perfect sampling configuration.
 - At higher temperature, without a repetition-style penalty, the model may enter repetition loops.
 - Pay special attention to repetition-related controls (e.g. repetition penalty / presence penalty) if you observe cycling.
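To make the repetition control above concrete, here is a minimal pure-Python sketch of how a repetition penalty is commonly applied to next-token scores. The function name `apply_repetition_penalty`, the toy logits, and the penalty value are illustrative assumptions and are not taken from this repository.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.1):
    """Return a copy of `logits` with already-generated token ids penalized.

    Positive scores are divided by `penalty` and negative scores are
    multiplied by it, so a penalty > 1.0 always makes a repeat less likely.
    """
    if penalty <= 0:
        raise ValueError("penalty must be positive")
    adjusted = list(logits)  # copy so the caller's list is left untouched
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


# Toy vocabulary of four tokens (hypothetical values); ids 0 and 2 were already generated.
logits = [2.0, 1.0, -1.0, 0.5]
penalized = apply_repetition_penalty(logits, generated_ids=[0, 2, 0], penalty=2.0)
print(penalized)  # → [1.0, 1.0, -2.0, 0.5]
```

In Transformers the same control is exposed as the `repetition_penalty` argument of `generate`, for example `model.generate(**inputs, max_new_tokens=128, repetition_penalty=1.1)`. The value 1.1 is only an illustrative starting point, not a tuned recommendation for this checkpoint.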
 Do not ask the model who created it.
 In this specific merge, it may oscillate between incompatible parents (Alibaba vs. Meta), fail to settle, and get stuck in a sad loop.
 ## Usage (Transformers)
 ```python
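 from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
 # Local directory containing the checkpoint (config, tokenizer, safetensors shards).
 m = "/path/to/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B"
 tok = AutoTokenizer.from_pretrained(m, use_fast=True)
 # Load the weights in bfloat16 on the GPU and switch to inference mode.
 model = AutoModelForCausalLM.from_pretrained(
     m,
     torch_dtype=torch.bfloat16,
     device_map="cuda",
 ).eval()
 # Tokenize the prompt and move the tensors to the model's device.
 inputs = tok("Hello!", return_tensors="pt").to(model.device)
 # Sampling settings come from generation_config.json unless overridden here.
 out = model.generate(**inputs, max_new_tokens=128)
 print(tok.decode(out[0], skip_special_tokens=True))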
 ```
 ## GGUF
 Convert and quantize with llama.cpp (example):
 ```bash
 python3 /path/to/llama.cpp/convert_hf_to_gguf.py /path/to/model --outtype f16 --outfile model.f16.gguf
 /path/to/llama.cpp/build/bin/llama-quantize model.f16.gguf model.Q8_0.gguf Q8_0
 ```
--- a/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/config.json
+++ b/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/config.json
@@ -0,0 +1,30 @@
 {
  "architectures": [
    "Qwen3ForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 2048,
  "initializer_range": 0.02,
  "intermediate_size": 6144,
  "max_position_embeddings": 40960,
  "max_window_layers": 28,
  "model_type": "qwen3",
  "num_attention_heads": 16,
  "num_hidden_layers": 28,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 1000000,
  "sliding_window": null,
  "tie_word_embeddings": true,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.51.0",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151936
 }
--- a/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/generation_config.json
+++ b/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/generation_config.json
@@ -0,0 +1,13 @@
 {
    "bos_token_id": 151643,
    "do_sample": true,
    "eos_token_id": [
        151645,
        151643
    ],
    "pad_token_id": 151643,
    "temperature": 0.6,
    "top_k": 20,
    "top_p": 0.95,
    "transformers_version": "4.51.0"
 }
--- a/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/merges.txt
+++ b/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/merges.txt
--- a/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/model-00001-of-00004.safetensors
+++ b/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/model-00001-of-00004.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:c76fa123a80c02c23b893f5f70b5261371826def47d7141b2c2375eb0923a92d
 size 1244659944
--- a/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/model-00002-of-00004.safetensors
+++ b/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/model-00002-of-00004.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:44210ebae89ff50ed223b66c48fb5bd3da690983ce3b10cc76289f89b9c2085b
 size 1082234512
--- a/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/model-00003-of-00004.safetensors
+++ b/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/model-00003-of-00004.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:0e54f9d3e91602558c7b53caeb8061dbcb726959a5c9ee30f3e27e99716a50b0
 size 1082239816
--- a/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/model-00004-of-00004.safetensors
+++ b/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/model-00004-of-00004.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:b96e907df89c1934843aeb61b26a84b805b339ab04dfb831ffef903b6112b817
 size 654380800
--- a/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/model.safetensors.index.json
+++ b/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/model.safetensors.index.json
--- a/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/tokenizer.json
+++ b/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/tokenizer.json
--- a/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/tokenizer_config.json
+++ b/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/tokenizer_config.json
@@ -0,0 +1,239 @@
 {
  "add_bos_token": false,
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "151643": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151644": {
      "content": "<|im_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151645": {
      "content": "<|im_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151646": {
      "content": "<|object_ref_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151647": {
      "content": "<|object_ref_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151648": {
      "content": "<|box_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151649": {
      "content": "<|box_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151650": {
      "content": "<|quad_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151651": {
      "content": "<|quad_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151652": {
      "content": "<|vision_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151653": {
      "content": "<|vision_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151654": {
      "content": "<|vision_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151655": {
      "content": "<|image_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151656": {
      "content": "<|video_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151657": {
      "content": "<tool_call>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151658": {
      "content": "</tool_call>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151659": {
      "content": "<|fim_prefix|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151660": {
      "content": "<|fim_middle|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151661": {
      "content": "<|fim_suffix|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151662": {
      "content": "<|fim_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151663": {
      "content": "<|repo_name|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151664": {
      "content": "<|file_sep|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151665": {
      "content": "<tool_response>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151666": {
      "content": "</tool_response>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151667": {
      "content": "<think>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151668": {
      "content": "</think>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    }
  },
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>",
    "<|object_ref_start|>",
    "<|object_ref_end|>",
    "<|box_start|>",
    "<|box_end|>",
    "<|quad_start|>",
    "<|quad_end|>",
    "<|vision_start|>",
    "<|vision_end|>",
    "<|vision_pad|>",
    "<|image_pad|>",
    "<|video_pad|>"
  ],
  "bos_token": null,
  "chat_template": "{%- if tools %}\n    {{- '<|im_start|>system\\n' }}\n    {%- if messages[0].role == 'system' %}\n        {{- messages[0].content + '\\n\\n' }}\n    {%- endif %}\n    {{- \"# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n    {%- for tool in tools %}\n        {{- \"\\n\" }}\n        {{- tool | tojson }}\n    {%- endfor %}\n    {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n    {%- if messages[0].role == 'system' %}\n        {{- '<|im_start|>system\\n' + messages[0].content + '<|im_end|>\\n' }}\n    {%- endif %}\n{%- endif %}\n{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}\n{%- for message in messages[::-1] %}\n    {%- set index = (messages|length - 1) - loop.index0 %}\n    {%- if ns.multi_step_tool and message.role == \"user\" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}\n        {%- set ns.multi_step_tool = false %}\n        {%- set ns.last_query_index = index %}\n    {%- endif %}\n{%- endfor %}\n{%- for message in messages %}\n    {%- if message.content is string %}\n        {%- set content = message.content %}\n    {%- else %}\n        {%- set content = '' %}\n    {%- endif %}\n    {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) %}\n        {{- '<|im_start|>' + message.role + '\\n' + content + '<|im_end|>' + '\\n' }}\n    {%- elif message.role == \"assistant\" %}\n        {%- set reasoning_content = '' %}\n        {%- if message.reasoning_content is string %}\n            {%- set reasoning_content = message.reasoning_content %}\n  
      {%- else %}\n            {%- if '</think>' in content %}\n                {%- set reasoning_content = content.split('</think>')[0].rstrip('\\n').split('<think>')[-1].lstrip('\\n') %}\n                {%- set content = content.split('</think>')[-1].lstrip('\\n') %}\n            {%- endif %}\n        {%- endif %}\n        {%- if loop.index0 > ns.last_query_index %}\n            {%- if loop.last or (not loop.last and reasoning_content) %}\n                {{- '<|im_start|>' + message.role + '\\n<think>\\n' + reasoning_content.strip('\\n') + '\\n</think>\\n\\n' + content.lstrip('\\n') }}\n            {%- else %}\n                {{- '<|im_start|>' + message.role + '\\n' + content }}\n            {%- endif %}\n        {%- else %}\n            {{- '<|im_start|>' + message.role + '\\n' + content }}\n        {%- endif %}\n        {%- if message.tool_calls %}\n            {%- for tool_call in message.tool_calls %}\n                {%- if (loop.first and content) or (not loop.first) %}\n                    {{- '\\n' }}\n                {%- endif %}\n                {%- if tool_call.function %}\n                    {%- set tool_call = tool_call.function %}\n                {%- endif %}\n                {{- '<tool_call>\\n{\"name\": \"' }}\n                {{- tool_call.name }}\n                {{- '\", \"arguments\": ' }}\n                {%- if tool_call.arguments is string %}\n                    {{- tool_call.arguments }}\n                {%- else %}\n                    {{- tool_call.arguments | tojson }}\n                {%- endif %}\n                {{- '}\\n</tool_call>' }}\n            {%- endfor %}\n        {%- endif %}\n        {{- '<|im_end|>\\n' }}\n    {%- elif message.role == \"tool\" %}\n        {%- if loop.first or (messages[loop.index0 - 1].role != \"tool\") %}\n            {{- '<|im_start|>user' }}\n        {%- endif %}\n        {{- '\\n<tool_response>\\n' }}\n        {{- content }}\n        {{- '\\n</tool_response>' }}\n        {%- if loop.last or 
(messages[loop.index0 + 1].role != \"tool\") %}\n            {{- '<|im_end|>\\n' }}\n        {%- endif %}\n    {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n    {{- '<|im_start|>assistant\\n' }}\n    {%- if enable_thinking is defined and enable_thinking is false %}\n        {{- '<think>\\n\\n</think>\\n\\n' }}\n    {%- endif %}\n{%- endif %}",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|im_end|>",
  "errors": "replace",
  "model_max_length": 131072,
  "pad_token": "<|endoftext|>",
  "split_special_tokens": false,
  "tokenizer_class": "Qwen2Tokenizer",
  "unk_token": null
 }
--- a/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/vocab.json
+++ b/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B/vocab.json
--- a/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B_Q8_K.gguf
+++ b/Vikra-HCT-YeAM-3_3.2_QweLLa-1.7B_Q8_K.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:e5c3eab905f68c6f9001d3af2c17c538a1583e6804106d048e0a2b0ca881cd7c
 size 2165039488
--- a/Vikra-HCT-YeAM-LLaGemma-1B-Q8_0.gguf
+++ b/Vikra-HCT-YeAM-LLaGemma-1B-Q8_0.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:106774be0e59d7939904784f171932c95b759fdbe979948a6df143c966903ea3
 size 1069306496
--- a/Vikra-HCT-YeAM-LLaGemma-1B/README.md
+++ b/Vikra-HCT-YeAM-LLaGemma-1B/README.md
@@ -0,0 +1,50 @@
 ---
 license: gemma
 library_name: transformers
 pipeline_tag: text-generation
 base_model: google/gemma-3-1b-pt
 ---
 # Vikra-HCT-YeAM-LLaGemma-1B
 Llama-3.2-1B-Instruct + Gemma-3-1b-pt
 HCT architecture release. YeAM (Yet Another Merge) implementation invariant.
 ## What it is
 A compact 1B-class model produced via HCT-compatible merging.
 The checkpoint is published in standard Hugging Face format (safetensors + index).
 ## YeAM summary
 YeAM performs a controlled merge in a real 4D geometric formulation with ray-intersection alignment in parameter space.
 It also supports targeted knowledge injection (distillation-style) into a chosen model while remaining HF-compatible.
 ## Usage (Transformers)
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
 m = "/path/to/Vikra-HCT-YeAM-LLaGemma-1B"
 tok = AutoTokenizer.from_pretrained(m, use_fast=False)
 model = AutoModelForCausalLM.from_pretrained(
    m,
    torch_dtype=torch.bfloat16,
    device_map="cuda",
 ).eval()
 inputs = tok("Hello!", return_tensors="pt").to(model.device)
 out = model.generate(**inputs, max_new_tokens=128)
 print(tok.decode(out[0], skip_special_tokens=True))
 ```
 ## GGUF
 Convert and quantize with llama.cpp (example):
 ```bash
 python3 /path/to/llama.cpp/convert_hf_to_gguf.py /path/to/model --outtype bf16 --outfile model.bf16.gguf
 /path/to/llama.cpp/build/bin/llama-quantize model.bf16.gguf model.Q6_K.gguf Q6_K
 ```
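If the two steps above need to be scripted, one possible approach is to build the argument lists in Python and hand them to `subprocess`. This is a minimal sketch that only reuses the flags shown in the commands above; the helper names `gguf_commands` and `run_all` and the `out_stem` parameter are hypothetical, and the paths are the same placeholders as in the shell example.

```python
import subprocess
from pathlib import PurePosixPath


def gguf_commands(llama_cpp_dir, model_dir, out_stem="model",
                  outtype="bf16", quant="Q6_K"):
    """Build argv lists for the convert and quantize steps (POSIX paths assumed)."""
    root = PurePosixPath(llama_cpp_dir)
    full = f"{out_stem}.{outtype}.gguf"       # unquantized intermediate file
    quantized = f"{out_stem}.{quant}.gguf"    # final quantized file
    convert = [
        "python3", str(root / "convert_hf_to_gguf.py"), str(model_dir),
        "--outtype", outtype, "--outfile", full,
    ]
    quantize = [str(root / "build" / "bin" / "llama-quantize"),
                full, quantized, quant]
    return convert, quantize


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Print the commands instead of running them; call run_all(...) to execute.
for argv in gguf_commands("/path/to/llama.cpp", "/path/to/model"):
    print(" ".join(argv))
```

Passing lists rather than a single shell string avoids quoting problems when the model directory contains spaces.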
--- a/Vikra-HCT-YeAM-LLaGemma-1B/added_tokens.json
+++ b/Vikra-HCT-YeAM-LLaGemma-1B/added_tokens.json
@@ -0,0 +1,3 @@
 {
  "<image_soft_token>": 262144
 }
--- a/Vikra-HCT-YeAM-LLaGemma-1B/config.json
+++ b/Vikra-HCT-YeAM-LLaGemma-1B/config.json
@@ -0,0 +1,37 @@
 {
  "architectures": [
    "Gemma3ForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "attn_logit_softcapping": null,
  "bos_token_id": 2,
  "cache_implementation": "hybrid",
  "eos_token_id": [
    1,
    106
  ],
  "final_logit_softcapping": null,
  "head_dim": 256,
  "hidden_activation": "gelu_pytorch_tanh",
  "hidden_size": 1152,
  "initializer_range": 0.02,
  "intermediate_size": 6912,
  "max_position_embeddings": 32768,
  "model_type": "gemma3_text",
  "num_attention_heads": 4,
  "num_hidden_layers": 26,
  "num_key_value_heads": 1,
  "pad_token_id": 0,
  "query_pre_attn_scalar": 256,
  "rms_norm_eps": 1e-06,
  "rope_local_base_freq": 10000,
  "rope_scaling": null,
  "rope_theta": 1000000,
  "sliding_window": 512,
  "sliding_window_pattern": 6,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.50.0.dev0",
  "use_cache": true,
  "vocab_size": 262144
 }
--- a/Vikra-HCT-YeAM-LLaGemma-1B/generation_config.json
+++ b/Vikra-HCT-YeAM-LLaGemma-1B/generation_config.json
@@ -0,0 +1,13 @@
 {
  "bos_token_id": 2,
  "cache_implementation": "hybrid",
  "do_sample": true,
  "eos_token_id": [
    1,
    106
  ],
  "pad_token_id": 0,
  "top_k": 64,
  "top_p": 0.95,
  "transformers_version": "4.50.0.dev0"
 }
--- a/Vikra-HCT-YeAM-LLaGemma-1B/model-00001-of-00002.safetensors
+++ b/Vikra-HCT-YeAM-LLaGemma-1B/model-00001-of-00002.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:0d327513594d6e96139c1660d5389b20cc9de3797796b69ca9cb8d21ad420c31
 size 1081244120
--- a/Vikra-HCT-YeAM-LLaGemma-1B/model-00002-of-00002.safetensors
+++ b/Vikra-HCT-YeAM-LLaGemma-1B/model-00002-of-00002.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:ff4980829a32e0feccf675d107ffc2ebc495139577c3ce7460ffc3a0fe8ff76f
 size 918566568
--- a/Vikra-HCT-YeAM-LLaGemma-1B/model.safetensors.index.json
+++ b/Vikra-HCT-YeAM-LLaGemma-1B/model.safetensors.index.json
--- a/Vikra-HCT-YeAM-LLaGemma-1B/special_tokens_map.json
+++ b/Vikra-HCT-YeAM-LLaGemma-1B/special_tokens_map.json
@@ -0,0 +1,33 @@
 {
  "boi_token": "<start_of_image>",
  "bos_token": {
    "content": "<bos>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eoi_token": "<end_of_image>",
  "eos_token": {
    "content": "<eos>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "image_token": "<image_soft_token>",
  "pad_token": {
    "content": "<pad>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
 }
--- a/Vikra-HCT-YeAM-LLaGemma-1B/tokenizer.json
+++ b/Vikra-HCT-YeAM-LLaGemma-1B/tokenizer.json
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:4667f2089529e8e7657cfb6d1c19910ae71ff5f28aa7ab2ff2763330affad795
 size 33384568
--- a/Vikra-HCT-YeAM-LLaGemma-1B/tokenizer.model
+++ b/Vikra-HCT-YeAM-LLaGemma-1B/tokenizer.model
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:1299c11d7cf632ef3b4e11937501358ada021bbdf7c47638d13c0ee982f2e79c
 size 4689074
--- a/Vikra-HCT-YeAM-LLaGemma-1B/tokenizer_config.json
+++ b/Vikra-HCT-YeAM-LLaGemma-1B/tokenizer_config.json
--- a/Vikra-HCT-YeAM-PhiMma-1B-Q8_0.gguf
+++ b/Vikra-HCT-YeAM-PhiMma-1B-Q8_0.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:2cc9a6638b222213cc705401efb8d4ae2aac38bb0c25f4c3315d64be3d6587aa
 size 1069306496
--- a/Vikra-HCT-YeAM-PhiMma-1B/README.md
+++ b/Vikra-HCT-YeAM-PhiMma-1B/README.md
@@ -0,0 +1,49 @@
 ---
 license: gemma
 library_name: transformers
 pipeline_tag: text-generation
 base_model: google/gemma-3-1b-pt
 ---
 # Vikra-HCT-YeAM-PhiMma-1B
 Gemma-3-1b-pt + Microsoft_phi-2
 HCT architecture release. YeAM (Yet Another Merge) implementation invariant.
 ## What it is
 A compact 1B-class model produced via HCT-compatible merging.
 The checkpoint is published in standard Hugging Face format (safetensors + index).
 ## YeAM summary
 YeAM performs a controlled merge in a real 4D geometric formulation with ray-intersection alignment in parameter space.
 It also supports targeted knowledge injection (distillation-style) into a chosen model while remaining HF-compatible.
 ## Usage (Transformers)
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
 m = "/path/to/Vikra-HCT-YeAM-PhiMma-1B"
 tok = AutoTokenizer.from_pretrained(m, use_fast=False)
 model = AutoModelForCausalLM.from_pretrained(
    m,
    torch_dtype=torch.bfloat16,
    device_map="cuda",
 ).eval()
 inputs = tok("Hello!", return_tensors="pt").to(model.device)
 out = model.generate(**inputs, max_new_tokens=128)
 print(tok.decode(out[0], skip_special_tokens=True))
 ```
 ## GGUF
 Convert and quantize with llama.cpp (example):
 ```bash
 python3 /path/to/llama.cpp/convert_hf_to_gguf.py /path/to/model --outtype bf16 --outfile model.bf16.gguf
 /path/to/llama.cpp/build/bin/llama-quantize model.bf16.gguf model.Q6_K.gguf Q6_K
 ```
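To get a rough idea of how large the output of the steps above will be, the file size can be estimated from the parameter count and the block layout of the quantization type. The sketch below assumes the usual llama.cpp block sizes (Q8_0: 34 bytes per 32 weights, Q6_K: 210 bytes per 256 weights) and treats every tensor as quantized the same way, which real conversions do not do exactly, so the result is only a ballpark figure. The function name `estimate_gguf_bytes` and the round parameter count of one billion are illustrative placeholders, not measured values for this checkpoint.

```python
# (bytes per block, weights per block) for a few storage types.
# Assumption: uniform quantization of all tensors, metadata ignored.
BLOCK_LAYOUT = {
    "BF16": (2, 1),
    "Q8_0": (34, 32),     # 32 int8 values + one fp16 scale
    "Q6_K": (210, 256),   # 6-bit super-block layout
}


def estimate_gguf_bytes(n_params, quant):
    """Approximate file size in bytes for n_params weights stored as quant."""
    block_bytes, block_weights = BLOCK_LAYOUT[quant]
    return n_params * block_bytes // block_weights


n_params = 1_000_000_000  # placeholder for a "1B-class" model
for quant in ("BF16", "Q8_0", "Q6_K"):
    size_gb = estimate_gguf_bytes(n_params, quant) / 1e9
    print(f"{quant}: about {size_gb:.2f} GB")
```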
--- a/Vikra-HCT-YeAM-PhiMma-1B/added_tokens.json
+++ b/Vikra-HCT-YeAM-PhiMma-1B/added_tokens.json
@@ -0,0 +1,3 @@
 {
  "<image_soft_token>": 262144
 }
--- a/Vikra-HCT-YeAM-PhiMma-1B/config.json
+++ b/Vikra-HCT-YeAM-PhiMma-1B/config.json
@@ -0,0 +1,37 @@
 {
  "architectures": [
    "Gemma3ForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "attn_logit_softcapping": null,
  "bos_token_id": 2,
  "cache_implementation": "hybrid",
  "eos_token_id": [
    1,
    106
  ],
  "final_logit_softcapping": null,
  "head_dim": 256,
  "hidden_activation": "gelu_pytorch_tanh",
  "hidden_size": 1152,
  "initializer_range": 0.02,
  "intermediate_size": 6912,
  "max_position_embeddings": 32768,
  "model_type": "gemma3_text",
  "num_attention_heads": 4,
  "num_hidden_layers": 26,
  "num_key_value_heads": 1,
  "pad_token_id": 0,
  "query_pre_attn_scalar": 256,
  "rms_norm_eps": 1e-06,
  "rope_local_base_freq": 10000,
  "rope_scaling": null,
  "rope_theta": 1000000,
  "sliding_window": 512,
  "sliding_window_pattern": 6,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.50.0.dev0",
  "use_cache": true,
  "vocab_size": 262144
 }
--- a/Vikra-HCT-YeAM-PhiMma-1B/generation_config.json
+++ b/Vikra-HCT-YeAM-PhiMma-1B/generation_config.json
@@ -0,0 +1,13 @@
 {
  "bos_token_id": 2,
  "cache_implementation": "hybrid",
  "do_sample": true,
  "eos_token_id": [
    1,
    106
  ],
  "pad_token_id": 0,
  "top_k": 64,
  "top_p": 0.95,
  "transformers_version": "4.50.0.dev0"
 }
--- a/Vikra-HCT-YeAM-PhiMma-1B/model-00001-of-00002.safetensors
+++ b/Vikra-HCT-YeAM-PhiMma-1B/model-00001-of-00002.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:cd10088d6f4ea7bd681d61f3ae77b02e6202587e0817b5d91dad2e949bf248b7
 size 1081244120
--- a/Vikra-HCT-YeAM-PhiMma-1B/model-00002-of-00002.safetensors
+++ b/Vikra-HCT-YeAM-PhiMma-1B/model-00002-of-00002.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:15d8aa3dfd280419d37ea5b3e79ea12aabf419cf96cbc11b17c8f4c9bb5a4aaa
 size 918566568
--- a/Vikra-HCT-YeAM-PhiMma-1B/model.safetensors.index.json
+++ b/Vikra-HCT-YeAM-PhiMma-1B/model.safetensors.index.json
--- a/Vikra-HCT-YeAM-PhiMma-1B/special_tokens_map.json
+++ b/Vikra-HCT-YeAM-PhiMma-1B/special_tokens_map.json
@@ -0,0 +1,33 @@
 {
  "boi_token": "<start_of_image>",
  "bos_token": {
    "content": "<bos>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eoi_token": "<end_of_image>",
  "eos_token": {
    "content": "<eos>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "image_token": "<image_soft_token>",
  "pad_token": {
    "content": "<pad>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
 }
--- a/Vikra-HCT-YeAM-PhiMma-1B/tokenizer.json
+++ b/Vikra-HCT-YeAM-PhiMma-1B/tokenizer.json
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:4667f2089529e8e7657cfb6d1c19910ae71ff5f28aa7ab2ff2763330affad795
 size 33384568
--- a/Vikra-HCT-YeAM-PhiMma-1B/tokenizer.model
+++ b/Vikra-HCT-YeAM-PhiMma-1B/tokenizer.model
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:1299c11d7cf632ef3b4e11937501358ada021bbdf7c47638d13c0ee982f2e79c
 size 4689074
--- a/Vikra-HCT-YeAM-PhiMma-1B/tokenizer_config.json
+++ b/Vikra-HCT-YeAM-PhiMma-1B/tokenizer_config.json
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B-Q6_K.gguf
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B-Q6_K.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:7c4602a441fbf8196b67d81a195f23cd967449f7e8b7451daf92384955c3a8b5
 size 8727647264
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/README.md
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/README.md
@@ -0,0 +1,51 @@
 ---
 license: apache-2.0
 library_name: transformers
 pipeline_tag: text-generation
 base_model:
 - mistralai/Mistral-Nemo-Instruct-2407
 language:
 - en
 - ru
 ---
 # Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B
 HCT architecture release. YeAM (Yet Another Merge) implementation invariant.
 ## What it is
 A large (12B-class) checkpoint produced via HCT-compatible merging with a 1B Gemma model.
 Published in standard Hugging Face format (safetensors + sharded index) and intended to be convertible to GGUF.
 ## YeAM summary
 YeAM performs a controlled merge in a real 4D geometric formulation with ray-intersection alignment in parameter space.
 It also supports targeted knowledge injection (distillation-style) into a chosen model while remaining HF-compatible.
 ## Usage (Transformers)
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
 m = "/path/to/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B"
 tok = AutoTokenizer.from_pretrained(m, use_fast=False)
 model = AutoModelForCausalLM.from_pretrained(
    m,
    torch_dtype=torch.bfloat16,
    device_map="auto",
 ).eval()
 inputs = tok("Привет!", return_tensors="pt").to(model.device)
 out = model.generate(**inputs, max_new_tokens=256)
 print(tok.decode(out[0], skip_special_tokens=True))
 ```
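Before loading with `device_map="auto"`, it can help to estimate how much memory the bf16 weights alone will need. The sketch below counts parameters from the values in this checkpoint's `config.json`, assuming a standard Mistral-style decoder layout (bias-free q/k/v/o projections, a gated MLP with three matrices, two RMSNorm weights per layer, and one final norm). The function name `mistral_param_count` is hypothetical, and the estimate excludes activations and the KV cache.

```python
def mistral_param_count(cfg):
    """Count weights for a Mistral-style decoder described by cfg."""
    h = cfg["hidden_size"]
    q_dim = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_dim = cfg["num_key_value_heads"] * cfg["head_dim"]
    attn = h * q_dim + 2 * h * kv_dim + q_dim * h   # q, k, v, o projections
    mlp = 3 * h * cfg["intermediate_size"]          # gate, up, down
    norms = 2 * h                                   # two RMSNorm weights
    per_layer = attn + mlp + norms
    embed = cfg["vocab_size"] * h
    # Untied embeddings store the input table and the output head separately.
    tables = embed if cfg["tie_word_embeddings"] else 2 * embed
    return cfg["num_hidden_layers"] * per_layer + tables + h  # + final norm


# Values copied from config.json of this checkpoint.
cfg = {
    "hidden_size": 5120,
    "head_dim": 128,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "intermediate_size": 14336,
    "num_hidden_layers": 40,
    "vocab_size": 131074,
    "tie_word_embeddings": False,
}

n_params = mistral_param_count(cfg)
weight_gib = n_params * 2 / 2**30  # bf16 uses 2 bytes per weight
print(f"{n_params:,} parameters, about {weight_gib:.1f} GiB in bf16")
```

Under these assumptions the weights do not fit on a single 16 GB or 24 GB GPU in bf16, which is one reason to let `device_map="auto"` spread layers across devices.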
 ## GGUF (example)
 ```bash
 python3 /path/to/llama.cpp/convert_hf_to_gguf.py /path/to/model --outtype bf16 --outfile model.bf16.gguf
 /path/to/llama.cpp/build/bin/llama-quantize model.bf16.gguf model.Q6_K.gguf Q6_K
 CUDA_VISIBLE_DEVICES=0,1 /path/to/llama.cpp/build/bin/llama-server -m model.Q6_K.gguf --n-gpu-layers 99 --split-mode layer --tensor-split 1,1
 ```
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/config.json
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/config.json
@@ -0,0 +1,27 @@
 {
  "_name_or_path": "Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-05-09-24",
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 5120,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 1024000,
  "model_type": "mistral",
  "num_attention_heads": 32,
  "num_hidden_layers": 40,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-05,
  "rope_theta": 1000000.0,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.44.2",
  "use_cache": true,
  "vocab_size": 131074
 }
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/generation_config.json
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/generation_config.json
@@ -0,0 +1,6 @@
 {
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "transformers_version": "4.44.2"
 }
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00001-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00001-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:ed147b1bc8b485b72024ffeee6c75e9ab8dddd822a0e38c99ca61bff2d75f816
 size 1342197888
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00002-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00002-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:a298abbf07c5711cde1f874e3bfd32b263b16d223fc21821a0745bb5d0226c2a
 size 1342197904
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00003-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00003-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:6d52f1a2cd0f2fd361e94c133cb1fa71630ce94f4c18ee65f8c23f641407e3ef
 size 1080076200
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00004-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00004-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:3a0cf29e2759bfe4d7a96e195b408e1e0439ae19d402fcbcc600ad19607ff8ac
 size 1090562104
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00005-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00005-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:96c0007eab8fdf6546e65d74c497cf41e5877b094e73fc66b99e94f0dacbd50d
 size 1090562104
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00006-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00006-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:8fe0b85cbbb3761cc06fb228e7e254e39d68f40c52a7e0ab4af6cb6685bceace
 size 1090562104
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00007-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00007-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:d086036870e12cc363b568cc68c0f9cb4a2274ab5647eeba8fa52f4a16ed205b
 size 1090562104
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00008-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00008-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:2e1a2bb26a1fdcc7bdc3d2b4a59732ffcd006ba170117fccd5700c58ad094935
 size 1090562104
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00009-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00009-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:22e42950814934bf8c89ef04d010029ea9297c87383ec264cfce598ae00e7059
 size 1090562096
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00010-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00010-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:2e9c1b3aa004516a950fbc1a0dcd2aa4740f69e5b2045c64df7409551d392645
 size 1090562104
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00011-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00011-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:eefb905db5bbaefb874ba5186a812597c3ae6eac02a72f02a7249c5f78eb17c5
 size 1090562104
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00012-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00012-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:d99b4d25309fb5359ec484d5288f8a43292929f142d35d958105896c12960818
 size 1090562104
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00013-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00013-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:6b28155a270f86fea50a5548fec83e27ca56f726e82067c083c9687edcb9a4e7
 size 1090562104
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00014-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00014-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:edc61ac1e876410edf757be8fe7f8b0af3f093b986b73ad0be059224617644c2
 size 1090562096
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00015-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00015-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:91c34fb4acc4ac436dd95fa96b8bb6b9c491af2361014eb695da2bfb013c237e
 size 1090562104
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00016-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00016-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:0fba941e9e6d6eacf945dbbd6d81f68018c25542d2b04add4b864d842ef99946
 size 1090562104
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00017-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00017-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:f9b81af1b6d443f2b7130983d3b94a5e02daf44dd0e22b5d195833ad0afe502e
 size 1090562104
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00018-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00018-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:a3e478473ed04add3945febdb599c6d46c9e7abe048fcff831281356336ff27a
 size 1090562104
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00019-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00019-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:03964a3f635c9dbf39864ca29fc3608c62d4605f1c0e5c6b2c486a5853eefad9
 size 1090562104
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00020-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00020-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:83d58e325954a28fb12818d321fc95e65c0fba4539195afec61164dc42b3c3cc
 size 1090562088
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00021-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00021-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:0354aad9a57376a83cb7e275381c1292deab8babfbadb93969639bd107cb7e1e
 size 1090562088
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00022-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00022-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:596611f84d763180d95a36245271e2af2fb9e4ff6f1d307fafd5f5c942a0a107
 size 1090562088
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00023-of-00023.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model-00023-of-00023.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:9ee234d3a8df1dd4c81bed0c5e2a886ca380347c62e4f64451d7ae79a4ee89fd
 size 10496240
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model.safetensors.index.json
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/model.safetensors.index.json
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/original_adapter/README.md
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/original_adapter/README.md
@@ -0,0 +1,202 @@
 ---
 base_model: Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-05-09-24
 library_name: peft
 ---
 # Model Card for Model ID
 <!-- Provide a quick summary of what the model is/does. -->
 ## Model Details
 ### Model Description
 <!-- Provide a longer summary of what this model is. -->
 - **Developed by:** [More Information Needed]
 - **Funded by [optional]:** [More Information Needed]
 - **Shared by [optional]:** [More Information Needed]
 - **Model type:** [More Information Needed]
 - **Language(s) (NLP):** [More Information Needed]
 - **License:** [More Information Needed]
 - **Finetuned from model [optional]:** [More Information Needed]
 ### Model Sources [optional]
 <!-- Provide the basic links for the model. -->
 - **Repository:** [More Information Needed]
 - **Paper [optional]:** [More Information Needed]
 - **Demo [optional]:** [More Information Needed]
 ## Uses
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 ### Direct Use
 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
 [More Information Needed]
 ### Downstream Use [optional]
 <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
 [More Information Needed]
 ### Out-of-Scope Use
 <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
 [More Information Needed]
 ## Bias, Risks, and Limitations
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
 [More Information Needed]
 ### Recommendations
 <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
 Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 ## How to Get Started with the Model
 Use the code below to get started with the model.
 [More Information Needed]
 ## Training Details
 ### Training Data
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 [More Information Needed]
 ### Training Procedure
 <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 #### Preprocessing [optional]
 [More Information Needed]
 #### Training Hyperparameters
 - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 #### Speeds, Sizes, Times [optional]
 <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
 [More Information Needed]
 ## Evaluation
 <!-- This section describes the evaluation protocols and provides the results. -->
 ### Testing Data, Factors & Metrics
 #### Testing Data
 <!-- This should link to a Dataset Card if possible. -->
 [More Information Needed]
 #### Factors
 <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
 [More Information Needed]
 #### Metrics
 <!-- These are the evaluation metrics being used, ideally with a description of why. -->
 [More Information Needed]
 ### Results
 [More Information Needed]
 #### Summary
 ## Model Examination [optional]
 <!-- Relevant interpretability work for the model goes here -->
 [More Information Needed]
 ## Environmental Impact
 <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
 Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
 - **Hardware Type:** [More Information Needed]
 - **Hours used:** [More Information Needed]
 - **Cloud Provider:** [More Information Needed]
 - **Compute Region:** [More Information Needed]
 - **Carbon Emitted:** [More Information Needed]
 ## Technical Specifications [optional]
 ### Model Architecture and Objective
 [More Information Needed]
 ### Compute Infrastructure
 [More Information Needed]
 #### Hardware
 [More Information Needed]
 #### Software
 [More Information Needed]
 ## Citation [optional]
 <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
 **BibTeX:**
 [More Information Needed]
 **APA:**
 [More Information Needed]
 ## Glossary [optional]
 <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
 [More Information Needed]
 ## More Information [optional]
 [More Information Needed]
 ## Model Card Authors [optional]
 [More Information Needed]
 ## Model Card Contact
 [More Information Needed]
 ### Framework versions
 - PEFT 0.12.0
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/original_adapter/adapter_config.json
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/original_adapter/adapter_config.json
@@ -0,0 +1,34 @@
 {
  "alpha_pattern": {},
  "auto_mapping": null,
  "base_model_name_or_path": "Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-05-09-24",
  "bias": "none",
  "fan_in_fan_out": false,
  "inference_mode": true,
  "init_lora_weights": true,
  "layer_replication": null,
  "layers_pattern": null,
  "layers_to_transform": null,
  "loftq_config": {},
  "lora_alpha": 96,
  "lora_dropout": 0.05,
  "megatron_config": null,
  "megatron_core": "megatron.core",
  "modules_to_save": null,
  "peft_type": "LORA",
  "r": 96,
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
    "up_proj",
    "gate_proj",
    "q_proj",
    "k_proj",
    "down_proj",
    "o_proj",
    "v_proj"
  ],
  "task_type": "CAUSAL_LM",
  "use_dora": false,
  "use_rslora": false
 }
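The adapter configuration above sets `r` and `lora_alpha` to the same value with `use_rslora` disabled. As a minimal sketch of what that implies, the snippet below computes the factor by which a LoRA update is scaled, assuming the usual PEFT convention (`lora_alpha / r`, or `lora_alpha / sqrt(r)` when rank-stabilized LoRA is enabled). The helper names `lora_scaling` and `lora_param_count` are hypothetical, and the layer dimensions in the usage line are illustrative only, since the base model's shapes are not listed in this commit.

```python
import json
import math

# Subset of the adapter_config.json fields shown above, copied verbatim.
ADAPTER_CONFIG = json.loads("""
{
  "lora_alpha": 96,
  "lora_dropout": 0.05,
  "peft_type": "LORA",
  "r": 96,
  "use_dora": false,
  "use_rslora": false
}
""")


def lora_scaling(config: dict) -> float:
    """Return the factor applied to the low-rank update B @ A.

    Assumes the standard convention: alpha / r, or alpha / sqrt(r)
    when rank-stabilized LoRA (use_rslora) is switched on.
    """
    r = config["r"]
    alpha = config["lora_alpha"]
    if config.get("use_rslora", False):
        return alpha / math.sqrt(r)
    return alpha / r


def lora_param_count(in_features: int, out_features: int, r: int) -> int:
    """Trainable parameters for one adapted linear layer.

    A has shape (r, in_features) and B has shape (out_features, r).
    """
    return r * (in_features + out_features)


scaling = lora_scaling(ADAPTER_CONFIG)
print(f"LoRA scaling factor: {scaling}")

# Hypothetical layer shape, for illustration only (not taken from the commit).
print(lora_param_count(in_features=1024, out_features=1024, r=ADAPTER_CONFIG["r"]))
```

With `r` equal to `lora_alpha`, the update is applied at unit scale, so the adapter weights are added to the base weights without further amplification or damping.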
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/original_adapter/adapter_model.safetensors
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/original_adapter/adapter_model.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:3e338736bfba1083e8e081c3d6ef857668843049a34d06e9e01bef06f7d57d74
 size 1368467752
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/special_tokens_map.json
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/special_tokens_map.json
@@ -0,0 +1,34 @@
 {
  "additional_special_tokens": [
    "<|start_header_id|>",
    "<|end_header_id|>"
  ],
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<pad>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
 }
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/tokenizer.json
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/tokenizer.json
--- a/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/tokenizer_config.json
+++ b/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B/tokenizer_config.json
--- a/Vikra-MixP_4.9b_S.gguf
+++ b/Vikra-MixP_4.9b_S.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:730a94f67e3c4141eabaec7e0cf6235669f2bd1b8743812c919593086efdd109
 size 9365459776
--- a/Vikra-MixedP-MXFP4.gguf
+++ b/Vikra-MixedP-MXFP4.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:76905d2c64820ee9e3067c863a163725e661efdedc8a7929b094d5bd5e9a9dc5
 size 9006650176
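The large binaries in this commit are stored as Git LFS pointer files: three `key value` lines giving the spec version, the object ID (hash algorithm and digest), and the size in bytes. A minimal sketch of reading such a pointer and checking a downloaded stream against it follows; the function names `parse_lfs_pointer` and `verify_stream` are hypothetical helpers, not part of any Git LFS tooling.

```python
import hashlib
import io

# Pointer text copied from the Vikra-MixedP-MXFP4.gguf entry above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:76905d2c64820ee9e3067c863a163725e661efdedc8a7929b094d5bd5e9a9dc5
size 9006650176
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its version, hash, and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_stream(stream, info: dict, chunk_size: int = 1 << 20) -> bool:
    """Hash a binary stream in chunks and compare it to the pointer's digest and size."""
    hasher = hashlib.new(info["hash_algo"])
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)
        total += len(chunk)
    return total == info["size"] and hasher.hexdigest() == info["digest"]


info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size"])

# Self-check on in-memory data, since the real file is not available here.
demo = b"example payload"
demo_info = {
    "hash_algo": "sha256",
    "digest": hashlib.sha256(demo).hexdigest(),
    "size": len(demo),
}
print(verify_stream(io.BytesIO(demo), demo_info))
```

To check a real download, open the file in binary mode and pass the file object to `verify_stream` together with the parsed pointer.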