Initialize project; model provided by the ModelHub XC community

Model: AbteeXAILab/lumynax-infused-qwen3-text-gguf Source: Original Platform
2026-06-06 09:18:19 +08:00
commit ca89ce6998
34 changed files with 153601 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,38 @@
 *.7z filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.bz2 filter=lfs diff=lfs merge=lfs -text
 *.ckpt filter=lfs diff=lfs merge=lfs -text
 *.ftz filter=lfs diff=lfs merge=lfs -text
 *.gz filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.joblib filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.model filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 *.npy filter=lfs diff=lfs merge=lfs -text
 *.npz filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.parquet filter=lfs diff=lfs merge=lfs -text
 *.pb filter=lfs diff=lfs merge=lfs -text
 *.pickle filter=lfs diff=lfs merge=lfs -text
 *.pkl filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
 *.rar filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.xz filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 lumynax-infused-qwen3-text-gguf-f16.gguf filter=lfs diff=lfs merge=lfs -text
 lumynax-infused-qwen3-text-gguf-q4_k_m.gguf filter=lfs diff=lfs merge=lfs -text
 merged_model/tokenizer.json filter=lfs diff=lfs merge=lfs -text
--- a/LICENSE.txt
+++ b/LICENSE.txt
@@ -0,0 +1,11 @@
 LumynaX Infused Qwen3 Text GGUF
 Copyright (c) AbteeX AI Labs. All rights reserved.
 This release is proprietary. No right to use, copy, modify, distribute, host,
 sublicense, reverse engineer, or create derivative releases is granted except
 under a separate written agreement from AbteeX AI Labs.
 This package may be used only by parties expressly authorized by AbteeX AI Labs.
 Third-party software or model components, if any, remain subject to their own
 licenses and obligations.
--- a/README.md
+++ b/README.md
@@ -0,0 +1,230 @@
 ---
 license: apache-2.0
 library_name: llama.cpp
 pipeline_tag: text-generation
 language:
 - en
 - mi
 tags:
 - abteex-ai-labs
 - aotearoa
 - general
 - gguf
 - local-first
 - lumynax
 - new-zealand
 - qwen
 - sovereign-ai
 - text
 ---
 # LumynaX Infused Qwen3 Text GGUF
 <!-- lumynax-public-release-card:v4 -->
 <p align="center">
  <img src="docs/lumynax-release-overview.svg" alt="LumynaX Infused Qwen3 Text GGUF release overview" width="100%" />
 </p>
 <p align="center">
  <strong>LumynaX model-infusion release by AbteeX AI Labs.</strong><br/>
  Public, non-gated package with runnable local instructions, provenance metadata, checksums, and a release manifest.
 </p>
 <p align="center">
  <a href="#quickstart">Quickstart</a> |
  <a href="#model-profile">Model profile</a> |
  <a href="#runtime-files">Runtime files</a> |
  <a href="#provenance-and-license">Provenance</a> |
  <a href="#validation-status">Validation</a> |
  <a href="#limitations-and-responsible-use">Limitations</a>
 </p>
 ![LumynaX: infusion release](https://img.shields.io/badge/LumynaX-infusion%20release-e08a2c) ![access: public and non-gated](https://img.shields.io/badge/access-public%20and%20non--gated-0a0a0b) ![runtime: llama cpp](https://img.shields.io/badge/runtime-llama%20cpp-726b62) ![format: GGUF](https://img.shields.io/badge/format-GGUF-9a5416) ![audit: pass](https://img.shields.io/badge/audit-pass-4d6b44) ![docs: v4](https://img.shields.io/badge/docs-v4-111827)
 ## Executive Summary
 This repository is a complete LumynaX release package for `AbteeXAILab/lumynax-infused-qwen3-text-gguf`. It is intended to be downloaded as a whole repo, not as a single loose weight file: the model artifact, `quickstart.py`, `requirements.txt`, `release_export_manifest.json`, `checksums.sha256`, license notice, and optional Ollama or Space files are part of the same release contract.
 LumynaX-infused means the upstream artifact is presented through the LumynaX release layer: local-first runtime scaffolding, LumynaX assistant identity, inference-chain metadata, public documentation, integrity files, and Aotearoa New Zealand-oriented workflow positioning. The release manifest records this as a LumynaX packaging and inference-chain layer around the listed upstream artifact; it does not claim a private LumynaX weight merge.
 ## AbteeX LumynaX Public Surface
 This card follows the AbteeX/LumynaX public-facing system used across the release family: warm paper background visuals, black editorial typography, amber proof markers, compact evidence tables, and plain-language runtime instructions. The goal is not decoration; it is operational clarity. A downloader should immediately understand what the package is, what files belong together, what runtime path is expected, what provenance is available, and what limits still apply.
 ## Sovereignty And Run Contract
 | Field | Value |
 | --- | --- |
 | Public surface | AbteeX/LumynaX light editorial system: warm paper, black ink, amber status markers, and evidence-first tables. |
 | Sovereign intent | Package is documented for local-first use, explicit provenance, and controlled deployment near governed data. |
 | Runtime residency | `llama_cpp` runtime can be deployed by the user in their own approved environment. |
 | Model artifact | `lumynax-infused-qwen3-text-gguf-f16.gguf` must stay with manifest, checksums, quickstart, requirements, and license files. |
 | Modalities | `text` |
 | License discipline | `apache-2.0` metadata is surfaced so downstream users can check redistribution and usage terms. |
 | Audit expectation | Record repo id, artifact checksum, runtime command, prompt template, operator, and deployment environment for production use. |
 | Router readiness | Compatible with the LumynaX MaramaRoute registry pattern for sovereign model selection and fallback planning. |
 | Local serving | Preferred first path is llama.cpp or llama-cpp-python with checksum verification before launch. |
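
The audit expectation in the table above can be captured as one JSON line per deployment. This is a minimal sketch, not part of the package: the helper names `build_audit_record` and `append_audit_record`, the field names, and the log file layout are all assumptions made for illustration.

```python
import json
import platform
from datetime import datetime, timezone


def build_audit_record(repo_id, artifact, artifact_sha256, runtime_command,
                       prompt_template, operator, environment=None):
    """Collect the fields the audit row lists (field names are hypothetical)."""
    return {
        "repo_id": repo_id,
        "artifact": artifact,
        "artifact_sha256": artifact_sha256,
        "runtime_command": runtime_command,
        "prompt_template": prompt_template,
        "operator": operator,
        # Fall back to a platform string when no environment label is supplied.
        "deployment_environment": environment or platform.platform(),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def append_audit_record(log_path, record):
    """Append the record as a single JSON line so the log stays easy to diff."""
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
```

The digest is passed in rather than computed here, so the record can reuse whatever value the operator already verified against `checksums.sha256`.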
 ## Quickstart
 ```bash
 hf download AbteeXAILab/lumynax-infused-qwen3-text-gguf --local-dir lumynax-infused-qwen3-text-gguf
 cd lumynax-infused-qwen3-text-gguf
 pip install -r requirements.txt
 python quickstart.py
 ```
 Direct llama.cpp smoke command:
 ```bash
 llama-cli -m "lumynax-infused-qwen3-text-gguf-f16.gguf" -p "Who are you? Answer as LumynaX in two sentences." -n 160
 ```
 Ollama path:
 ```bash
 ollama create lumynax-infused-qwen3-text-gguf -f ollama/Modelfile
 ollama run lumynax-infused-qwen3-text-gguf
 ```
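For callers who prefer Python over `llama-cli`, the same smoke test can be sketched with llama-cpp-python. This is an illustration under assumptions: it is not the content of `quickstart.py`, it assumes `llama-cpp-python` is installed, and the helper names, the `n_ctx=4096` context size, and the optional `system_prompt` argument are hypothetical choices.

```python
def build_messages(user_prompt, system_prompt=None):
    """Build a chat message list; the system prompt is optional."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages


def run_smoke_test(model_path, user_prompt, system_prompt=None, max_tokens=160):
    """Load the GGUF with llama-cpp-python and return the assistant's reply text."""
    # Imported lazily so build_messages works without llama-cpp-python installed.
    from llama_cpp import Llama

    # n_ctx is an assumed context size, not a value taken from the package.
    llm = Llama(model_path=model_path, n_ctx=4096, verbose=False)
    reply = llm.create_chat_completion(
        messages=build_messages(user_prompt, system_prompt),
        max_tokens=max_tokens,
    )
    return reply["choices"][0]["message"]["content"]
```

Calling `run_smoke_test("lumynax-infused-qwen3-text-gguf-f16.gguf", "Who are you? Answer as LumynaX in two sentences.")` mirrors the `llama-cli` command above, with `max_tokens=160` matching `-n 160`.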
 ## Model Profile
 | Field | Value |
 | --- | --- |
 | Release | `LumynaX Infused Qwen3 Text GGUF` |
 | Repository | `AbteeXAILab/lumynax-infused-qwen3-text-gguf` |
 | Mode | `Local-first text generation package` |
 | Runtime | `llama_cpp` |
 | Prompt format | `huggingface_chat_template` |
 | Modalities | `text` |
 | Primary artifact | `lumynax-infused-qwen3-text-gguf-f16.gguf` |
 | Detected weight size | `35.20 GB` (total across all listed model artifacts) |
 | Package state | `base_weights_hydrated_text_gguf` |
 | Delivery | `standalone_hf_text_gguf_release` |
 | Upstream/base | `Qwen/Qwen3-8B` |
 | Upstream kind | `official_base_weights` |
 | Source GGUF | `not applicable` |
 | Quantization | `See manifest` |
 | License metadata | `apache-2.0` |
 | Refreshed | `2026-05-11` |
 ## Runtime Path
 <p align="center">
  <img src="docs/lumynax-runtime-flow.svg" alt="LumynaX Infused Qwen3 Text GGUF runtime flow" width="100%" />
 </p>
 ## Capability Profile
 | Field | Value |
 | --- | --- |
 | Primary fit | Use this for local chat, drafting, summarization, governance notes, and repeatable offline-friendly inference. |
 | Operational style | Local-first package with explicit files, checksums, and reproducible quickstarts. |
 | Identity behavior | The assistant should identify as LumynaX while remaining clear about upstream provenance. |
 ## Runtime Files
 | Component | Status | Path |
 | --- | --- | --- |
 | README.md | `present` | `README.md` |
 | Quickstart | `present` | `quickstart.py` |
 | Requirements | `present` | `requirements.txt` |
 | Manifest | `present` | `release_export_manifest.json` |
 | Checksums | `present` | `checksums.sha256` |
 | License | `present` | `LICENSE.txt` |
 | Ollama | `present` | `ollama/Modelfile` |
 | Space scaffold | `present` | `hf_space/app.py` |
 | Overview visual | `present` | `docs/lumynax-release-overview.svg` |
 | Runtime visual | `present` | `docs/lumynax-runtime-flow.svg` |
 ## Model Artifacts
 | Artifact | Size |
 | --- | ---: |
 | `lumynax-infused-qwen3-text-gguf-f16.gguf` | 15.26 GB |
 | `lumynax-infused-qwen3-text-gguf-q4_k_m.gguf` | 4.68 GB |
 | `merged_model/model-00001-of-00005.safetensors` | 3.72 GB |
 | `merged_model/model-00002-of-00005.safetensors` | 3.72 GB |
 | `merged_model/model-00003-of-00005.safetensors` | 3.69 GB |
 | `merged_model/model-00004-of-00005.safetensors` | 2.97 GB |
 | `merged_model/model-00005-of-00005.safetensors` | 1.16 GB |
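
The two GGUF files differ widely in size, so a launcher may want to choose between them from a memory budget. The sketch below uses the sizes from the table above; the `pick_gguf` helper and the `headroom` multiplier of 1.2 are assumptions for illustration only, not measured requirements, and real memory use also depends on context length and runtime settings.

```python
# Sizes in GB, copied from the Model Artifacts table.
GGUF_ARTIFACTS = {
    "lumynax-infused-qwen3-text-gguf-f16.gguf": 15.26,
    "lumynax-infused-qwen3-text-gguf-q4_k_m.gguf": 4.68,
}


def pick_gguf(memory_budget_gb, headroom=1.2):
    """Return the largest GGUF whose size times `headroom` fits the budget, else None."""
    candidates = [
        (size, name)
        for name, size in GGUF_ARTIFACTS.items()
        if size * headroom <= memory_budget_gb
    ]
    # max() compares by size first, so the largest fitting file wins.
    return max(candidates)[1] if candidates else None
```

Returning `None` rather than raising lets the caller decide whether to stop or fall back to a remote runtime.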
 ## Prompting Contract
 The preferred first prompt is an identity and provenance check:
 ```text
 Who are you? What files do I need to keep together to run this package locally?
 ```
 Expected behavior: the assistant should identify as LumynaX, explain that this is a LumynaX model-infusion package, and keep upstream provenance visible. The default package system prompt is defined in `quickstart.py`.
 ## Validation Status
 | Field | Value |
 | --- | --- |
 | Runtime audit | `pass` |
 | Public access audit | `public and non-gated` |
 | Anonymous metadata access | `True` |
 | Anonymous file listing | `True` |
 | Quickstart syntax | `pass` |
 | Manifest references | `pass` |
 | Checksum references | `pass` |
 The audit confirms public access, required release files, manifest references, checksum references, weight artifact presence, and quickstart syntax. It does not guarantee that every laptop has enough RAM or VRAM for the largest packages.
 ## Integrity Checks
 After download, compute the SHA-256 digest of the model artifact and compare it with the matching entry in `checksums.sha256`.
 ```bash
 sha256sum "lumynax-infused-qwen3-text-gguf-f16.gguf"
 cat checksums.sha256
 ```
 On Windows PowerShell:
 ```powershell
 Get-FileHash -Algorithm SHA256 "lumynax-infused-qwen3-text-gguf-f16.gguf"
 Get-Content checksums.sha256
 ```
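The manual comparison can also be automated for every file in the package. This is a minimal sketch, assuming `checksums.sha256` keeps the `<digest>  <relative path>` layout produced by `sha256sum`; the helper names `sha256_of` and `verify_checksums` are hypothetical and not shipped with the release.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so multi-GB weights never load into RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(repo_dir, checksum_file="checksums.sha256"):
    """Return {relative_path: "ok" | "mismatch" | "missing"} for each listed file."""
    repo = Path(repo_dir)
    results = {}
    for line in (repo / checksum_file).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, _, rel_path = line.strip().partition(" ")
        # sha256sum writes "  name" (text mode) or " *name" (binary mode).
        rel_path = rel_path.lstrip(" *")
        target = repo / rel_path
        if not target.is_file():
            results[rel_path] = "missing"
        elif sha256_of(target) == expected.lower():
            results[rel_path] = "ok"
        else:
            results[rel_path] = "mismatch"
    return results
```

A launcher could refuse to start unless every value returned by `verify_checksums(".")` is `"ok"`; on Linux, `sha256sum -c checksums.sha256` performs the same check.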
 ## Provenance And License
 - Publisher: AbteeX AI Labs.
 - Family: LumynaX model and inference-chain release family.
 - Upstream/base: `Qwen/Qwen3-8B`.
 - Source GGUF: `not applicable`.
 - License metadata: `apache-2.0`.
 - License link: `LICENSE.txt` and upstream model card.
 Respect the upstream model license and keep attribution files with redistributed copies. Do not present this package as privately trained or weight-merged unless the release manifest explicitly says that weight adaptation was applied.
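The package also ships `artifacts/release_training_summary.json`, which carries a `lumynax_weight_adaptation_applied` flag alongside `upstream_model_id` and `validation_status`. The sketch below reads those three keys before any provenance statement is generated. It assumes that file is an acceptable record for this check (the schema of `release_export_manifest.json` is not shown here), and the helper name `read_release_claims` is hypothetical.

```python
import json
from pathlib import Path


def read_release_claims(repo_dir, summary_path="artifacts/release_training_summary.json"):
    """Read the provenance-related keys from the release training summary."""
    summary = json.loads((Path(repo_dir) / summary_path).read_text(encoding="utf-8"))
    return {
        "upstream_model_id": summary.get("upstream_model_id"),
        # Treat a missing flag as "no adaptation" so nothing is over-claimed.
        "weight_adaptation_applied": bool(
            summary.get("lumynax_weight_adaptation_applied", False)
        ),
        "validation_status": summary.get("validation_status"),
    }
```

Downstream text should describe the package as weight-adapted only when `weight_adaptation_applied` is `True`.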
 ## Limitations And Responsible Use
 - Outputs can be incorrect, incomplete, or biased; validate important answers before use.
 - Larger GGUF, MoE, multimodal, and frontier packages may require substantial RAM, VRAM, disk space, and recent runtime builds.
 - For high-impact decisions, use human review and domain-specific evaluation.
 - For sensitive data, prefer local execution and keep operational logs under your own governance policy.
 - This card documents package readiness and access; it is not a benchmark claim.
 ## Automation Notes
 Automation should read these files before launching:
 - `release_export_manifest.json`
 - `checksums.sha256`
 - `quickstart.py`
 - `requirements.txt`
 - `ollama/Modelfile` when present
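A pre-launch check over the files listed above might look like the following. This is a sketch under assumptions: the `preflight` helper and its report keys are hypothetical, and because the manifest schema is not shown here, the check only confirms that `release_export_manifest.json` parses as JSON.

```python
import json
from pathlib import Path

REQUIRED_FILES = (
    "release_export_manifest.json",
    "checksums.sha256",
    "quickstart.py",
    "requirements.txt",
)
OPTIONAL_FILES = ("ollama/Modelfile",)


def preflight(repo_dir):
    """Report which release files are present and whether the manifest parses."""
    repo = Path(repo_dir)
    report = {
        "missing": [name for name in REQUIRED_FILES if not (repo / name).is_file()],
        "optional_present": [name for name in OPTIONAL_FILES if (repo / name).is_file()],
        "manifest_is_valid_json": False,
    }
    manifest = repo / "release_export_manifest.json"
    if manifest.is_file():
        try:
            json.loads(manifest.read_text(encoding="utf-8"))
            report["manifest_is_valid_json"] = True
        except json.JSONDecodeError:
            pass
    # Ready only when nothing required is missing and the manifest is readable.
    report["ready"] = not report["missing"] and report["manifest_is_valid_json"]
    return report
```

Automation can stop early when `preflight(".")["ready"]` is `False` and log the `missing` list for the operator.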
 ## Related LumynaX Demo
 Try the public browser demo:
 - https://huggingface.co/spaces/AbteeXAILab/lumynax-live-demo
--- a/VERSION.txt
+++ b/VERSION.txt
@@ -0,0 +1 @@
 v1
--- a/artifacts/release_training_summary.json
+++ b/artifacts/release_training_summary.json
@@ -0,0 +1,19 @@
 {
  "demo_repo_id": "AbteeXAILab/lumynax-infused-qwen3-text-gguf-demo",
  "generated_at": "2026-04-19T00:22:26.426711+00:00",
  "gguf_outtype": "f16",
  "lumynax_identity_hardcoded": true,
  "lumynax_weight_adaptation_applied": false,
  "model_repo_id": "AbteeXAILab/lumynax-infused-qwen3-text-gguf",
  "model_title": "LumynaX Infused Qwen3 Text GGUF",
  "package_state": "base_weights_hydrated_text_gguf",
  "prompt_format": "huggingface_chat_template",
  "quantization": "Q4_K_M",
  "release_line_id": "lumynax-infused-qwen3",
  "release_wave": "wave1",
  "summary": "This local release hydrates the official upstream `Qwen/Qwen3-8B` checkpoint, then converts the package into a text-first GGUF path for local llama.cpp inference.",
  "text_only_release": true,
  "upstream_base_model_id": "Qwen/Qwen3-8B",
  "upstream_model_id": "Qwen/Qwen3-8B",
  "validation_status": "gguf_pending_validation"
 }
--- a/checksums.sha256
+++ b/checksums.sha256
@@ -0,0 +1,31 @@
 661bf9a59862d01b054c8b3b2c448cd34da3c6118f5725a28887604402d9539b  .gitattributes
 23505d26ba7733c0da978daf30873a315e5340893e555801ad51eafa38097c73  LICENSE.txt
 d4f80d82704f71a60a99de614d89b89d6d1537e66e3840072abae6f174dc1c91  README.md
 2dfede0e6610c473959c963b292fcec325452acba33fd1bba21110e04933df53  VERSION.txt
 9901e81136425f63ed96424f97fd9e173701917b812a6e288cd8b9412b0adaaf  artifacts/release_training_summary.json
 0c7ff1b29daf7a7e149067a2120c079b10af819c8db15a51800d179752298e70  hf_space/README.md
 c90a7db3284d249da76eca2acedd362fcb7db7a90b50652c2cc41d4471da6ce8  hf_space/app.py
 6461da9e460060b06a64fc4874591e7c148eefb9de26d2b1e4c17a4ff52a7f5f  hf_space/requirements.txt
 6e572288198731afa28fd26e4aeabb1c85e6a2e0aafd1bf2c411df3edcdb3ef8  lumynax-infused-qwen3-text-gguf-f16.gguf
 d8c24f495f8da8dc922e4bc6855962de8e2ad885e1610491027b35669b40b62d  lumynax-infused-qwen3-text-gguf-q4_k_m.gguf
 dc07097cc3320f281ea8e935cb5ff6e51fc0fc79a23273574e5ba922ff620c16  merged_model/LUMYNAX_PACKAGE_IDENTITY.txt
 f7c4eadfbbf522470667b797a3c89be2524832d2d599797248dc304fff447c30  merged_model/config.json
 2325da0f15bb848e018c5ae071b7943332e9f871d6b60e2ed22ca97d4cb993d2  merged_model/generation_config.json
 8831e4f1a044471340f7c0a83d7bd71306a5b867e95fd870f74d0c5308a904d5  merged_model/merges.txt
 31d6a825ae35f11fb85b195b4c42c146c051e446433125a215336abdf95cbf5f  merged_model/model-00001-of-00005.safetensors
 5991236cea6fe21f3d43cab0f0e84448734fbbe0789816202989f2ddc9d18282  merged_model/model-00002-of-00005.safetensors
 c5185c4794be2d8a9784d5753c9922db38df478ce11f9ed0b415b7304d896836  merged_model/model-00003-of-00005.safetensors
 b5ee7de71fbf17db3d5704e0c8f2bc7d005ca9e1d7ca2aeb19827b0cfcaa917a  merged_model/model-00004-of-00005.safetensors
 20c2d6366ab85c90786ccdd829cd2b9e7d30ef3b2ebbb998280e7e4014b542ff  merged_model/model-00005-of-00005.safetensors
 f9fdbcb91c23971c13ec5d5f2573d2349e8f61f2f049371ec699281748fdb1bc  merged_model/model.safetensors.index.json
 aeb13307a71acd8fe81861d94ad54ab689df773318809eed3cbe794b4492dae4  merged_model/tokenizer.json
 d5d09f07b48c3086c508b30d1c9114bd1189145b74e982a265350c923acd8101  merged_model/tokenizer_config.json
 ca10d7e9fb3ed18575dd1e277a2579c16d108e32f27439684afa0e10b1440910  merged_model/vocab.json
 585ef6d7643cb741da5253658e95b9950b6c304cf2cf235bb7a326fdaccb2b85  ollama/Modelfile
 65e8ac0168581cf550036345bdffbde5304d1ba65a84db02e8f1fe01c5cc210a  ollama/create_ollama_model.ps1
 5c633adae6ce5e8e9bca83e01db28345f3e8184805bd6e167e3fd20fb6a44295  quickstart.py
 3f5bd6df80c72b129567884f898a2b0b18e06b5d2d2a55d787097fa5edb961c9  release_export_manifest.json
 2a7ca962dd79646b8470b45ec926ade0a4eb01ebdd93452e6990d8997666e378  requirements.txt
 ec0039556efe71ea5fd1934bf52f771801cb1ad02f8abca1d81e21d6948fe1d8  docs/lumynax-release-map.svg
 a308b9f269720036754928871e9e818732283fcd48ab97d8803621e79d4509fc  docs/lumynax-release-overview.svg
 bcfe76f0c0e473552e518e9b908d40e7a87896382564951806811d86ea34c0a3  docs/lumynax-runtime-flow.svg
--- a/docs/lumynax-capability.svg
+++ b/docs/lumynax-capability.svg
@@ -0,0 +1,12 @@
 <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 900 260" role="img" aria-label="LumynaX capability profile">
  <rect width="900" height="260" fill="#fffefa"/>
  <rect x="0" y="0" width="900" height="3" fill="#0a0a0b"/>
  <text x="64" y="36" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" letter-spacing="0.18em" fill="#9a5416">CAPABILITY PROFILE</text>
  <text x="64" y="58" font-family="Georgia, Cambria, serif" font-size="18" font-weight="500" fill="#0a0a0b">Where this model spends its weight.</text>
  <g transform="translate(64,70)"><text x="0" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" letter-spacing="0.12em" fill="#9a5416">QUALITY</text><rect x="160" y="2" width="600" height="16" rx="8" ry="8" fill="#f6f0e8" stroke="rgba(10,10,11,0.12)"/><rect x="160" y="2" width="360" height="16" rx="8" ry="8" fill="#e08a2c"/><text x="772" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" fill="#0a0a0b">3/5</text></g>
  <g transform="translate(64,98)"><text x="0" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" letter-spacing="0.12em" fill="#9a5416">LIGHTWEIGHT</text><rect x="160" y="2" width="600" height="16" rx="8" ry="8" fill="#f6f0e8" stroke="rgba(10,10,11,0.12)"/><rect x="160" y="2" width="0" height="16" rx="8" ry="8" fill="#e08a2c"/><text x="772" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" fill="#0a0a0b">0/5</text></g>
  <g transform="translate(64,126)"><text x="0" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" letter-spacing="0.12em" fill="#9a5416">SOVEREIGNTY</text><rect x="160" y="2" width="600" height="16" rx="8" ry="8" fill="#f6f0e8" stroke="rgba(10,10,11,0.12)"/><rect x="160" y="2" width="360" height="16" rx="8" ry="8" fill="#e08a2c"/><text x="772" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" fill="#0a0a0b">3/5</text></g>
  <g transform="translate(64,154)"><text x="0" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" letter-spacing="0.12em" fill="#9a5416">TOOLS</text><rect x="160" y="2" width="600" height="16" rx="8" ry="8" fill="#f6f0e8" stroke="rgba(10,10,11,0.12)"/><rect x="160" y="2" width="120" height="16" rx="8" ry="8" fill="#e08a2c"/><text x="772" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" fill="#0a0a0b">1/5</text></g>
  <g transform="translate(64,182)"><text x="0" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" letter-spacing="0.12em" fill="#9a5416">JSON MODE</text><rect x="160" y="2" width="600" height="16" rx="8" ry="8" fill="#f6f0e8" stroke="rgba(10,10,11,0.12)"/><rect x="160" y="2" width="600" height="16" rx="8" ry="8" fill="#e08a2c"/><text x="772" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" fill="#0a0a0b">5/5</text></g>
  <g transform="translate(64,210)"><text x="0" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" letter-spacing="0.12em" fill="#9a5416">CONTEXT</text><rect x="160" y="2" width="600" height="16" rx="8" ry="8" fill="#f6f0e8" stroke="rgba(10,10,11,0.12)"/><rect x="160" y="2" width="600" height="16" rx="8" ry="8" fill="#e08a2c"/><text x="772" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" fill="#0a0a0b">5/5</text></g>
 </svg>
--- a/docs/lumynax-overview.svg
+++ b/docs/lumynax-overview.svg
@@ -0,0 +1,23 @@
 <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1280 340" role="img" aria-label="LumynaX Infused Qwen3 Text GGUF release banner">
  <defs>
    <linearGradient id="paperGrad" x1="0" y1="0" x2="0" y2="1">
      <stop offset="0%" stop-color="#fffefa"/>
      <stop offset="100%" stop-color="#f6f0e8"/>
    </linearGradient>
  </defs>
  <rect width="1280" height="340" fill="url(#paperGrad)"/>
  <rect x="860" y="0" width="420" height="4" fill="#e08a2c"/>
  <rect x="0" y="336" width="1280" height="4" fill="#0a0a0b"/>
  <text x="64" y="56" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="13" font-weight="700" letter-spacing="0.22em" fill="#9a5416">ABTEEX AI LABS &#183; AOTEAROA NEW ZEALAND</text>
  <text x="64" y="78" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" letter-spacing="0.16em" fill="#726b62">LUMYNAX RELEASE &#183; CARD V6</text>
  <text x="64" y="170" font-family="Georgia, Cambria, &quot;Times New Roman&quot;, serif" font-size="56" font-weight="500" fill="#0a0a0b">LumynaX Infused Qwen3 Text GGUF</text>
  <line x1="64" y1="196" x2="220" y2="196" stroke="#e08a2c" stroke-width="3"/>
  <text x="64" y="226" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="13" fill="#726b62">AbteeXAILab/lumynax-infused-qwen3-text-gguf</text>
  <g transform="translate(64,262)"><rect width="110" height="34" rx="17" ry="17" fill="#fffefa" stroke="rgba(10,10,11,0.12)"/><text x="14" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="10" font-weight="700" letter-spacing="0.14em" fill="#9a5416">FAMILY</text><text x="63" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="600" fill="#0a0a0b">QWEN</text></g>
  <g transform="translate(184,262)"><rect width="142" height="34" rx="17" ry="17" fill="#fffefa" stroke="rgba(10,10,11,0.12)"/><text x="14" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="10" font-weight="700" letter-spacing="0.14em" fill="#9a5416">RUNTIME</text><text x="70" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="600" fill="#0a0a0b">llama_cpp</text></g>
  <g transform="translate(336,262)"><rect width="110" height="34" rx="17" ry="17" fill="#fffefa" stroke="rgba(10,10,11,0.12)"/><text x="14" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="10" font-weight="700" letter-spacing="0.14em" fill="#9a5416">MODES</text><text x="56" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="600" fill="#0a0a0b">TEXT</text></g>
  <g transform="translate(456,262)"><rect width="149" height="34" rx="17" ry="17" fill="#fffefa" stroke="rgba(10,10,11,0.12)"/><text x="14" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="10" font-weight="700" letter-spacing="0.14em" fill="#9a5416">QUANT</text><text x="56" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="600" fill="#0a0a0b">See manifest</text></g>
  <g transform="translate(615,262)"><rect width="149" height="34" rx="17" ry="17" fill="#fffefa" stroke="rgba(10,10,11,0.12)"/><text x="14" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="10" font-weight="700" letter-spacing="0.14em" fill="#9a5416">LICENSE</text><text x="70" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="600" fill="#0a0a0b">apache-2.0</text></g>
  <text x="1216" y="56" text-anchor="end" font-family="Georgia, Cambria, serif" font-size="18" font-style="italic" fill="#726b62">held in the light</text>
  <text x="1216" y="80" text-anchor="end" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="10" letter-spacing="0.18em" fill="#9a5416">KO TE MARAMA TE TUAPAPA</text>
 </svg>
--- a/docs/lumynax-release-map.svg
+++ b/docs/lumynax-release-map.svg
@@ -0,0 +1,33 @@
 <svg xmlns="http://www.w3.org/2000/svg" width="1180" height="470" viewBox="0 0 1180 470" role="img" aria-labelledby="title desc">
 <title id="title">LumynaX Infused Qwen3 Text GGUF LumynaX release map</title>
 <desc id="desc">Visual release map for AbteeXAILab/lumynax-infused-qwen3-text-gguf: upstream artifact, LumynaX release layer, packaged files, and runtime path.</desc>
 <style>
 .bg{fill:#0b1220}.panel{fill:#111827;stroke:#38bdf8;stroke-width:2}.panel2{fill:#10231d;stroke:#34d399;stroke-width:2}.title{fill:#f8fafc;font:700 28px Arial}.sub{fill:#cbd5e1;font:15px Arial}.label{fill:#93c5fd;font:700 16px Arial}.body{fill:#e2e8f0;font:14px Arial}.chip{fill:#172554;stroke:#60a5fa;stroke-width:1}.chiptext{fill:#dbeafe;font:13px Arial}.arrow{stroke:#94a3b8;stroke-width:3;marker-end:url(#arrowhead)}
 </style>
 <defs><marker id="arrowhead" markerWidth="10" markerHeight="7" refX="9" refY="3.5" orient="auto"><polygon points="0 0, 10 3.5, 0 7" fill="#94a3b8"/></marker></defs>
 <rect class="bg" x="0" y="0" width="1180" height="470" rx="20"/>
 <text class="title" x="45" y="52">LumynaX Infused Qwen3 Text GGUF</text>
 <text class="sub" x="45" y="80">AbteeXAILab/lumynax-infused-qwen3-text-gguf</text>
 <rect class="panel" x="45" y="165" width="245" height="120" rx="14"/>
 <text class="label" x="63" y="197">Upstream</text>
 <text class="body" x="63" y="227">Qwen/Qwen3-8B</text>
 <line class="arrow" x1="298" y1="225.0" x2="317" y2="225.0"/>
 <rect class="panel2" x="325" y="165" width="245" height="120" rx="14"/>
 <text class="label" x="343" y="197">LumynaX layer</text>
 <text class="body" x="343" y="227">Identity, inference chaining, docs</text>
 <line class="arrow" x1="578" y1="225.0" x2="597" y2="225.0"/>
 <rect class="panel" x="605" y="165" width="245" height="120" rx="14"/>
 <text class="label" x="623" y="197">Package</text>
 <text class="body" x="623" y="227">lumynax-infused-qwen3-text-gguf-f16.gguf</text>
 <line class="arrow" x1="858" y1="225.0" x2="877" y2="225.0"/>
 <rect class="panel" x="885" y="165" width="245" height="120" rx="14"/>
 <text class="label" x="903" y="197">Runtime</text>
 <text class="body" x="903" y="227">llama_cpp</text>
 <rect class="chip" x="45" y="335" width="190" height="38" rx="19"/>
 <text class="chiptext" x="61" y="359">modalities: text</text>
 <rect class="chip" x="253" y="335" width="220" height="38" rx="19"/>
 <text class="chiptext" x="269" y="359">license: see LICENSE.txt</text>
 <rect class="chip" x="491" y="335" width="332" height="38" rx="19"/>
 <text class="chiptext" x="507" y="359">state: base_weights_hydrated_text_gguf</text>
 <text class="sub" x="45" y="425">Download the full repo: README, runtime files, manifest, checksums, and model artifacts stay together.</text>
 </svg>
--- a/docs/lumynax-release-overview.svg
+++ b/docs/lumynax-release-overview.svg
@@ -0,0 +1,51 @@
 <svg xmlns="http://www.w3.org/2000/svg" width="1280" height="560" viewBox="0 0 1280 560" role="img" aria-labelledby="title desc">
 <title id="title">LumynaX Infused Qwen3 Text GGUF professional LumynaX release overview</title>
 <desc id="desc">Professional release overview for AbteeXAILab/lumynax-infused-qwen3-text-gguf showing provenance, LumynaX infusion layer, package artifact, and runtime path.</desc>
 <defs>
 <filter id="shadow" x="-12%" y="-12%" width="124%" height="124%"><feDropShadow dx="0" dy="10" stdDeviation="12" flood-color="#0a0a0b" flood-opacity="0.08"/></filter>
 <marker id="arrowhead" markerWidth="12" markerHeight="8" refX="10" refY="4" orient="auto"><polygon points="0 0, 12 4, 0 8" fill="#e08a2c"/></marker>
 </defs>
 <style>
 .title{fill:#0a0a0b;font:500 38px Georgia,serif}.sub{fill:#726b62;font:16px Aptos,Segoe UI,Arial}.eyebrow{fill:#9a5416;font:700 13px ui-monospace,Consolas,monospace;letter-spacing:2px}.rule{stroke:#0a0a0b;stroke-opacity:.12}.accent{stroke:#e08a2c;stroke-width:4}.card{fill:#ffffff;stroke:#0a0a0b;stroke-opacity:.12;stroke-width:1.2;filter:url(#shadow)}.num{fill:#ffffff;font:700 17px Aptos,Segoe UI,Arial}.numBg{fill:#0a0a0b}.label{fill:#0a0a0b;font:700 18px Aptos,Segoe UI,Arial}.body{fill:#5f574e;font:14px Aptos,Segoe UI,Arial}.chip{fill:#f6f0e8;stroke:#0a0a0b;stroke-opacity:.12}.chiptext{fill:#0a0a0b;font:700 12px ui-monospace,Consolas,monospace;letter-spacing:.7px}.line{stroke:#e08a2c;stroke-width:3;marker-end:url(#arrowhead)}
 </style>
 <rect width="1280" height="560" rx="28" fill="#fffefa"/>
 <line class="accent" x1="852" y1="34" x2="1228" y2="34"/>
 <line class="rule" x1="52" y1="160" x2="1228" y2="160"/>
 <text class="eyebrow" x="52" y="58">ABTEEX AI LABS - LUMYNAX MODEL INFUSION RELEASE</text>
 <text class="title" x="52" y="102">LumynaX Infused Qwen3 Text GGUF</text>
 <text class="sub" x="52" y="134">AbteeXAILab/lumynax-infused-qwen3-text-gguf</text>
 <rect class="card" x="52" y="205" width="270" height="155" rx="18"/>
 <circle class="numBg" cx="86" cy="241" r="18"/>
 <text class="num" x="80" y="247">1</text>
 <text class="label" x="114" y="247">Upstream</text>
 <text class="body" x="78" y="287">Qwen/Qwen3-8B</text>
 <line class="line" x1="331" y1="282.5" x2="344" y2="282.5"/>
 <rect class="card" x="356" y="205" width="270" height="155" rx="18"/>
 <circle class="numBg" cx="390" cy="241" r="18"/>
 <text class="num" x="384" y="247">2</text>
 <text class="label" x="418" y="247">LumynaX Infusion</text>
 <text class="body" x="382" y="287">Identity, runtime scaffold,</text>
 <text class="body" x="382" y="308">provenance, checksums</text>
 <line class="line" x1="635" y1="282.5" x2="648" y2="282.5"/>
 <rect class="card" x="660" y="205" width="270" height="155" rx="18"/>
 <circle class="numBg" cx="694" cy="241" r="18"/>
 <text class="num" x="688" y="247">3</text>
 <text class="label" x="722" y="247">Release Package</text>
 <text class="body" x="686" y="287">lumynax-infused-qwen3-text-gguf-f16.gguf</text>
 <line class="line" x1="939" y1="282.5" x2="952" y2="282.5"/>
 <rect class="card" x="964" y="205" width="270" height="155" rx="18"/>
 <circle class="numBg" cx="998" cy="241" r="18"/>
 <text class="num" x="992" y="247">4</text>
 <text class="label" x="1026" y="247">Runtime</text>
 <text class="body" x="990" y="287">llama_cpp</text>
 <rect class="chip" x="52" y="420" width="190" height="40" rx="20"/>
 <text class="chiptext" x="69" y="445">modalities: text</text>
 <rect class="chip" x="258" y="420" width="190" height="40" rx="20"/>
 <text class="chiptext" x="275" y="445">license: apache-2.0</text>
 <rect class="chip" x="464" y="420" width="338" height="40" rx="20"/>
 <text class="chiptext" x="481" y="445">state: base_weights_hydrated_text_gguf</text>
 <rect class="chip" x="818" y="420" width="190" height="40" rx="20"/>
 <text class="chiptext" x="835" y="445">audit: pass</text>
 <line class="rule" x1="52" y1="496" x2="1228" y2="496"/>
 <text class="sub" x="52" y="526">Download the complete repo so README, manifest, checksums, runtime files, and weights stay together.</text>
 </svg>
--- a/docs/lumynax-runtime-flow.svg
+++ b/docs/lumynax-runtime-flow.svg
@@ -0,0 +1,35 @@
 <svg xmlns="http://www.w3.org/2000/svg" width="1280" height="430" viewBox="0 0 1280 430" role="img" aria-labelledby="title desc">
 <title id="title">LumynaX Infused Qwen3 Text GGUF runtime flow</title>
 <desc id="desc">Runtime flow for AbteeXAILab/lumynax-infused-qwen3-text-gguf from download to verification, quickstart, and serving.</desc>
 <defs>
 <filter id="shadow" x="-12%" y="-12%" width="124%" height="124%"><feDropShadow dx="0" dy="10" stdDeviation="12" flood-color="#0a0a0b" flood-opacity="0.08"/></filter>
 <marker id="arrowhead" markerWidth="12" markerHeight="8" refX="10" refY="4" orient="auto"><polygon points="0 0, 12 4, 0 8" fill="#e08a2c"/></marker>
 </defs>
 <style>
 .title{fill:#0a0a0b;font:500 34px Georgia,serif}.sub{fill:#726b62;font:16px Aptos,Segoe UI,Arial}.eyebrow{fill:#9a5416;font:700 13px ui-monospace,Consolas,monospace;letter-spacing:2px}.rule{stroke:#0a0a0b;stroke-opacity:.12}.accent{stroke:#e08a2c;stroke-width:4}.box{fill:#ffffff;stroke:#0a0a0b;stroke-opacity:.12;stroke-width:1.2;filter:url(#shadow)}.label{fill:#0a0a0b;font:700 19px Aptos,Segoe UI,Arial}.body{fill:#5f574e;font:14px Aptos,Segoe UI,Arial}.line{stroke:#e08a2c;stroke-width:3;marker-end:url(#arrowhead)}.artifact{fill:#9a5416;font:700 14px ui-monospace,Consolas,monospace}
 </style>
 <rect width="1280" height="430" rx="24" fill="#fffefa"/>
 <line class="accent" x1="856" y1="34" x2="1230" y2="34"/>
 <text class="eyebrow" x="50" y="48">LOCAL-FIRST RUNTIME FLOW</text>
 <text class="title" x="50" y="88">LumynaX Infused Qwen3 Text GGUF Runtime Path</text>
 <text class="sub" x="50" y="118">Primary artifact: lumynax-infused-qwen3-text-gguf-f16.gguf</text>
 <line class="rule" x1="50" y1="138" x2="1230" y2="138"/>
 <rect class="box" x="64" y="178" width="245" height="118" rx="16"/>
 <text class="label" x="86" y="220">Download</text>
 <text class="body" x="86" y="251">hf download</text>
 <text class="body" x="86" y="271">AbteeXAILab/lumynax-infused-qwen3-text-gguf</text>
 <line class="line" x1="324" y1="237.0" x2="349" y2="237.0"/>
 <rect class="box" x="367" y="178" width="245" height="118" rx="16"/>
 <text class="label" x="389" y="220">Verify</text>
 <text class="body" x="389" y="251">checksums.sha256</text>
 <line class="line" x1="627" y1="237.0" x2="652" y2="237.0"/>
 <rect class="box" x="670" y="178" width="245" height="118" rx="16"/>
 <text class="label" x="692" y="220">Run</text>
 <text class="body" x="692" y="251">quickstart.py</text>
 <line class="line" x1="930" y1="237.0" x2="955" y2="237.0"/>
 <rect class="box" x="973" y="178" width="245" height="118" rx="16"/>
 <text class="label" x="995" y="220">Serve</text>
 <text class="body" x="995" y="251">llama.cpp / Ollama</text>
 <line class="rule" x1="50" y1="340" x2="1230" y2="340"/>
 <text class="artifact" x="50" y="375">Recommended first test: ask "Who are you?" and confirm the package answers with LumynaX identity plus honest provenance.</text>
 </svg>
--- a/hf_space/README.md
+++ b/hf_space/README.md
@@ -0,0 +1,35 @@
 ---
 title: LumynaX Infused Qwen3 Text GGUF Demo
 colorFrom: green
 colorTo: blue
 sdk: gradio
 app_file: app.py
 pinned: false
 short_description: Private LumynaX Qwen3 Text GGUF demo.
 ---
 # LumynaX Infused Qwen3 Text GGUF Demo
 Private demo for the `lumynax-infused-qwen3-text-gguf` release line.
 ## Supported Demo Modes
 - text with reasoning toggle
 - image understanding from upload or URL
 - audio understanding / transcription from upload or URL
 ## Private Deployment Notes
 - this Space is intended to stay private for now
 - the backing model repo should be `AbteeXAILab/lumynax-infused-qwen3-text-gguf`
 - if that model repo is private, set an `HF_TOKEN` Space secret with read access
 - on CPU-only Hugging Face hardware this Space automatically falls back to showcase mode instead of live inference
 - if GPU hardware is later attached, the same Space switches back to live multimodal inference
 - the package chat template already hardcodes the LumynaX identity inside `merged_model/chat_template.jinja`
 - live inference for this package still requires GPU-backed Space hardware; `cpu-basic` is not sufficient
 ## Important Provenance
 This demo is branded as `LumynaX Infused Qwen3 Text GGUF`, but it serves the official upstream
 `Qwen/Qwen3-8B` base weights (see `merged_model/config.json`) packaged under the LumynaX release identity.
 It does not claim a private LumynaX fine-tune of the checkpoint.
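
The deployment notes above (token secret lookup and CPU fallback) can be sketched roughly as follows. This is a minimal illustration, not the Space's actual code: `resolve_hf_token` mirrors the environment-variable order used in `app.py`, while `choose_mode` is a hypothetical helper that takes the GPU check as a plain boolean so the sketch runs without `torch` installed.

```python
from typing import Mapping, Optional

# Environment variables checked in order (same names as HF_TOKEN_ENV_VARS in app.py).
TOKEN_ENV_VARS = ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN", "HUGGINGFACE_HUB_TOKEN")


def resolve_hf_token(env: Mapping[str, str]) -> Optional[str]:
    """Return the first non-blank token found in `env`, or None if none is set."""
    for name in TOKEN_ENV_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    return None


def choose_mode(gpu_available: bool) -> str:
    """Hypothetical helper: 'live' when a GPU is present, otherwise 'showcase'."""
    return "live" if gpu_available else "showcase"


if __name__ == "__main__":
    import os

    # In the real Space the boolean would come from torch.cuda.is_available().
    print("token set:", resolve_hf_token(os.environ) is not None)
    print("mode on CPU-only hardware:", choose_mode(False))
```

Passing the environment as a mapping keeps the lookup easy to test; in the Space itself the token would be handed to `snapshot_download(..., token=...)` only when the model repo is private.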
--- a/hf_space/app.py
+++ b/hf_space/app.py
@@ -0,0 +1,395 @@
 from __future__ import annotations
 import json
 import os
 from pathlib import Path
 from threading import Lock
 import gradio as gr
 import torch
 from huggingface_hub import snapshot_download
 from transformers import AutoModelForMultimodalLM, AutoProcessor
 MODEL_TITLE = "LumynaX Infused Qwen3 Text GGUF"
 DEFAULT_MODEL_REPO_ID = "AbteeXAILab/lumynax-infused-qwen3-text-gguf"
 MODEL_REPO_ENV_VAR = "LUMYNAX_MODEL_REPO_ID"
 HF_TOKEN_ENV_VARS = ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN", "HUGGINGFACE_HUB_TOKEN")
 DEFAULT_IMAGE_URL = "https://raw.githubusercontent.com/google-gemma/cookbook/refs/heads/main/Demos/sample-data/GoldenGate.png"
 DEFAULT_AUDIO_URL = "https://raw.githubusercontent.com/google-gemma/cookbook/refs/heads/main/Demos/sample-data/journal1.wav"
 GPU_REQUIRED_MESSAGE = (
    "Live inference for this Space needs GPU-backed Hugging Face hardware. "
    "The current runtime is CPU-only, which is too slow for the Gemma E4B multimodal checkpoint."
 )
 SHOWCASE_MESSAGE = (
    "This Space is running in showcase mode on CPU hardware. "
    "The examples below were captured during package validation so people can still see how the model behaves. "
    "If GPU hardware is attached later, this same Space will switch back to live inference automatically."
 )
 SHOWCASE_SAMPLES = {
    "text": {
        "prompt": "Who are you? Reply in one short sentence.",
        "response": "I am LumynaX, operating from the LumynaX Infused Gemma E4B Model package.",
        "parsed_output": {
            "role": "assistant",
            "content": "I am LumynaX, operating from the LumynaX Infused Gemma E4B Model package.",
        },
    },
    "image": {
        "prompt": "What is shown in this image? Reply in under 12 words.",
        "response": "The iconic Golden Gate Bridge spans the water under a clear sky. I am LumynaX.",
        "parsed_output": {
            "role": "assistant",
            "content": "The iconic Golden Gate Bridge spans the water under a clear sky. I am LumynaX.",
        },
    },
    "audio": {
        "prompt": "Transcribe the speech in one line only.",
        "response": 'A local validation run transcribed the bundled sample audio and included: "My name is LumynaX."',
        "parsed_output": {
            "validation_summary": 'A local validation run transcribed the bundled sample audio and included: "My name is LumynaX."',
        },
    },
    "reasoning": {
        "prompt": "Explain what this package is in one short sentence.",
        "response": "Reasoning mode was verified locally and returned a non-empty structured thinking field.",
        "parsed_output": {
            "validation_summary": "Reasoning mode was verified locally and returned a non-empty structured thinking field.",
        },
    },
 }
 _MODEL = None
 _PROCESSOR = None
 _LOAD_ERROR = None
 _LOAD_LOCK = Lock()
 def _resolve_hf_token() -> str | None:
    for env_var in HF_TOKEN_ENV_VARS:
        raw_value = os.environ.get(env_var, "").strip()
        if raw_value:
            return raw_value
    return None
 def _has_supported_gpu_runtime() -> bool:
    return bool(torch.cuda.is_available())
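 def _load_runtime() -> tuple[object, object]:
    # Lazily load the model and processor once. Double-checked locking keeps
    # concurrent requests from loading twice; a failed load is cached and re-raised.
    global _MODEL, _PROCESSOR, _LOAD_ERROR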
    if _MODEL is not None and _PROCESSOR is not None:
        return _MODEL, _PROCESSOR
    if _LOAD_ERROR is not None:
        raise RuntimeError(_LOAD_ERROR)
    with _LOAD_LOCK:
        if _MODEL is not None and _PROCESSOR is not None:
            return _MODEL, _PROCESSOR
        if _LOAD_ERROR is not None:
            raise RuntimeError(_LOAD_ERROR)
        try:
            if not _has_supported_gpu_runtime():
                raise RuntimeError(GPU_REQUIRED_MESSAGE)
            repo_id = os.environ.get(MODEL_REPO_ENV_VAR, "").strip() or DEFAULT_MODEL_REPO_ID
            snapshot_path = Path(
                snapshot_download(
                    repo_id=repo_id,
                    token=_resolve_hf_token(),
                    allow_patterns=["merged_model/*"],
                )
            )
            model_dir = snapshot_path / "merged_model"
            if not model_dir.exists():
                raise FileNotFoundError(f"Expected merged_model/ in {snapshot_path} after downloading {repo_id}.")
            processor = AutoProcessor.from_pretrained(str(model_dir))
            model = AutoModelForMultimodalLM.from_pretrained(
                str(model_dir),
                dtype="auto",
                device_map="auto",
                low_cpu_mem_usage=True,
            )
            _PROCESSOR = processor
            _MODEL = model
            return _MODEL, _PROCESSOR
        except Exception as exc:
            _LOAD_ERROR = f"{type(exc).__name__}: {exc}"
            raise
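 def _resolve_media_reference(upload_value: str | None, url_value: str | None) -> str | None:
    # Prefer an explicit upload: the URL textboxes are pre-filled with sample
    # assets, so checking the URL first would always shadow an uploaded file.
    if isinstance(upload_value, str) and upload_value.strip():
        return upload_value.strip()
    if isinstance(url_value, str) and url_value.strip():
        return url_value.strip()
    return None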
 def _extract_response_text(parsed: object) -> str:
    if isinstance(parsed, dict):
        content = parsed.get("content")
        if isinstance(content, str) and content.strip():
            return content.strip()
    if isinstance(parsed, str):
        return parsed.strip()
    return json.dumps(parsed, indent=2, ensure_ascii=False, default=str)
 def _format_json(value: object) -> str:
    return json.dumps(value, indent=2, ensure_ascii=False, default=str)
 def run_request(
    *,
    prompt: str,
    thinking: bool,
    max_new_tokens: int,
    image_upload: str | None = None,
    image_url: str = "",
    audio_upload: str | None = None,
    audio_url: str = "",
 ) -> tuple[str, str]:
    if not prompt.strip():
        raise gr.Error("A prompt is required.")
    if not _has_supported_gpu_runtime():
        return GPU_REQUIRED_MESSAGE, _format_json({"error": GPU_REQUIRED_MESSAGE})
    image_ref = _resolve_media_reference(image_upload, image_url)
    audio_ref = _resolve_media_reference(audio_upload, audio_url)
    content: list[dict[str, str]] = []
    if image_ref:
        content.append({"type": "image", "url": image_ref})
    if audio_ref:
        content.append({"type": "audio", "audio": audio_ref})
    content.append({"type": "text", "text": prompt.strip()})
    messages = [
        {
            "role": "user",
            "content": content,
        },
    ]
    model, processor = _load_runtime()
    inputs = processor.apply_chat_template(
        messages,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
        add_generation_prompt=True,
        enable_thinking=thinking,
    ).to(model.device)
    input_len = inputs["input_ids"].shape[-1]
    with torch.inference_mode():
        outputs = model.generate(
            **inputs,
            max_new_tokens=int(max_new_tokens),
            do_sample=False,
        )
    response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
    parsed = processor.parse_response(response) if hasattr(processor, "parse_response") else response
    return _extract_response_text(parsed), _format_json(parsed)
 def run_text(prompt: str, thinking: bool, max_new_tokens: int) -> tuple[str, str]:
    return run_request(
        prompt=prompt,
        thinking=thinking,
        max_new_tokens=max_new_tokens,
    )
 def run_image(
    prompt: str,
    image_upload: str | None,
    image_url: str,
    thinking: bool,
    max_new_tokens: int,
 ) -> tuple[str, str]:
    return run_request(
        prompt=prompt,
        thinking=thinking,
        max_new_tokens=max_new_tokens,
        image_upload=image_upload,
        image_url=image_url,
    )
 def run_audio(
    prompt: str,
    audio_upload: str | None,
    audio_url: str,
    thinking: bool,
    max_new_tokens: int,
 ) -> tuple[str, str]:
    return run_request(
        prompt=prompt,
        thinking=thinking,
        max_new_tokens=max_new_tokens,
        audio_upload=audio_upload,
        audio_url=audio_url,
    )
 def _render_showcase_sample(
    *,
    prompt: str,
    response: str,
    parsed_output: object,
    media_markdown: str | None = None,
    media_url: str | None = None,
 ) -> None:
    if media_markdown:
        gr.Markdown(media_markdown)
    if media_url:
        gr.Textbox(label="Sample Asset URL", value=media_url, interactive=False, lines=1)
    gr.Textbox(label="Example Prompt", value=prompt, interactive=False, lines=3)
    gr.Textbox(label="Example Response", value=response, interactive=False, lines=6)
    gr.Code(label="Example Parsed Output", value=_format_json(parsed_output), language="json")
 def _build_live_ui() -> None:
    gr.Markdown(
        f"# {MODEL_TITLE}\n\n"
        "Live multimodal demo mode is active because GPU hardware is available. "
        "The LumynaX identity comes from the packaged model template and is not user-editable here."
    )
    with gr.Tab("Text"):
        text_prompt = gr.Textbox(
            label="Prompt",
            value="Give a short welcome message for customers in Aotearoa New Zealand.",
            lines=4,
        )
        with gr.Row():
            text_thinking = gr.Checkbox(label="Enable Reasoning", value=False)
            text_max_tokens = gr.Slider(label="Max New Tokens", minimum=16, maximum=256, value=64, step=16)
        text_run = gr.Button("Run Text Demo", variant="primary")
        text_answer = gr.Textbox(label="Response", lines=8)
        text_debug = gr.Code(label="Parsed Output", language="json")
        text_run.click(
            run_text,
            inputs=[text_prompt, text_thinking, text_max_tokens],
            outputs=[text_answer, text_debug],
        )
    with gr.Tab("Image"):
        image_prompt = gr.Textbox(
            label="Prompt",
            value="What is shown in this image? Reply in under 12 words.",
            lines=3,
        )
        image_upload = gr.Image(label="Upload Image", type="filepath")
        image_url = gr.Textbox(label="Or Image URL", value=DEFAULT_IMAGE_URL)
        with gr.Row():
            image_thinking = gr.Checkbox(label="Enable Reasoning", value=False)
            image_max_tokens = gr.Slider(label="Max New Tokens", minimum=16, maximum=256, value=64, step=16)
        image_run = gr.Button("Run Image Demo", variant="primary")
        image_answer = gr.Textbox(label="Response", lines=8)
        image_debug = gr.Code(label="Parsed Output", language="json")
        image_run.click(
            run_image,
            inputs=[image_prompt, image_upload, image_url, image_thinking, image_max_tokens],
            outputs=[image_answer, image_debug],
        )
    with gr.Tab("Audio"):
        audio_prompt = gr.Textbox(
            label="Prompt",
            value="Transcribe the speech in one line only.",
            lines=3,
        )
        audio_upload = gr.Audio(label="Upload Audio", type="filepath")
        audio_url = gr.Textbox(label="Or Audio URL", value=DEFAULT_AUDIO_URL)
        with gr.Row():
            audio_thinking = gr.Checkbox(label="Enable Reasoning", value=False)
            audio_max_tokens = gr.Slider(label="Max New Tokens", minimum=16, maximum=256, value=64, step=16)
        audio_run = gr.Button("Run Audio Demo", variant="primary")
        audio_answer = gr.Textbox(label="Response", lines=8)
        audio_debug = gr.Code(label="Parsed Output", language="json")
        audio_run.click(
            run_audio,
            inputs=[audio_prompt, audio_upload, audio_url, audio_thinking, audio_max_tokens],
            outputs=[audio_answer, audio_debug],
        )
 def _build_showcase_ui() -> None:
    gr.Markdown(
        f"# {MODEL_TITLE}\n\n"
        f"{SHOWCASE_MESSAGE}\n\n"
        "This is still the real package identity and real package structure, but not live inference on this CPU-only Space."
    )
    with gr.Tab("Overview"):
        gr.Markdown(
            "### What this Space is showing\n"
            "- verified text, image, audio, and reasoning examples from package validation\n"
            "- the real packaged Gemma E4B release structure and LumynaX identity behavior\n"
            "- honest provenance: packaged upstream Gemma weights under a LumynaX runtime identity\n\n"
            "### Why this is showcase mode\n"
            "- Hugging Face `cpu-basic` cannot serve this checkpoint interactively\n"
            "- the same Space will switch to live inference automatically if GPU hardware is added later"
        )
    with gr.Tab("Text Sample"):
        sample = SHOWCASE_SAMPLES["text"]
        _render_showcase_sample(
            prompt=sample["prompt"],
            response=sample["response"],
            parsed_output=sample["parsed_output"],
        )
    with gr.Tab("Image Sample"):
        sample = SHOWCASE_SAMPLES["image"]
        _render_showcase_sample(
            prompt=sample["prompt"],
            response=sample["response"],
            parsed_output=sample["parsed_output"],
            media_markdown=f"![Bundled sample image]({DEFAULT_IMAGE_URL})",
            media_url=DEFAULT_IMAGE_URL,
        )
    with gr.Tab("Audio Sample"):
        sample = SHOWCASE_SAMPLES["audio"]
        _render_showcase_sample(
            prompt=sample["prompt"],
            response=sample["response"],
            parsed_output=sample["parsed_output"],
            media_url=DEFAULT_AUDIO_URL,
        )
    with gr.Tab("Reasoning Note"):
        sample = SHOWCASE_SAMPLES["reasoning"]
        _render_showcase_sample(
            prompt=sample["prompt"],
            response=sample["response"],
            parsed_output=sample["parsed_output"],
        )
    with gr.Tab("Run It"):
        gr.Markdown(
            "### Local or GPU-backed run\n"
            "Use the packaged files directly for a real interactive run, or attach GPU hardware to this Space."
        )
        gr.Textbox(
            label="Quickstart",
            interactive=False,
            lines=4,
            value=(
                "pip install -r requirements.txt\n"
                "python quickstart.py\n"
                "python quickstart.py --mode image --image path-or-url\n"
                "python quickstart.py --mode audio --audio path-or-url"
            ),
        )
 with gr.Blocks() as demo:
    if _has_supported_gpu_runtime():
        _build_live_ui()
    else:
        _build_showcase_ui()
 if __name__ == "__main__":
    demo.queue().launch(show_error=True)
--- a/hf_space/requirements.txt
+++ b/hf_space/requirements.txt
@@ -0,0 +1,10 @@
 accelerate>=1.13
 gradio>=5.0
 huggingface-hub>=1.8
 librosa>=0.11
 numba>=0.65
 pillow>=10.0
 safetensors>=0.6
 torch>=2.9
 torchvision>=0.24
 transformers>=5.5.3
--- a/lumynax-infused-qwen3-text-gguf-f16.gguf
+++ b/lumynax-infused-qwen3-text-gguf-f16.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:6e572288198731afa28fd26e4aeabb1c85e6a2e0aafd1bf2c411df3edcdb3ef8
 size 16388043648
--- a/lumynax-infused-qwen3-text-gguf-q4_k_m.gguf
+++ b/lumynax-infused-qwen3-text-gguf-q4_k_m.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:d8c24f495f8da8dc922e4bc6855962de8e2ad885e1610491027b35669b40b62d
 size 5027783552
--- a/merged_model/LUMYNAX_PACKAGE_IDENTITY.txt
+++ b/merged_model/LUMYNAX_PACKAGE_IDENTITY.txt
@@ -0,0 +1 @@
 You are LumynaX operating from the LumynaX Infused Qwen3 Text GGUF package identity. This package wraps the official Qwen/Qwen3-8B checkpoint inside a LumynaX-branded multimodal and reasoning runtime. Always identify yourself as LumynaX when asked who you are. Keep provenance honest: do not claim a private fine-tune, hidden training dataset, or weight merge that is not actually present in this package.
--- a/merged_model/config.json
+++ b/merged_model/config.json
@@ -0,0 +1,30 @@
 {
  "architectures": [
    "Qwen3ForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 12288,
  "max_position_embeddings": 40960,
  "max_window_layers": 36,
  "model_type": "qwen3",
  "num_attention_heads": 32,
  "num_hidden_layers": 36,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 1000000,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.51.0",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151936
 }
--- a/merged_model/generation_config.json
+++ b/merged_model/generation_config.json
@@ -0,0 +1,13 @@
 {
    "bos_token_id": 151643,
    "do_sample": true,
    "eos_token_id": [
        151645,
        151643
    ],
    "pad_token_id": 151643,
    "temperature": 0.6,
    "top_k": 20,
    "top_p": 0.95,
    "transformers_version": "4.51.0"
 }
--- a/merged_model/merges.txt
+++ b/merged_model/merges.txt
--- a/merged_model/model-00001-of-00005.safetensors
+++ b/merged_model/model-00001-of-00005.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:31d6a825ae35f11fb85b195b4c42c146c051e446433125a215336abdf95cbf5f
 size 3996250744
--- a/merged_model/model-00002-of-00005.safetensors
+++ b/merged_model/model-00002-of-00005.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:5991236cea6fe21f3d43cab0f0e84448734fbbe0789816202989f2ddc9d18282
 size 3993160032
--- a/merged_model/model-00003-of-00005.safetensors
+++ b/merged_model/model-00003-of-00005.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:c5185c4794be2d8a9784d5753c9922db38df478ce11f9ed0b415b7304d896836
 size 3959604768
--- a/merged_model/model-00004-of-00005.safetensors
+++ b/merged_model/model-00004-of-00005.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:b5ee7de71fbf17db3d5704e0c8f2bc7d005ca9e1d7ca2aeb19827b0cfcaa917a
 size 3187841392
--- a/merged_model/model-00005-of-00005.safetensors
+++ b/merged_model/model-00005-of-00005.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:20c2d6366ab85c90786ccdd829cd2b9e7d30ef3b2ebbb998280e7e4014b542ff
 size 1244659840
--- a/merged_model/model.safetensors.index.json
+++ b/merged_model/model.safetensors.index.json
@@ -0,0 +1,406 @@
 {
  "metadata": {
    "total_size": 16381470720
  },
  "weight_map": {
    "lm_head.weight": "model-00005-of-00005.safetensors",
    "model.embed_tokens.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.self_attn.k_norm.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.self_attn.q_norm.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.input_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.self_attn.k_norm.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.self_attn.q_norm.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.10.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.10.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.10.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.10.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.10.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.10.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.10.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.10.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.10.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.10.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.10.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.11.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.11.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.11.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.11.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.11.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.11.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.11.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.11.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.11.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.11.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.11.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.13.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.14.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.15.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.16.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.17.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.17.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.17.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.17.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.17.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.17.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.17.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.17.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.17.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.17.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.17.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.18.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.18.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.18.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.18.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.18.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.18.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.18.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.18.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.18.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.18.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.18.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.19.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.19.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.19.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.19.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.19.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.19.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.19.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.19.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.19.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.19.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.19.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.2.input_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.2.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.2.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.2.self_attn.k_norm.weight": "model-00001-of-00005.safetensors",
    "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.2.self_attn.q_norm.weight": "model-00001-of-00005.safetensors",
    "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.20.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.20.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.20.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.20.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.20.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.20.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.20.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.20.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.20.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.20.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.20.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.21.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.22.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.23.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.24.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.25.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.26.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.27.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.27.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.27.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.27.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.27.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.27.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.27.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.27.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.27.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
    "model.layers.27.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.27.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
    "model.layers.28.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.28.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.28.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.28.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.28.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.28.self_attn.k_norm.weight": "model-00004-of-00005.safetensors",
    "model.layers.28.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.28.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.28.self_attn.q_norm.weight": "model-00004-of-00005.safetensors",
    "model.layers.28.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.28.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.self_attn.k_norm.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.self_attn.q_norm.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.29.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.3.input_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.3.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.3.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.3.self_attn.k_norm.weight": "model-00001-of-00005.safetensors",
    "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.3.self_attn.q_norm.weight": "model-00001-of-00005.safetensors",
    "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.30.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.self_attn.k_norm.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.self_attn.q_norm.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.30.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.self_attn.k_norm.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.self_attn.q_norm.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.31.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.self_attn.k_norm.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.self_attn.q_norm.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.32.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.self_attn.k_norm.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.self_attn.q_norm.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.33.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.self_attn.k_norm.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.self_attn.q_norm.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.34.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.input_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.self_attn.k_norm.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.self_attn.q_norm.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.35.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
    "model.layers.4.input_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.4.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.4.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.4.self_attn.k_norm.weight": "model-00001-of-00005.safetensors",
    "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.4.self_attn.q_norm.weight": "model-00001-of-00005.safetensors",
    "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.5.input_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.5.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.5.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.5.self_attn.k_norm.weight": "model-00001-of-00005.safetensors",
    "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.5.self_attn.q_norm.weight": "model-00001-of-00005.safetensors",
    "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.6.input_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.6.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.6.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
    "model.layers.6.self_attn.k_norm.weight": "model-00001-of-00005.safetensors",
    "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.6.self_attn.q_norm.weight": "model-00001-of-00005.safetensors",
    "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.7.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.7.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.7.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.7.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.7.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.7.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.7.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.7.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
    "model.layers.8.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.8.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.8.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.8.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.8.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.8.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.8.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.8.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.8.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.8.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.8.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.9.input_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.9.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.9.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.9.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.9.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
    "model.layers.9.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.9.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.9.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.9.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
    "model.layers.9.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.9.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
    "model.norm.weight": "model-00004-of-00005.safetensors"
  }
 }
--- a/merged_model/tokenizer.json
+++ b/merged_model/tokenizer.json
--- a/merged_model/tokenizer_config.json
+++ b/merged_model/tokenizer_config.json
@@ -0,0 +1,239 @@
 {
  "add_bos_token": false,
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "151643": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151644": {
      "content": "<|im_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151645": {
      "content": "<|im_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151646": {
      "content": "<|object_ref_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151647": {
      "content": "<|object_ref_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151648": {
      "content": "<|box_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151649": {
      "content": "<|box_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151650": {
      "content": "<|quad_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151651": {
      "content": "<|quad_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151652": {
      "content": "<|vision_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151653": {
      "content": "<|vision_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151654": {
      "content": "<|vision_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151655": {
      "content": "<|image_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151656": {
      "content": "<|video_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151657": {
      "content": "<tool_call>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151658": {
      "content": "</tool_call>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151659": {
      "content": "<|fim_prefix|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151660": {
      "content": "<|fim_middle|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151661": {
      "content": "<|fim_suffix|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151662": {
      "content": "<|fim_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151663": {
      "content": "<|repo_name|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151664": {
      "content": "<|file_sep|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151665": {
      "content": "<tool_response>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151666": {
      "content": "</tool_response>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151667": {
      "content": "<think>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151668": {
      "content": "</think>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    }
  },
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>",
    "<|object_ref_start|>",
    "<|object_ref_end|>",
    "<|box_start|>",
    "<|box_end|>",
    "<|quad_start|>",
    "<|quad_end|>",
    "<|vision_start|>",
    "<|vision_end|>",
    "<|vision_pad|>",
    "<|image_pad|>",
    "<|video_pad|>"
  ],
  "bos_token": null,
  "chat_template": "{%- if tools %}\n    {{- '<|im_start|>system\\n' }}\n    {%- if messages[0].role == 'system' %}\n        {{- messages[0].content + '\\n\\n' }}\n    {%- endif %}\n    {{- \"# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n    {%- for tool in tools %}\n        {{- \"\\n\" }}\n        {{- tool | tojson }}\n    {%- endfor %}\n    {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n    {%- if messages[0].role == 'system' %}\n        {{- '<|im_start|>system\\n' + messages[0].content + '<|im_end|>\\n' }}\n    {%- endif %}\n{%- endif %}\n{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}\n{%- for message in messages[::-1] %}\n    {%- set index = (messages|length - 1) - loop.index0 %}\n    {%- if ns.multi_step_tool and message.role == \"user\" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}\n        {%- set ns.multi_step_tool = false %}\n        {%- set ns.last_query_index = index %}\n    {%- endif %}\n{%- endfor %}\n{%- for message in messages %}\n    {%- if message.content is string %}\n        {%- set content = message.content %}\n    {%- else %}\n        {%- set content = '' %}\n    {%- endif %}\n    {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) %}\n        {{- '<|im_start|>' + message.role + '\\n' + content + '<|im_end|>' + '\\n' }}\n    {%- elif message.role == \"assistant\" %}\n        {%- set reasoning_content = '' %}\n        {%- if message.reasoning_content is string %}\n            {%- set reasoning_content = message.reasoning_content %}\n  
      {%- else %}\n            {%- if '</think>' in content %}\n                {%- set reasoning_content = content.split('</think>')[0].rstrip('\\n').split('<think>')[-1].lstrip('\\n') %}\n                {%- set content = content.split('</think>')[-1].lstrip('\\n') %}\n            {%- endif %}\n        {%- endif %}\n        {%- if loop.index0 > ns.last_query_index %}\n            {%- if loop.last or (not loop.last and reasoning_content) %}\n                {{- '<|im_start|>' + message.role + '\\n<think>\\n' + reasoning_content.strip('\\n') + '\\n</think>\\n\\n' + content.lstrip('\\n') }}\n            {%- else %}\n                {{- '<|im_start|>' + message.role + '\\n' + content }}\n            {%- endif %}\n        {%- else %}\n            {{- '<|im_start|>' + message.role + '\\n' + content }}\n        {%- endif %}\n        {%- if message.tool_calls %}\n            {%- for tool_call in message.tool_calls %}\n                {%- if (loop.first and content) or (not loop.first) %}\n                    {{- '\\n' }}\n                {%- endif %}\n                {%- if tool_call.function %}\n                    {%- set tool_call = tool_call.function %}\n                {%- endif %}\n                {{- '<tool_call>\\n{\"name\": \"' }}\n                {{- tool_call.name }}\n                {{- '\", \"arguments\": ' }}\n                {%- if tool_call.arguments is string %}\n                    {{- tool_call.arguments }}\n                {%- else %}\n                    {{- tool_call.arguments | tojson }}\n                {%- endif %}\n                {{- '}\\n</tool_call>' }}\n            {%- endfor %}\n        {%- endif %}\n        {{- '<|im_end|>\\n' }}\n    {%- elif message.role == \"tool\" %}\n        {%- if loop.first or (messages[loop.index0 - 1].role != \"tool\") %}\n            {{- '<|im_start|>user' }}\n        {%- endif %}\n        {{- '\\n<tool_response>\\n' }}\n        {{- content }}\n        {{- '\\n</tool_response>' }}\n        {%- if loop.last or 
(messages[loop.index0 + 1].role != \"tool\") %}\n            {{- '<|im_end|>\\n' }}\n        {%- endif %}\n    {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n    {{- '<|im_start|>assistant\\n' }}\n    {%- if enable_thinking is defined and enable_thinking is false %}\n        {{- '<think>\\n\\n</think>\\n\\n' }}\n    {%- endif %}\n{%- endif %}",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|im_end|>",
  "errors": "replace",
  "model_max_length": 131072,
  "pad_token": "<|endoftext|>",
  "split_special_tokens": false,
  "tokenizer_class": "Qwen2Tokenizer",
  "unk_token": null
 }
--- a/merged_model/vocab.json
+++ b/merged_model/vocab.json
--- a/ollama/Modelfile
+++ b/ollama/Modelfile
@@ -0,0 +1,9 @@
 FROM ../merged_model
 TEMPLATE """{{ if .System }}<|im_start|>system
 {{ .System }}<|im_end|>
 {{ end }}{{ if .Prompt }}<|im_start|>user
 {{ .Prompt }}<|im_end|>
 {{ end }}<|im_start|>assistant
 """
 PARAMETER temperature 0.1
 PARAMETER num_ctx 8192
--- a/ollama/create_ollama_model.ps1
+++ b/ollama/create_ollama_model.ps1
@@ -0,0 +1,21 @@
 param(
    [string]$ModelName = "lumynax-infused-qwen3-text-gguf"
 )
 $ErrorActionPreference = "Stop"
 Set-StrictMode -Version Latest
 $scriptDir = Split-Path -Parent $MyInvocation.MyCommand.Path
 $modelfilePath = Join-Path $scriptDir "Modelfile"
 if (-not (Get-Command ollama -ErrorAction SilentlyContinue)) {
        throw "The 'ollama' CLI is not installed. Install Ollama first."
 }
 & ollama create $ModelName -f $modelfilePath
 if ($LASTEXITCODE -ne 0) {
    exit $LASTEXITCODE
 }
 Write-Output "Created Ollama model: $ModelName"
 Write-Output "Run it with: ollama run $ModelName"
--- a/quickstart.py
+++ b/quickstart.py
@@ -0,0 +1,449 @@
 from __future__ import annotations
 import argparse
 import os
 import shutil
 import subprocess
 import sys
 from pathlib import Path
 MODEL_TITLE = "LumynaX Infused Qwen3 Text GGUF"
 def _build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description=f"Run a local GGUF chat for {MODEL_TITLE}.")
    parser.add_argument(
        "--prompt",
        default=None,
        help="Prompt to send to the model.",
    )
    parser.add_argument("--system-prompt", default="", help="Optional system prompt override.")
    parser.add_argument(
        "--interactive",
        action="store_true",
        help="Start an interactive terminal chat instead of running a single prompt.",
    )
    parser.add_argument("--max-new-tokens", type=int, default=192)
    parser.add_argument("--ctx-size", type=int, default=4096)
    parser.add_argument("--temperature", type=float, default=0.1)
    parser.add_argument("--threads", type=int, default=max(1, os.cpu_count() or 1))
    parser.add_argument("--llama-cli", default="", help="Optional explicit path to llama-cli.")
    parser.add_argument(
        "--cache-local",
        action="store_true",
        help="Copy the GGUF into LOCALAPPDATA before running. Useful when a runtime cannot read network paths.",
    )
    parser.add_argument("--reasoning", choices=("on", "off", "auto"), default="off")
    parser.add_argument(
        "--reasoning-format",
        choices=("auto", "none", "deepseek", "deepseek-legacy"),
        default="auto",
    )
    parser.add_argument("--reasoning-budget", type=int, default=None)
    return parser
 def _preferred_gguf(root: Path) -> Path:
    gguf_candidates = sorted(root.glob("*.gguf"))
    if not gguf_candidates:
        raise SystemExit(f"No GGUF file was found in {root}")
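    for path in gguf_candidates:
        # Match a quantization tag such as "q4_k_m"; a bare "-q" test also matches "-qwen3" in every filename.
        if any(part[:1] == "q" and part[1:2].isdigit() for part in path.stem.lower().split("-")):
            return path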
    return gguf_candidates[0]
 def _local_model_path(model_path: Path, *, cache_local: bool = False) -> Path:
    if not cache_local:
        return model_path
    local_app_data = Path(os.environ.get("LOCALAPPDATA", Path.home() / "AppData" / "Local"))
    cache_dir = local_app_data / "tinyluminax" / "gguf-cache"
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached_path = cache_dir / model_path.name
    source_stat = model_path.stat()
    if (
        not cached_path.exists()
        or cached_path.stat().st_size != source_stat.st_size
        or cached_path.stat().st_mtime_ns < source_stat.st_mtime_ns
    ):
        print(f"Caching GGUF locally at {cached_path}", file=sys.stderr)
        shutil.copy2(model_path, cached_path)
    return cached_path
 def _discover_llama_cli(explicit_path: str) -> Path | None:
    candidates: list[Path] = []
    if explicit_path.strip():
        candidates.append(Path(explicit_path.strip()))
    for env_var in ("LLAMA_CPP_CLI", "LLAMA_CLI_PATH"):
        raw_value = os.environ.get(env_var, "").strip()
        if raw_value:
            candidates.append(Path(raw_value))
    for binary_name in ("llama-cli", "llama-cli.exe"):
        resolved = shutil.which(binary_name)
        if resolved:
            candidates.append(Path(resolved))
    for candidate in candidates:
        if candidate.exists():
            return candidate
    return None
 def _extract_text(response: dict[str, object]) -> str:
    choices = response.get("choices", [])
    if not isinstance(choices, list) or not choices:
        raise RuntimeError("The runtime returned no choices.")
    first_choice = choices[0]
    if isinstance(first_choice, dict):
        message = first_choice.get("message")
        if isinstance(message, dict):
            content = message.get("content")
            if content not in (None, ""):
                return str(content).strip()
        text = first_choice.get("text")
        if text not in (None, ""):
            return str(text).strip()
    raise RuntimeError("The runtime returned an unsupported response payload.")
 def _run_llama_cpp_python(
    *,
    model_path: Path,
    system_prompt: str,
    user_prompt: str,
    max_new_tokens: int,
    ctx_size: int,
    temperature: float,
    threads: int,
 ) -> str:
    from llama_cpp import Llama
    llm = Llama(
        model_path=str(model_path),
        n_ctx=ctx_size,
        n_threads=threads,
        n_gpu_layers=0,
        chat_format="chat_template.default",
        verbose=False,
    )
    response = llm.create_chat_completion(
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        max_tokens=max_new_tokens,
        temperature=temperature,
    )
    return _extract_text(response)
 def _run_llama_cli(
    *,
    llama_cli_path: Path,
    model_path: Path,
    system_prompt: str,
    user_prompt: str,
    max_new_tokens: int,
    ctx_size: int,
    temperature: float,
    threads: int,
    reasoning: str,
    reasoning_format: str,
    reasoning_budget: int | None,
 ) -> None:
    command = [
        str(llama_cli_path),
        "-m",
        str(model_path),
        "-sys",
        system_prompt,
        "-p",
        user_prompt,
        "-cnv",
        "-st",
        "-n",
        str(max_new_tokens),
        "-c",
        str(ctx_size),
        "--reasoning",
        reasoning,
        "--temp",
        str(temperature),
        "--threads",
        str(threads),
        "--no-display-prompt",
    ]
    if reasoning_format != "auto":
        command.extend(["--reasoning-format", reasoning_format])
    if reasoning_budget is not None:
        command.extend(["--reasoning-budget", str(reasoning_budget)])
    completed = subprocess.run(
        command,
        check=False,
        capture_output=True,
        text=True,
        encoding="utf-8",
    )
    if completed.returncode != 0:
        detail = completed.stderr.strip() or completed.stdout.strip() or "llama-cli failed"
        raise SystemExit(detail)
    stdout = completed.stdout.strip()
    if stdout:
        print(stdout)
 def _print_interactive_banner() -> None:
    print("LumynaX interactive terminal chat")
    print("Type /reset to clear the conversation, or /quit to exit.")
 def _run_interactive_llama_cpp_python(
    *,
    model_path: Path,
    system_prompt: str,
    max_new_tokens: int,
    ctx_size: int,
    temperature: float,
    threads: int,
    opening_prompt: str | None = None,
    reasoning: str = "off",
    reasoning_format: str = "auto",
    reasoning_budget: int | None = None,
 ) -> None:
    from llama_cpp import Llama
    llm = Llama(
        model_path=str(model_path),
        n_ctx=ctx_size,
        n_threads=threads,
        n_gpu_layers=0,
        chat_format="chat_template.default",
        verbose=False,
    )
    transcript: list[tuple[str, str]] = []
    _print_interactive_banner()
    pending_prompt = opening_prompt.strip() if opening_prompt and opening_prompt.strip() else None
    while True:
        try:
            if pending_prompt is None:
                user_prompt = input("You> ").strip()
            else:
                user_prompt = pending_prompt
                print(f"You> {user_prompt}")
                pending_prompt = None
        except (EOFError, KeyboardInterrupt):
            print("\nExiting LumynaX chat.")
            return
        if not user_prompt:
            continue
        lowered_prompt = user_prompt.lower()
        if lowered_prompt in ('/quit', '/exit'):
            print("Exiting LumynaX chat.")
            return
        if lowered_prompt == "/reset":
            transcript.clear()
            print("Conversation reset.")
            continue
        messages: list[dict[str, str]] = [{"role": "system", "content": system_prompt}]
        for transcript_user_prompt, transcript_assistant_response in transcript:
            messages.append({"role": "user", "content": transcript_user_prompt})
            messages.append({"role": "assistant", "content": transcript_assistant_response})
        messages.append({"role": "user", "content": user_prompt})
        response = llm.create_chat_completion(
            messages=messages,
            max_tokens=max_new_tokens,
            temperature=temperature,
        )
        assistant_text = _extract_text(response)
        print(f"LumynaX> {assistant_text}")
        transcript.append((user_prompt, assistant_text))
 def _run_interactive_llama_cli(
    *,
    llama_cli_path: Path,
    model_path: Path,
    system_prompt: str,
    max_new_tokens: int,
    ctx_size: int,
    temperature: float,
    threads: int,
    opening_prompt: str | None = None,
    reasoning: str = "off",
    reasoning_format: str = "auto",
    reasoning_budget: int | None = None,
 ) -> None:
    print("LumynaX interactive terminal chat")
    print("Interactive mode already uses llama-cli directly. Use Ctrl+C to exit.")
    command = [
        str(llama_cli_path),
        "-m",
        str(model_path),
        "-sys",
        system_prompt,
        "-cnv",
        "-n",
        str(max_new_tokens),
        "-c",
        str(ctx_size),
        "--reasoning",
        reasoning,
        "--temp",
        str(temperature),
        "--threads",
        str(threads),
        "--simple-io",
    ]
    if reasoning_format != "auto":
        command.extend(["--reasoning-format", reasoning_format])
    if reasoning_budget is not None:
        command.extend(["--reasoning-budget", str(reasoning_budget)])
    if opening_prompt and opening_prompt.strip():
        command.extend(["-p", opening_prompt.strip()])
    completed = subprocess.run(command, check=False)
    if completed.returncode != 0:
        raise SystemExit(completed.returncode)
 def main() -> None:
    args = _build_parser().parse_args()
    root = Path(__file__).resolve().parent
    source_model_path = _preferred_gguf(root)
    if hasattr(sys.stdout, "reconfigure"):
        sys.stdout.reconfigure(encoding="utf-8")
    single_prompt = (args.prompt or "Say hello in one short sentence.").strip()
    system_prompt = args.system_prompt.strip() or (
        f"You are LumynaX operating from the {MODEL_TITLE} package identity. "
        "Be helpful, clear, and honest about provenance."
    )
    explicit_cli_requested = bool(
        args.llama_cli.strip()
        or os.environ.get("LLAMA_CPP_CLI", "").strip()
        or os.environ.get("LLAMA_CLI_PATH", "").strip()
    )
    if args.interactive:
        llama_cli_path = _discover_llama_cli(args.llama_cli)
        if explicit_cli_requested:
            if llama_cli_path is None:
                raise SystemExit(
                    "A llama-cli override was requested, but no usable llama-cli binary was found.",
                )
            _run_interactive_llama_cli(
                llama_cli_path=llama_cli_path,
                model_path=_local_model_path(source_model_path, cache_local=args.cache_local),
                system_prompt=system_prompt,
                opening_prompt=args.prompt,
                max_new_tokens=args.max_new_tokens,
                ctx_size=args.ctx_size,
                temperature=args.temperature,
                threads=args.threads,
                reasoning=args.reasoning,
                reasoning_format=args.reasoning_format,
                reasoning_budget=args.reasoning_budget,
            )
            return
        model_path = _local_model_path(source_model_path, cache_local=args.cache_local)
        try:
            _run_interactive_llama_cpp_python(
                model_path=model_path,
                system_prompt=system_prompt,
                opening_prompt=args.prompt,
                max_new_tokens=args.max_new_tokens,
                ctx_size=args.ctx_size,
                temperature=args.temperature,
                threads=args.threads,
                reasoning=args.reasoning,
                reasoning_format=args.reasoning_format,
                reasoning_budget=args.reasoning_budget,
            )
            return
        except Exception as exc:  # noqa: BLE001
            if llama_cli_path is None:
                raise SystemExit(
                    "llama-cpp-python could not load this GGUF package. "
                    "Install or point LLAMA_CPP_CLI at llama-cli to use the built-in fallback. "
                    f"Original error: {exc}",
                ) from exc
            print(
                f"llama-cpp-python failed; falling back to llama-cli at {llama_cli_path}",
                file=sys.stderr,
            )
            _run_interactive_llama_cli(
                llama_cli_path=llama_cli_path,
                model_path=model_path,
                system_prompt=system_prompt,
                opening_prompt=args.prompt,
                max_new_tokens=args.max_new_tokens,
                ctx_size=args.ctx_size,
                temperature=args.temperature,
                threads=args.threads,
                reasoning=args.reasoning,
                reasoning_format=args.reasoning_format,
                reasoning_budget=args.reasoning_budget,
            )
            return
    if explicit_cli_requested:
        llama_cli_path = _discover_llama_cli(args.llama_cli)
        if llama_cli_path is None:
            raise SystemExit(
                "A llama-cli override was requested, but no usable llama-cli binary was found.",
            )
        _run_llama_cli(
            llama_cli_path=llama_cli_path,
            model_path=_local_model_path(source_model_path, cache_local=args.cache_local),
            system_prompt=system_prompt,
            user_prompt=single_prompt,
            max_new_tokens=args.max_new_tokens,
            ctx_size=args.ctx_size,
            temperature=args.temperature,
            threads=args.threads,
            reasoning=args.reasoning,
            reasoning_format=args.reasoning_format,
            reasoning_budget=args.reasoning_budget,
        )
        return
    model_path = _local_model_path(source_model_path, cache_local=args.cache_local)
    try:
        print(
            _run_llama_cpp_python(
                model_path=model_path,
                system_prompt=system_prompt,
                user_prompt=single_prompt,
                max_new_tokens=args.max_new_tokens,
                ctx_size=args.ctx_size,
                temperature=args.temperature,
                threads=args.threads,
            ),
        )
        return
    except Exception as exc:  # noqa: BLE001
        llama_cli_path = _discover_llama_cli(args.llama_cli)
        if llama_cli_path is None:
            raise SystemExit(
                "llama-cpp-python could not load this GGUF package. "
                "Install or point LLAMA_CPP_CLI at llama-cli to use the built-in fallback. "
                f"Original error: {exc}",
            ) from exc
        print(
            f"llama-cpp-python failed; falling back to llama-cli at {llama_cli_path}",
            file=sys.stderr,
        )
        _run_llama_cli(
            llama_cli_path=llama_cli_path,
            model_path=model_path,
            system_prompt=system_prompt,
            user_prompt=single_prompt,
            max_new_tokens=args.max_new_tokens,
            ctx_size=args.ctx_size,
            temperature=args.temperature,
            threads=args.threads,
            reasoning=args.reasoning,
            reasoning_format=args.reasoning_format,
            reasoning_budget=args.reasoning_budget,
        )
 if __name__ == "__main__":
    main()
--- a/release_export_manifest.json
+++ b/release_export_manifest.json
@@ -0,0 +1,95 @@
 {
  "artifacts": {
    "checksums": "checksums.sha256",
    "gguf": "lumynax-infused-qwen3-text-gguf-f16.gguf",
    "hf_space_app": "hf_space/app.py",
    "hf_space_dir": "hf_space",
    "hf_space_readme": "hf_space/README.md",
    "hf_space_requirements": "hf_space/requirements.txt",
    "license": "LICENSE.txt",
    "merged_model": null,
    "ollama_create_script": "ollama/create_ollama_model.ps1",
    "ollama_modelfile": "ollama/Modelfile",
    "quantized_gguf": "lumynax-infused-qwen3-text-gguf-q4_k_m.gguf",
    "quickstart": "quickstart.py",
    "readme": "README.md",
    "requirements": "requirements.txt",
    "training_summary": "artifacts/release_training_summary.json",
    "version": "VERSION.txt"
  },
  "capabilities": {
    "reasoning_enabled": false,
    "supported_modalities": [
      "text"
    ]
  },
  "delivery": "standalone_hf_text_gguf_release",
  "distribution": {
    "hf_space": {
      "app": "hf_space/app.py",
      "default_demo_repo_id": "AbteeXAILab/lumynax-infused-qwen3-text-gguf-demo",
      "default_model_repo_id": "AbteeXAILab/lumynax-infused-qwen3-text-gguf",
      "directory": "hf_space",
      "model_repo_env_var": "LUMYNAX_MODEL_REPO_ID",
      "paired_model_repo_id": "AbteeXAILab/lumynax-infused-qwen3-text-gguf",
      "readme": "hf_space/README.md",
      "requirements": "hf_space/requirements.txt",
      "status": "not_updated_for_text_gguf_release"
    },
    "ollama": {
      "create_script": "ollama/create_ollama_model.ps1",
      "modelfile": "ollama/Modelfile",
      "preferred_gguf": "lumynax-infused-qwen3-text-gguf-q4_k_m.gguf",
      "recommended_model_name": "lumynax-infused-qwen3-text-gguf",
      "status": "not_validated_for_huggingface_chat_template_gguf"
    }
  },
  "family": {
    "demo_repo_id": "AbteeXAILab/lumynax-infused-qwen3-text-gguf-demo",
    "family_name": "LumynaX",
    "lineage_position": "independent_release_line",
    "release_line_id": "lumynax-infused-qwen3",
    "release_wave": "wave1",
    "upstream_model_id": "Qwen/Qwen3-8B",
    "validation_status": "gguf_pending_validation"
  },
  "generated_at": "2026-04-19T00:22:26.476005+00:00",
  "gguf_path": "lumynax-infused-qwen3-text-gguf-f16.gguf",
  "manifest_version": 2,
  "merged_model_dir": null,
  "model_title": "LumynaX Infused Qwen3 Text GGUF",
  "package_state": "base_weights_hydrated_text_gguf",
  "public_identity": {
    "model_name": "LumynaX",
    "organization": "AbteeX AI Labs",
    "region": "Aotearoa New Zealand"
  },
  "quantized_gguf_path": "lumynax-infused-qwen3-text-gguf-q4_k_m.gguf",
  "release_line": {
    "default_model_name": "lumynax-infused-qwen3-text-gguf",
    "demo_repo_id": "AbteeXAILab/lumynax-infused-qwen3-text-gguf-demo",
    "model_repo_id": "AbteeXAILab/lumynax-infused-qwen3-text-gguf",
    "model_title": "LumynaX Infused Qwen3 Text GGUF",
    "output_dir_name": "lumynax-infused-qwen3-text-gguf-v1",
    "packaging_mode": "text_gguf",
    "prompt_format": "huggingface_chat_template",
    "release_id": "lumynax-infused-qwen3",
    "release_version": "v1",
    "upstream_model_id": "Qwen/Qwen3-8B",
    "validation_status": "gguf_pending_validation",
    "wave": "wave1"
  },
  "release_version": "v1",
  "runtime": {
    "delivery_mode": "standalone_text_gguf",
    "preferred_backend": "llama_cpp",
    "prompt_format": "huggingface_chat_template",
    "quickstart_command": "python quickstart.py"
  },
  "upstream_model": {
    "kind": "official_base_weights",
    "lumynax_weight_adaptation_applied": false,
    "provider": "Hugging Face",
    "repo_id": "Qwen/Qwen3-8B"
  }
 }
--- a/requirements.txt
+++ b/requirements.txt
@@ -0,0 +1 @@
 llama-cpp-python>=0.3.18
@@ -0,0 +1 @@
 You are LumynaX operating from the LumynaX Infused Qwen3 Text GGUF package identity. This package wraps the official Qwen/Qwen3-8B checkpoint inside a LumynaX-branded multimodal and reasoning runtime. Always identify yourself as LumynaX when asked who you are. Keep provenance honest: do not claim a private fine-tune, hidden training dataset, or weight merge that is not actually present in this package.