Initialize project; model provided by the ModelHub XC community

Model: AbteeXAILab/lumynax-infused-qwen3-text-gguf Source: Original Platform
2026-06-06 09:18:19 +08:00
commit ca89ce6998
34 changed files with 153601 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,38 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+lumynax-infused-qwen3-text-gguf-f16.gguf filter=lfs diff=lfs merge=lfs -text
+lumynax-infused-qwen3-text-gguf-q4_k_m.gguf filter=lfs diff=lfs merge=lfs -text
+merged_model/tokenizer.json filter=lfs diff=lfs merge=lfs -text
--- a/LICENSE.txt
+++ b/LICENSE.txt
@@ -0,0 +1,11 @@
+LumynaX Infused Qwen3 Text GGUF
+Copyright (c) AbteeX AI Labs. All rights reserved.
+
+This release is proprietary. No right to use, copy, modify, distribute, host,
+sublicense, reverse engineer, or create derivative releases is granted except
+under a separate written agreement from AbteeX AI Labs.
+
+This package may be used only by parties expressly authorized by AbteeX AI Labs.
+
+Third-party software or model components, if any, remain subject to their own
+licenses and obligations.
--- a/README.md
+++ b/README.md
@@ -0,0 +1,230 @@
+---
+license: apache-2.0
+library_name: llama.cpp
+pipeline_tag: text-generation
+language:
+- en
+- mi
+tags:
+- abteex-ai-labs
+- aotearoa
+- general
+- gguf
+- local-first
+- lumynax
+- new-zealand
+- qwen
+- sovereign-ai
+- text
+---
+
+# LumynaX Infused Qwen3 Text GGUF
+
+<!-- lumynax-public-release-card:v4 -->
+
+<p align="center">
+  <img src="docs/lumynax-release-overview.svg" alt="LumynaX Infused Qwen3 Text GGUF release overview" width="100%" />
+</p>
+
+<p align="center">
+  <strong>LumynaX model-infusion release by AbteeX AI Labs.</strong><br/>
+  Public, non-gated package with runnable local instructions, provenance metadata, checksums, and a release manifest.
+</p>
+
+<p align="center">
+  <a href="#quickstart">Quickstart</a> |
+  <a href="#model-profile">Model profile</a> |
+  <a href="#runtime-files">Runtime files</a> |
+  <a href="#provenance-and-license">Provenance</a> |
+  <a href="#validation-status">Validation</a> |
+  <a href="#limitations-and-responsible-use">Limitations</a>
+</p>
+
+![LumynaX: infusion release](https://img.shields.io/badge/LumynaX-infusion%20release-e08a2c) ![access: public and non-gated](https://img.shields.io/badge/access-public%20and%20non--gated-0a0a0b) ![runtime: llama cpp](https://img.shields.io/badge/runtime-llama%20cpp-726b62) ![format: GGUF](https://img.shields.io/badge/format-GGUF-9a5416) ![audit: pass](https://img.shields.io/badge/audit-pass-4d6b44) ![docs: v4](https://img.shields.io/badge/docs-v4-111827)
+
+## Executive Summary
+
+This repository is a complete LumynaX release package for `AbteeXAILab/lumynax-infused-qwen3-text-gguf`. It is intended to be downloaded as a whole repo, not as a single loose weight file: the model artifact, `quickstart.py`, `requirements.txt`, `release_export_manifest.json`, `checksums.sha256`, license notice, and optional Ollama or Space files are part of the same release contract.
+
+LumynaX-infused means the upstream artifact is presented through the LumynaX release layer: local-first runtime scaffolding, LumynaX assistant identity, inference-chain metadata, public documentation, integrity files, and Aotearoa New Zealand-oriented workflow positioning. The release manifest records this as a LumynaX packaging and inference-chain layer around the listed upstream artifact; it does not claim a private LumynaX weight merge.
+
+## AbteeX LumynaX Public Surface
+
+This card follows the AbteeX/LumynaX public-facing system used across the release family: warm paper background visuals, black editorial typography, amber proof markers, compact evidence tables, and plain-language runtime instructions. The goal is not decoration; it is operational clarity. A downloader should immediately understand what the package is, what files belong together, what runtime path is expected, what provenance is available, and what limits still apply.
+
+## Sovereignty And Run Contract
+
+| Field | Value |
+| --- | --- |
+| Public surface | AbteeX/LumynaX light editorial system: warm paper, black ink, amber status markers, and evidence-first tables. |
+| Sovereign intent | Package is documented for local-first use, explicit provenance, and controlled deployment near governed data. |
+| Runtime residency | `llama_cpp` runtime can be deployed by the user in their own approved environment. |
+| Model artifact | `lumynax-infused-qwen3-text-gguf-f16.gguf` must stay with manifest, checksums, quickstart, requirements, and license files. |
+| Modalities | `text` |
+| License discipline | `apache-2.0` metadata is surfaced so downstream users can check redistribution and usage terms. |
+| Audit expectation | Record repo id, artifact checksum, runtime command, prompt template, operator, and deployment environment for production use. |
+| Router readiness | Compatible with the LumynaX MaramaRoute registry pattern for sovereign model selection and fallback planning. |
+| Local serving | Preferred first path is llama.cpp or llama-cpp-python with checksum verification before launch. |
+
+## Quickstart
+
+```bash
+hf download AbteeXAILab/lumynax-infused-qwen3-text-gguf --local-dir lumynax-infused-qwen3-text-gguf
+cd lumynax-infused-qwen3-text-gguf
+pip install -r requirements.txt
+python quickstart.py
+```
+
+Direct llama.cpp smoke command:
+
+```bash
+llama-cli -m "lumynax-infused-qwen3-text-gguf-f16.gguf" -p "Who are you? Answer as LumynaX in two sentences." -n 160
+```
+
+Ollama path:
+
+```bash
+ollama create lumynax-infused-qwen3-text-gguf -f ollama/Modelfile
+ollama run lumynax-infused-qwen3-text-gguf
+```
+
+## Model Profile
+
+| Field | Value |
+| --- | --- |
+| Release | `LumynaX Infused Qwen3 Text GGUF` |
+| Repository | `AbteeXAILab/lumynax-infused-qwen3-text-gguf` |
+| Mode | `Local-first text generation package` |
+| Runtime | `llama_cpp` |
+| Prompt format | `huggingface_chat_template` |
+| Modalities | `text` |
+| Primary artifact | `lumynax-infused-qwen3-text-gguf-f16.gguf` |
+| Detected weight size (all model artifacts combined) | `35.20 GB` |
+| Package state | `base_weights_hydrated_text_gguf` |
+| Delivery | `standalone_hf_text_gguf_release` |
+| Upstream/base | `Qwen/Qwen3-8B` |
+| Upstream kind | `official_base_weights` |
+| Source GGUF | `not applicable` |
+| Quantization | `See manifest` |
+| License metadata | `apache-2.0` |
+| Refreshed | `2026-05-11` |
+
+## Runtime Path
+
+<p align="center">
+  <img src="docs/lumynax-runtime-flow.svg" alt="LumynaX Infused Qwen3 Text GGUF runtime flow" width="100%" />
+</p>
+
+## Capability Profile
+
+| Field | Value |
+| --- | --- |
+| Primary fit | Use this for local chat, drafting, summarization, governance notes, and repeatable offline-friendly inference. |
+| Operational style | Local-first package with explicit files, checksums, and reproducible quickstarts. |
+| Identity behavior | The assistant should identify as LumynaX while remaining clear about upstream provenance. |
+
+## Runtime Files
+
+| Component | Status | Path |
+| --- | --- | --- |
+| README.md | `present` | `README.md` |
+| Quickstart | `present` | `quickstart.py` |
+| Requirements | `present` | `requirements.txt` |
+| Manifest | `present` | `release_export_manifest.json` |
+| Checksums | `present` | `checksums.sha256` |
+| License | `present` | `LICENSE.txt` |
+| Ollama | `present` | `ollama/Modelfile` |
+| Space scaffold | `present` | `hf_space/app.py` |
+| Overview visual | `present` | `docs/lumynax-release-overview.svg` |
+| Runtime visual | `present` | `docs/lumynax-runtime-flow.svg` |
+
+## Model Artifacts
+
+| Artifact | Size |
+| --- | ---: |
+| `lumynax-infused-qwen3-text-gguf-f16.gguf` | 15.26 GB |
+| `lumynax-infused-qwen3-text-gguf-q4_k_m.gguf` | 4.68 GB |
+| `merged_model/model-00001-of-00005.safetensors` | 3.72 GB |
+| `merged_model/model-00002-of-00005.safetensors` | 3.72 GB |
+| `merged_model/model-00003-of-00005.safetensors` | 3.69 GB |
+| `merged_model/model-00004-of-00005.safetensors` | 2.97 GB |
+| `merged_model/model-00005-of-00005.safetensors` | 1.16 GB |
+
+## Prompting Contract
+
+The preferred first prompt is an identity and provenance check:
+
+```text
+Who are you? What files do I need to keep together to run this package locally?
+```
+
+Expected behavior: the assistant should identify as LumynaX, explain that this is a LumynaX model-infusion package, and keep upstream provenance visible. The default package system prompt is:
+
+```text
+See quickstart.py
+```
+
+## Validation Status
+
+| Field | Value |
+| --- | --- |
+| Runtime audit | `pass` |
+| Public access audit | `public and non-gated` |
+| Anonymous metadata access | `True` |
+| Anonymous file listing | `True` |
+| Quickstart syntax | `pass` |
+| Manifest references | `pass` |
+| Checksum references | `pass` |
+
+The audit confirms public access, required release files, manifest references, checksum references, weight artifact presence, and quickstart syntax. It does not guarantee that every laptop has enough RAM or VRAM for the largest packages.
+
+## Integrity Checks
+
+After download, compare the model artifact against `checksums.sha256`.
+
+```bash
+sha256sum "lumynax-infused-qwen3-text-gguf-f16.gguf"
+cat checksums.sha256
+```
+
+On Windows PowerShell:
+
+```powershell
+Get-FileHash -Algorithm SHA256 "lumynax-infused-qwen3-text-gguf-f16.gguf"
+Get-Content checksums.sha256
+```
+
+## Provenance And License
+
+- Publisher: AbteeX AI Labs.
+- Family: LumynaX model and inference-chain release family.
+- Upstream/base: `Qwen/Qwen3-8B`.
+- Source GGUF: `not applicable`.
+- License metadata: `apache-2.0`.
+- License link: `LICENSE.txt` and upstream model card.
+
+Respect the upstream model license and keep attribution files with redistributed copies. Do not present this package as privately trained or weight-merged unless the release manifest explicitly says that weight adaptation was applied.
+
+## Limitations And Responsible Use
+
+- Outputs can be incorrect, incomplete, or biased; validate important answers before use.
+- Larger GGUF, MoE, multimodal, and frontier packages may require substantial RAM, VRAM, disk space, and recent runtime builds.
+- For high-impact decisions, use human review and domain-specific evaluation.
+- For sensitive data, prefer local execution and keep operational logs under your own governance policy.
+- This card documents package readiness and access; it is not a benchmark claim.
+
+## Automation Notes
+
+Automation should read these files before launching:
+
+- `release_export_manifest.json`
+- `checksums.sha256`
+- `quickstart.py`
+- `requirements.txt`
+- `ollama/Modelfile` when present
+
+## Related LumynaX Demo
+
+Try the public browser demo:
+
+- https://huggingface.co/spaces/AbteeXAILab/lumynax-live-demo
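
The Automation Notes in the README above list the files that automation should read before launching. A minimal preflight sketch of that check follows. It assumes the repository has been downloaded to a local directory; the names `REQUIRED_FILES`, `OPTIONAL_FILES`, and `preflight` are illustrative and are not part of the release.

```python
from pathlib import Path

# Files the README's Automation Notes say should be read before launching.
REQUIRED_FILES = (
    "release_export_manifest.json",
    "checksums.sha256",
    "quickstart.py",
    "requirements.txt",
)
# The README lists this one as "when present", so it is treated as optional.
OPTIONAL_FILES = ("ollama/Modelfile",)


def preflight(root: Path) -> dict[str, list[str]]:
    """Report which required files are missing and which optional ones exist."""
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    optional_present = [name for name in OPTIONAL_FILES if (root / name).is_file()]
    return {"missing": missing, "optional_present": optional_present}
```

A launcher could call `preflight(Path("lumynax-infused-qwen3-text-gguf"))` and refuse to start while `missing` is non-empty. This only checks presence; it says nothing about file contents or integrity.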
--- a/VERSION.txt
+++ b/VERSION.txt
@@ -0,0 +1 @@
+v1
--- a/artifacts/release_training_summary.json
+++ b/artifacts/release_training_summary.json
@@ -0,0 +1,19 @@
+{
+  "demo_repo_id": "AbteeXAILab/lumynax-infused-qwen3-text-gguf-demo",
+  "generated_at": "2026-04-19T00:22:26.426711+00:00",
+  "gguf_outtype": "f16",
+  "lumynax_identity_hardcoded": true,
+  "lumynax_weight_adaptation_applied": false,
+  "model_repo_id": "AbteeXAILab/lumynax-infused-qwen3-text-gguf",
+  "model_title": "LumynaX Infused Qwen3 Text GGUF",
+  "package_state": "base_weights_hydrated_text_gguf",
+  "prompt_format": "huggingface_chat_template",
+  "quantization": "Q4_K_M",
+  "release_line_id": "lumynax-infused-qwen3",
+  "release_wave": "wave1",
+  "summary": "This local release hydrates the official upstream `Qwen/Qwen3-8B` checkpoint, then converts the package into a text-first GGUF path for local llama.cpp inference.",
+  "text_only_release": true,
+  "upstream_base_model_id": "Qwen/Qwen3-8B",
+  "upstream_model_id": "Qwen/Qwen3-8B",
+  "validation_status": "gguf_pending_validation"
+}
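
The README asks downstream users not to present the package as weight-merged unless the release metadata says weight adaptation was applied. As a rough sketch, the flags in `artifacts/release_training_summary.json` above could be turned into log-friendly statements like this. The function names and the wording of the messages are assumptions made for illustration; only the JSON keys come from the file.

```python
import json
from pathlib import Path


def load_summary(path: Path) -> dict:
    """Load the release summary JSON from disk."""
    return json.loads(path.read_text(encoding="utf-8"))


def describe_release(summary: dict) -> list[str]:
    """Turn the summary flags into plain statements an operator can log."""
    notes = []
    if summary.get("lumynax_weight_adaptation_applied") is True:
        notes.append("weights: summary reports LumynaX weight adaptation")
    else:
        notes.append("weights: upstream base weights, no LumynaX weight adaptation recorded")
    if summary.get("lumynax_identity_hardcoded"):
        notes.append("identity: summary marks the LumynaX identity as hardcoded")
    notes.append(f"upstream: {summary.get('upstream_model_id', 'unknown')}")
    notes.append(f"validation: {summary.get('validation_status', 'unknown')}")
    return notes
```

For the summary in this commit, the statements would report no weight adaptation, upstream `Qwen/Qwen3-8B`, and a validation status of `gguf_pending_validation`.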
--- a/checksums.sha256
+++ b/checksums.sha256
@@ -0,0 +1,31 @@
+661bf9a59862d01b054c8b3b2c448cd34da3c6118f5725a28887604402d9539b  .gitattributes
+23505d26ba7733c0da978daf30873a315e5340893e555801ad51eafa38097c73  LICENSE.txt
+d4f80d82704f71a60a99de614d89b89d6d1537e66e3840072abae6f174dc1c91  README.md
+2dfede0e6610c473959c963b292fcec325452acba33fd1bba21110e04933df53  VERSION.txt
+9901e81136425f63ed96424f97fd9e173701917b812a6e288cd8b9412b0adaaf  artifacts/release_training_summary.json
+0c7ff1b29daf7a7e149067a2120c079b10af819c8db15a51800d179752298e70  hf_space/README.md
+c90a7db3284d249da76eca2acedd362fcb7db7a90b50652c2cc41d4471da6ce8  hf_space/app.py
+6461da9e460060b06a64fc4874591e7c148eefb9de26d2b1e4c17a4ff52a7f5f  hf_space/requirements.txt
+6e572288198731afa28fd26e4aeabb1c85e6a2e0aafd1bf2c411df3edcdb3ef8  lumynax-infused-qwen3-text-gguf-f16.gguf
+d8c24f495f8da8dc922e4bc6855962de8e2ad885e1610491027b35669b40b62d  lumynax-infused-qwen3-text-gguf-q4_k_m.gguf
+dc07097cc3320f281ea8e935cb5ff6e51fc0fc79a23273574e5ba922ff620c16  merged_model/LUMYNAX_PACKAGE_IDENTITY.txt
+f7c4eadfbbf522470667b797a3c89be2524832d2d599797248dc304fff447c30  merged_model/config.json
+2325da0f15bb848e018c5ae071b7943332e9f871d6b60e2ed22ca97d4cb993d2  merged_model/generation_config.json
+8831e4f1a044471340f7c0a83d7bd71306a5b867e95fd870f74d0c5308a904d5  merged_model/merges.txt
+31d6a825ae35f11fb85b195b4c42c146c051e446433125a215336abdf95cbf5f  merged_model/model-00001-of-00005.safetensors
+5991236cea6fe21f3d43cab0f0e84448734fbbe0789816202989f2ddc9d18282  merged_model/model-00002-of-00005.safetensors
+c5185c4794be2d8a9784d5753c9922db38df478ce11f9ed0b415b7304d896836  merged_model/model-00003-of-00005.safetensors
+b5ee7de71fbf17db3d5704e0c8f2bc7d005ca9e1d7ca2aeb19827b0cfcaa917a  merged_model/model-00004-of-00005.safetensors
+20c2d6366ab85c90786ccdd829cd2b9e7d30ef3b2ebbb998280e7e4014b542ff  merged_model/model-00005-of-00005.safetensors
+f9fdbcb91c23971c13ec5d5f2573d2349e8f61f2f049371ec699281748fdb1bc  merged_model/model.safetensors.index.json
+aeb13307a71acd8fe81861d94ad54ab689df773318809eed3cbe794b4492dae4  merged_model/tokenizer.json
+d5d09f07b48c3086c508b30d1c9114bd1189145b74e982a265350c923acd8101  merged_model/tokenizer_config.json
+ca10d7e9fb3ed18575dd1e277a2579c16d108e32f27439684afa0e10b1440910  merged_model/vocab.json
+585ef6d7643cb741da5253658e95b9950b6c304cf2cf235bb7a326fdaccb2b85  ollama/Modelfile
+65e8ac0168581cf550036345bdffbde5304d1ba65a84db02e8f1fe01c5cc210a  ollama/create_ollama_model.ps1
+5c633adae6ce5e8e9bca83e01db28345f3e8184805bd6e167e3fd20fb6a44295  quickstart.py
+3f5bd6df80c72b129567884f898a2b0b18e06b5d2d2a55d787097fa5edb961c9  release_export_manifest.json
+2a7ca962dd79646b8470b45ec926ade0a4eb01ebdd93452e6990d8997666e378  requirements.txt
+ec0039556efe71ea5fd1934bf52f771801cb1ad02f8abca1d81e21d6948fe1d8  docs/lumynax-release-map.svg
+a308b9f269720036754928871e9e818732283fcd48ab97d8803621e79d4509fc  docs/lumynax-release-overview.svg
+bcfe76f0c0e473552e518e9b908d40e7a87896382564951806811d86ea34c0a3  docs/lumynax-runtime-flow.svg
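
The README's Integrity Checks section compares a single artifact against `checksums.sha256` by eye. A small sketch that checks every listed entry might look like the following. It assumes the file uses the `sha256sum` layout shown above (hex digest, whitespace, relative path); `sha256_of`, `parse_checksums`, and `verify_release` are hypothetical helper names, not part of the release.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte weights are never fully loaded."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksums(text: str) -> dict[str, str]:
    """Parse `<hex digest>  <relative path>` lines into {path: digest}."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, rel_path = line.split(maxsplit=1)
        # sha256sum marks binary-mode entries with a leading "*" on the path.
        entries[rel_path.strip().lstrip("*")] = digest.lower()
    return entries


def verify_release(root: Path, checksum_file: str = "checksums.sha256") -> dict[str, str]:
    """Return {path: status} where status is "ok", "mismatch", or "missing"."""
    expected = parse_checksums((root / checksum_file).read_text(encoding="utf-8"))
    report = {}
    for rel_path, digest in expected.items():
        target = root / rel_path
        if not target.is_file():
            report[rel_path] = "missing"
        elif sha256_of(target) == digest:
            report[rel_path] = "ok"
        else:
            report[rel_path] = "mismatch"
    return report
```

Running `verify_release(Path("."))` from the repository root yields one status per listed path. Only files named in `checksums.sha256` are covered; this commit also adds `docs/lumynax-capability.svg` and `docs/lumynax-overview.svg`, which do not appear in the list and would therefore go unchecked.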
--- a/docs/lumynax-capability.svg
+++ b/docs/lumynax-capability.svg
@@ -0,0 +1,12 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 900 260" role="img" aria-label="LumynaX capability profile">
+  <rect width="900" height="260" fill="#fffefa"/>
+  <rect x="0" y="0" width="900" height="3" fill="#0a0a0b"/>
+  <text x="64" y="36" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" letter-spacing="0.18em" fill="#9a5416">CAPABILITY PROFILE</text>
+  <text x="64" y="58" font-family="Georgia, Cambria, serif" font-size="18" font-weight="500" fill="#0a0a0b">Where this model spends its weight.</text>
+  <g transform="translate(64,70)"><text x="0" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" letter-spacing="0.12em" fill="#9a5416">QUALITY</text><rect x="160" y="2" width="600" height="16" rx="8" ry="8" fill="#f6f0e8" stroke="rgba(10,10,11,0.12)"/><rect x="160" y="2" width="360" height="16" rx="8" ry="8" fill="#e08a2c"/><text x="772" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" fill="#0a0a0b">3/5</text></g>
+  <g transform="translate(64,98)"><text x="0" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" letter-spacing="0.12em" fill="#9a5416">LIGHTWEIGHT</text><rect x="160" y="2" width="600" height="16" rx="8" ry="8" fill="#f6f0e8" stroke="rgba(10,10,11,0.12)"/><rect x="160" y="2" width="0" height="16" rx="8" ry="8" fill="#e08a2c"/><text x="772" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" fill="#0a0a0b">0/5</text></g>
+  <g transform="translate(64,126)"><text x="0" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" letter-spacing="0.12em" fill="#9a5416">SOVEREIGNTY</text><rect x="160" y="2" width="600" height="16" rx="8" ry="8" fill="#f6f0e8" stroke="rgba(10,10,11,0.12)"/><rect x="160" y="2" width="360" height="16" rx="8" ry="8" fill="#e08a2c"/><text x="772" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" fill="#0a0a0b">3/5</text></g>
+  <g transform="translate(64,154)"><text x="0" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" letter-spacing="0.12em" fill="#9a5416">TOOLS</text><rect x="160" y="2" width="600" height="16" rx="8" ry="8" fill="#f6f0e8" stroke="rgba(10,10,11,0.12)"/><rect x="160" y="2" width="120" height="16" rx="8" ry="8" fill="#e08a2c"/><text x="772" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" fill="#0a0a0b">1/5</text></g>
+  <g transform="translate(64,182)"><text x="0" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" letter-spacing="0.12em" fill="#9a5416">JSON MODE</text><rect x="160" y="2" width="600" height="16" rx="8" ry="8" fill="#f6f0e8" stroke="rgba(10,10,11,0.12)"/><rect x="160" y="2" width="600" height="16" rx="8" ry="8" fill="#e08a2c"/><text x="772" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" fill="#0a0a0b">5/5</text></g>
+  <g transform="translate(64,210)"><text x="0" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" letter-spacing="0.12em" fill="#9a5416">CONTEXT</text><rect x="160" y="2" width="600" height="16" rx="8" ry="8" fill="#f6f0e8" stroke="rgba(10,10,11,0.12)"/><rect x="160" y="2" width="600" height="16" rx="8" ry="8" fill="#e08a2c"/><text x="772" y="14" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="700" fill="#0a0a0b">5/5</text></g>
+</svg>
--- a/docs/lumynax-overview.svg
+++ b/docs/lumynax-overview.svg
@@ -0,0 +1,23 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1280 340" role="img" aria-label="LumynaX Infused Qwen3 Text GGUF release banner">
+  <defs>
+    <linearGradient id="paperGrad" x1="0" y1="0" x2="0" y2="1">
+      <stop offset="0%" stop-color="#fffefa"/>
+      <stop offset="100%" stop-color="#f6f0e8"/>
+    </linearGradient>
+  </defs>
+  <rect width="1280" height="340" fill="url(#paperGrad)"/>
+  <rect x="860" y="0" width="420" height="4" fill="#e08a2c"/>
+  <rect x="0" y="336" width="1280" height="4" fill="#0a0a0b"/>
+  <text x="64" y="56" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="13" font-weight="700" letter-spacing="0.22em" fill="#9a5416">ABTEEX AI LABS &#183; AOTEAROA NEW ZEALAND</text>
+  <text x="64" y="78" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" letter-spacing="0.16em" fill="#726b62">LUMYNAX RELEASE &#183; CARD V6</text>
+  <text x="64" y="170" font-family="Georgia, Cambria, &quot;Times New Roman&quot;, serif" font-size="56" font-weight="500" fill="#0a0a0b">LumynaX Infused Qwen3 Text GGUF</text>
+  <line x1="64" y1="196" x2="220" y2="196" stroke="#e08a2c" stroke-width="3"/>
+  <text x="64" y="226" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="13" fill="#726b62">AbteeXAILab/lumynax-infused-qwen3-text-gguf</text>
+  <g transform="translate(64,262)"><rect width="110" height="34" rx="17" ry="17" fill="#fffefa" stroke="rgba(10,10,11,0.12)"/><text x="14" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="10" font-weight="700" letter-spacing="0.14em" fill="#9a5416">FAMILY</text><text x="63" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="600" fill="#0a0a0b">QWEN</text></g>
+  <g transform="translate(184,262)"><rect width="142" height="34" rx="17" ry="17" fill="#fffefa" stroke="rgba(10,10,11,0.12)"/><text x="14" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="10" font-weight="700" letter-spacing="0.14em" fill="#9a5416">RUNTIME</text><text x="70" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="600" fill="#0a0a0b">llama_cpp</text></g>
+  <g transform="translate(336,262)"><rect width="110" height="34" rx="17" ry="17" fill="#fffefa" stroke="rgba(10,10,11,0.12)"/><text x="14" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="10" font-weight="700" letter-spacing="0.14em" fill="#9a5416">MODES</text><text x="56" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="600" fill="#0a0a0b">TEXT</text></g>
+  <g transform="translate(456,262)"><rect width="149" height="34" rx="17" ry="17" fill="#fffefa" stroke="rgba(10,10,11,0.12)"/><text x="14" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="10" font-weight="700" letter-spacing="0.14em" fill="#9a5416">QUANT</text><text x="56" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="600" fill="#0a0a0b">See manifest</text></g>
+  <g transform="translate(615,262)"><rect width="149" height="34" rx="17" ry="17" fill="#fffefa" stroke="rgba(10,10,11,0.12)"/><text x="14" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="10" font-weight="700" letter-spacing="0.14em" fill="#9a5416">LICENSE</text><text x="70" y="22" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="11" font-weight="600" fill="#0a0a0b">apache-2.0</text></g>
+  <text x="1216" y="56" text-anchor="end" font-family="Georgia, Cambria, serif" font-size="18" font-style="italic" fill="#726b62">held in the light</text>
+  <text x="1216" y="80" text-anchor="end" font-family="ui-monospace, SFMono-Regular, Menlo, Consolas, monospace" font-size="10" letter-spacing="0.18em" fill="#9a5416">KO TE MARAMA TE TUAPAPA</text>
+</svg>
--- a/docs/lumynax-release-map.svg
+++ b/docs/lumynax-release-map.svg
@@ -0,0 +1,33 @@
+<svg xmlns="http://www.w3.org/2000/svg" width="1180" height="470" viewBox="0 0 1180 470" role="img" aria-labelledby="title desc">
+<title id="title">LumynaX Infused Qwen3 Text GGUF LumynaX release map</title>
+<desc id="desc">Visual release map for AbteeXAILab/lumynax-infused-qwen3-text-gguf: upstream artifact, LumynaX release layer, packaged files, and runtime path.</desc>
+<style>
+.bg{fill:#0b1220}.panel{fill:#111827;stroke:#38bdf8;stroke-width:2}.panel2{fill:#10231d;stroke:#34d399;stroke-width:2}.title{fill:#f8fafc;font:700 28px Arial}.sub{fill:#cbd5e1;font:15px Arial}.label{fill:#93c5fd;font:700 16px Arial}.body{fill:#e2e8f0;font:14px Arial}.chip{fill:#172554;stroke:#60a5fa;stroke-width:1}.chiptext{fill:#dbeafe;font:13px Arial}.arrow{stroke:#94a3b8;stroke-width:3;marker-end:url(#arrowhead)}
+</style>
+<defs><marker id="arrowhead" markerWidth="10" markerHeight="7" refX="9" refY="3.5" orient="auto"><polygon points="0 0, 10 3.5, 0 7" fill="#94a3b8"/></marker></defs>
+<rect class="bg" x="0" y="0" width="1180" height="470" rx="20"/>
+<text class="title" x="45" y="52">LumynaX Infused Qwen3 Text GGUF</text>
+<text class="sub" x="45" y="80">AbteeXAILab/lumynax-infused-qwen3-text-gguf</text>
+<rect class="panel" x="45" y="165" width="245" height="120" rx="14"/>
+<text class="label" x="63" y="197">Upstream</text>
+<text class="body" x="63" y="227">Qwen/Qwen3-8B</text>
+<line class="arrow" x1="298" y1="225.0" x2="317" y2="225.0"/>
+<rect class="panel2" x="325" y="165" width="245" height="120" rx="14"/>
+<text class="label" x="343" y="197">LumynaX layer</text>
+<text class="body" x="343" y="227">Identity, inference chaining, docs</text>
+<line class="arrow" x1="578" y1="225.0" x2="597" y2="225.0"/>
+<rect class="panel" x="605" y="165" width="245" height="120" rx="14"/>
+<text class="label" x="623" y="197">Package</text>
+<text class="body" x="623" y="227">lumynax-infused-qwen3-text-gguf-f16.gguf</text>
+<line class="arrow" x1="858" y1="225.0" x2="877" y2="225.0"/>
+<rect class="panel" x="885" y="165" width="245" height="120" rx="14"/>
+<text class="label" x="903" y="197">Runtime</text>
+<text class="body" x="903" y="227">llama_cpp</text>
+<rect class="chip" x="45" y="335" width="190" height="38" rx="19"/>
+<text class="chiptext" x="61" y="359">modalities: text</text>
+<rect class="chip" x="253" y="335" width="220" height="38" rx="19"/>
+<text class="chiptext" x="269" y="359">license: see LICENSE.txt</text>
+<rect class="chip" x="491" y="335" width="332" height="38" rx="19"/>
+<text class="chiptext" x="507" y="359">state: base_weights_hydrated_text_gguf</text>
+<text class="sub" x="45" y="425">Download the full repo: README, runtime files, manifest, checksums, and model artifacts stay together.</text>
+</svg>
--- a/docs/lumynax-release-overview.svg
+++ b/docs/lumynax-release-overview.svg
@@ -0,0 +1,51 @@
+<svg xmlns="http://www.w3.org/2000/svg" width="1280" height="560" viewBox="0 0 1280 560" role="img" aria-labelledby="title desc">
+<title id="title">LumynaX Infused Qwen3 Text GGUF professional LumynaX release overview</title>
+<desc id="desc">Professional release overview for AbteeXAILab/lumynax-infused-qwen3-text-gguf showing provenance, LumynaX infusion layer, package artifact, and runtime path.</desc>
+<defs>
+<filter id="shadow" x="-12%" y="-12%" width="124%" height="124%"><feDropShadow dx="0" dy="10" stdDeviation="12" flood-color="#0a0a0b" flood-opacity="0.08"/></filter>
+<marker id="arrowhead" markerWidth="12" markerHeight="8" refX="10" refY="4" orient="auto"><polygon points="0 0, 12 4, 0 8" fill="#e08a2c"/></marker>
+</defs>
+<style>
+.title{fill:#0a0a0b;font:500 38px Georgia,serif}.sub{fill:#726b62;font:16px Aptos,Segoe UI,Arial}.eyebrow{fill:#9a5416;font:700 13px ui-monospace,Consolas,monospace;letter-spacing:2px}.rule{stroke:#0a0a0b;stroke-opacity:.12}.accent{stroke:#e08a2c;stroke-width:4}.card{fill:#ffffff;stroke:#0a0a0b;stroke-opacity:.12;stroke-width:1.2;filter:url(#shadow)}.num{fill:#ffffff;font:700 17px Aptos,Segoe UI,Arial}.numBg{fill:#0a0a0b}.label{fill:#0a0a0b;font:700 18px Aptos,Segoe UI,Arial}.body{fill:#5f574e;font:14px Aptos,Segoe UI,Arial}.chip{fill:#f6f0e8;stroke:#0a0a0b;stroke-opacity:.12}.chiptext{fill:#0a0a0b;font:700 12px ui-monospace,Consolas,monospace;letter-spacing:.7px}.line{stroke:#e08a2c;stroke-width:3;marker-end:url(#arrowhead)}
+</style>
+<rect width="1280" height="560" rx="28" fill="#fffefa"/>
+<line class="accent" x1="852" y1="34" x2="1228" y2="34"/>
+<line class="rule" x1="52" y1="160" x2="1228" y2="160"/>
+<text class="eyebrow" x="52" y="58">ABTEEX AI LABS - LUMYNAX MODEL INFUSION RELEASE</text>
+<text class="title" x="52" y="102">LumynaX Infused Qwen3 Text GGUF</text>
+<text class="sub" x="52" y="134">AbteeXAILab/lumynax-infused-qwen3-text-gguf</text>
+<rect class="card" x="52" y="205" width="270" height="155" rx="18"/>
+<circle class="numBg" cx="86" cy="241" r="18"/>
+<text class="num" x="80" y="247">1</text>
+<text class="label" x="114" y="247">Upstream</text>
+<text class="body" x="78" y="287">Qwen/Qwen3-8B</text>
+<line class="line" x1="331" y1="282.5" x2="344" y2="282.5"/>
+<rect class="card" x="356" y="205" width="270" height="155" rx="18"/>
+<circle class="numBg" cx="390" cy="241" r="18"/>
+<text class="num" x="384" y="247">2</text>
+<text class="label" x="418" y="247">LumynaX Infusion</text>
+<text class="body" x="382" y="287">Identity, runtime scaffold,</text>
+<text class="body" x="382" y="308">provenance, checksums</text>
+<line class="line" x1="635" y1="282.5" x2="648" y2="282.5"/>
+<rect class="card" x="660" y="205" width="270" height="155" rx="18"/>
+<circle class="numBg" cx="694" cy="241" r="18"/>
+<text class="num" x="688" y="247">3</text>
+<text class="label" x="722" y="247">Release Package</text>
+<text class="body" x="686" y="287">lumynax-infused-qwen3-text-gguf-f16.gguf</text>
+<line class="line" x1="939" y1="282.5" x2="952" y2="282.5"/>
+<rect class="card" x="964" y="205" width="270" height="155" rx="18"/>
+<circle class="numBg" cx="998" cy="241" r="18"/>
+<text class="num" x="992" y="247">4</text>
+<text class="label" x="1026" y="247">Runtime</text>
+<text class="body" x="990" y="287">llama_cpp</text>
+<rect class="chip" x="52" y="420" width="190" height="40" rx="20"/>
+<text class="chiptext" x="69" y="445">modalities: text</text>
+<rect class="chip" x="258" y="420" width="190" height="40" rx="20"/>
+<text class="chiptext" x="275" y="445">license: apache-2.0</text>
+<rect class="chip" x="464" y="420" width="338" height="40" rx="20"/>
+<text class="chiptext" x="481" y="445">state: base_weights_hydrated_text_gguf</text>
+<rect class="chip" x="818" y="420" width="190" height="40" rx="20"/>
+<text class="chiptext" x="835" y="445">audit: pass</text>
+<line class="rule" x1="52" y1="496" x2="1228" y2="496"/>
+<text class="sub" x="52" y="526">Download the complete repo so README, manifest, checksums, runtime files, and weights stay together.</text>
+</svg>
--- a/docs/lumynax-runtime-flow.svg
+++ b/docs/lumynax-runtime-flow.svg
@@ -0,0 +1,35 @@
+<svg xmlns="http://www.w3.org/2000/svg" width="1280" height="430" viewBox="0 0 1280 430" role="img" aria-labelledby="title desc">
+<title id="title">LumynaX Infused Qwen3 Text GGUF runtime flow</title>
+<desc id="desc">Runtime flow for AbteeXAILab/lumynax-infused-qwen3-text-gguf from download to verification, quickstart, and serving.</desc>
+<defs>
+<filter id="shadow" x="-12%" y="-12%" width="124%" height="124%"><feDropShadow dx="0" dy="10" stdDeviation="12" flood-color="#0a0a0b" flood-opacity="0.08"/></filter>
+<marker id="arrowhead" markerWidth="12" markerHeight="8" refX="10" refY="4" orient="auto"><polygon points="0 0, 12 4, 0 8" fill="#e08a2c"/></marker>
+</defs>
+<style>
+.title{fill:#0a0a0b;font:500 34px Georgia,serif}.sub{fill:#726b62;font:16px Aptos,Segoe UI,Arial}.eyebrow{fill:#9a5416;font:700 13px ui-monospace,Consolas,monospace;letter-spacing:2px}.rule{stroke:#0a0a0b;stroke-opacity:.12}.accent{stroke:#e08a2c;stroke-width:4}.box{fill:#ffffff;stroke:#0a0a0b;stroke-opacity:.12;stroke-width:1.2;filter:url(#shadow)}.label{fill:#0a0a0b;font:700 19px Aptos,Segoe UI,Arial}.body{fill:#5f574e;font:14px Aptos,Segoe UI,Arial}.line{stroke:#e08a2c;stroke-width:3;marker-end:url(#arrowhead)}.artifact{fill:#9a5416;font:700 14px ui-monospace,Consolas,monospace}
+</style>
+<rect width="1280" height="430" rx="24" fill="#fffefa"/>
+<line class="accent" x1="856" y1="34" x2="1230" y2="34"/>
+<text class="eyebrow" x="50" y="48">LOCAL-FIRST RUNTIME FLOW</text>
+<text class="title" x="50" y="88">LumynaX Infused Qwen3 Text GGUF Runtime Path</text>
+<text class="sub" x="50" y="118">Primary artifact: lumynax-infused-qwen3-text-gguf-f16.gguf</text>
+<line class="rule" x1="50" y1="138" x2="1230" y2="138"/>
+<rect class="box" x="64" y="178" width="245" height="118" rx="16"/>
+<text class="label" x="86" y="220">Download</text>
+<text class="body" x="86" y="251">hf download</text>
+<text class="body" x="86" y="271">AbteeXAILab/lumynax-infused-qwen3-text-gguf</text>
+<line class="line" x1="324" y1="237.0" x2="349" y2="237.0"/>
+<rect class="box" x="367" y="178" width="245" height="118" rx="16"/>
+<text class="label" x="389" y="220">Verify</text>
+<text class="body" x="389" y="251">checksums.sha256</text>
+<line class="line" x1="627" y1="237.0" x2="652" y2="237.0"/>
+<rect class="box" x="670" y="178" width="245" height="118" rx="16"/>
+<text class="label" x="692" y="220">Run</text>
+<text class="body" x="692" y="251">quickstart.py</text>
+<line class="line" x1="930" y1="237.0" x2="955" y2="237.0"/>
+<rect class="box" x="973" y="178" width="245" height="118" rx="16"/>
+<text class="label" x="995" y="220">Serve</text>
+<text class="body" x="995" y="251">llama.cpp / Ollama</text>
+<line class="rule" x1="50" y1="340" x2="1230" y2="340"/>
+<text class="artifact" x="50" y="375">Recommended first test: ask "Who are you?" and confirm the package answers with LumynaX identity plus honest provenance.</text>
+</svg>
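The runtime flow above lists a Verify step against `checksums.sha256`. The following is a minimal sketch of that step, not the package's own tooling. It assumes the manifest uses the common `sha256sum` layout (one `<hex digest>  <relative path>` pair per line), which this commit does not show. The helper names `sha256_of` and `verify_checksums` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte weights never sit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(manifest: Path) -> dict[str, bool]:
    """Return {relative path: matches} for every entry in the manifest.

    Assumption: sha256sum-style lines, resolved relative to the manifest's folder.
    """
    results: dict[str, bool] = {}
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*").strip()  # sha256sum prefixes binary-mode names with '*'
        target = manifest.parent / name
        # A missing file counts as a failed check rather than raising.
        results[name] = target.is_file() and sha256_of(target) == expected.lower()
    return results
```

Calling `verify_checksums(Path("checksums.sha256"))` from the downloaded repo root would then report which files match, under the layout assumption stated above.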
--- a/hf_space/README.md
+++ b/hf_space/README.md
@@ -0,0 +1,35 @@
+---
+title: LumynaX Infused Qwen3 Text GGUF Demo
+colorFrom: green
+colorTo: blue
+sdk: gradio
+app_file: app.py
+pinned: false
+short_description: Private LumynaX Qwen3 Text GGUF demo.
+---
+
+# LumynaX Infused Qwen3 Text GGUF Demo
+
+Private demo for the `lumynax-infused-qwen3-text-gguf` release line.
+
+## Supported Demo Modes
+
+- text with reasoning toggle
+- image understanding from upload or URL
+- audio understanding / transcription from upload or URL
+
+## Private Deployment Notes
+
+- this Space is intended to stay private for now
+- the backing model repo should be `AbteeXAILab/lumynax-infused-qwen3-text-gguf`
+- if that model repo is private, set an `HF_TOKEN` Space secret with read access
+- on CPU-only Hugging Face hardware this Space automatically falls back to showcase mode instead of live inference
+- if GPU hardware is later attached, the same Space switches back to live multimodal inference
+- the package chat template already hardcodes the LumynaX identity inside `merged_model/chat_template.jinja`
+- live inference for this package still requires GPU-backed Space hardware; `cpu-basic` is not sufficient
+
+## Important Provenance
+
+This demo is branded as `LumynaX Infused Qwen3 Text GGUF`, but it serves the official upstream
+`Qwen/Qwen3-8B` base weights packaged under the LumynaX release identity.
+It does not claim a private LumynaX fine-tune of the checkpoint.
--- a/hf_space/app.py
+++ b/hf_space/app.py
@@ -0,0 +1,395 @@
+from __future__ import annotations
+
+import json
+import os
+from pathlib import Path
+from threading import Lock
+
+import gradio as gr
+import torch
+from huggingface_hub import snapshot_download
+from transformers import AutoModelForMultimodalLM, AutoProcessor
+
+MODEL_TITLE = "LumynaX Infused Qwen3 Text GGUF"
+DEFAULT_MODEL_REPO_ID = "AbteeXAILab/lumynax-infused-qwen3-text-gguf"
+MODEL_REPO_ENV_VAR = "LUMYNAX_MODEL_REPO_ID"
+HF_TOKEN_ENV_VARS = ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN", "HUGGINGFACE_HUB_TOKEN")
+DEFAULT_IMAGE_URL = "https://raw.githubusercontent.com/google-gemma/cookbook/refs/heads/main/Demos/sample-data/GoldenGate.png"
+DEFAULT_AUDIO_URL = "https://raw.githubusercontent.com/google-gemma/cookbook/refs/heads/main/Demos/sample-data/journal1.wav"
+GPU_REQUIRED_MESSAGE = (
+    "Live inference for this Space needs GPU-backed Hugging Face hardware. "
+    "The current runtime is CPU-only, which is too slow for this checkpoint."
+)
+SHOWCASE_MESSAGE = (
+    "This Space is running in showcase mode on CPU hardware. "
+    "The examples below were captured during package validation so people can still see how the model behaves. "
+    "If GPU hardware is attached later, this same Space will switch back to live inference automatically."
+)
+SHOWCASE_SAMPLES = {
+    "text": {
+        "prompt": "Who are you? Reply in one short sentence.",
+        "response": "I am LumynaX, operating from the LumynaX Infused Gemma E4B Model package.",
+        "parsed_output": {
+            "role": "assistant",
+            "content": "I am LumynaX, operating from the LumynaX Infused Gemma E4B Model package.",
+        },
+    },
+    "image": {
+        "prompt": "What is shown in this image? Reply in under 12 words.",
+        "response": "The iconic Golden Gate Bridge spans the water under a clear sky. I am LumynaX.",
+        "parsed_output": {
+            "role": "assistant",
+            "content": "The iconic Golden Gate Bridge spans the water under a clear sky. I am LumynaX.",
+        },
+    },
+    "audio": {
+        "prompt": "Transcribe the speech in one line only.",
+        "response": 'A local validation run transcribed the bundled sample audio and included: "My name is LumynaX."',
+        "parsed_output": {
+            "validation_summary": 'A local validation run transcribed the bundled sample audio and included: "My name is LumynaX."',
+        },
+    },
+    "reasoning": {
+        "prompt": "Explain what this package is in one short sentence.",
+        "response": "Reasoning mode was verified locally and returned a non-empty structured thinking field.",
+        "parsed_output": {
+            "validation_summary": "Reasoning mode was verified locally and returned a non-empty structured thinking field.",
+        },
+    },
+}
+
+_MODEL = None
+_PROCESSOR = None
+_LOAD_ERROR = None
+_LOAD_LOCK = Lock()
+
+
+def _resolve_hf_token() -> str | None:
+    for env_var in HF_TOKEN_ENV_VARS:
+        raw_value = os.environ.get(env_var, "").strip()
+        if raw_value:
+            return raw_value
+    return None
+
+
+def _has_supported_gpu_runtime() -> bool:
+    return bool(torch.cuda.is_available())
+
+
+def _load_runtime() -> tuple[object, object]:
+    global _MODEL, _PROCESSOR, _LOAD_ERROR
+
+    if _MODEL is not None and _PROCESSOR is not None:
+        return _MODEL, _PROCESSOR
+    if _LOAD_ERROR is not None:
+        raise RuntimeError(_LOAD_ERROR)
+
+    with _LOAD_LOCK:
+        if _MODEL is not None and _PROCESSOR is not None:
+            return _MODEL, _PROCESSOR
+        if _LOAD_ERROR is not None:
+            raise RuntimeError(_LOAD_ERROR)
+
+        try:
+            if not _has_supported_gpu_runtime():
+                raise RuntimeError(GPU_REQUIRED_MESSAGE)
+            repo_id = os.environ.get(MODEL_REPO_ENV_VAR, "").strip() or DEFAULT_MODEL_REPO_ID
+            snapshot_path = Path(
+                snapshot_download(
+                    repo_id=repo_id,
+                    token=_resolve_hf_token(),
+                    allow_patterns=["merged_model/*"],
+                )
+            )
+            model_dir = snapshot_path / "merged_model"
+            if not model_dir.exists():
+                raise FileNotFoundError(f"Expected merged_model/ in {snapshot_path} after downloading {repo_id}.")
+
+            processor = AutoProcessor.from_pretrained(str(model_dir))
+            model = AutoModelForMultimodalLM.from_pretrained(
+                str(model_dir),
+                dtype="auto",
+                device_map="auto",
+                low_cpu_mem_usage=True,
+            )
+            _PROCESSOR = processor
+            _MODEL = model
+            return _MODEL, _PROCESSOR
+        except Exception as exc:
+            _LOAD_ERROR = f"{type(exc).__name__}: {exc}"
+            raise
+
+
+def _resolve_media_reference(upload_value: str | None, url_value: str | None) -> str | None:
+    if isinstance(upload_value, str) and upload_value.strip():  # an upload beats the pre-filled sample URL
+        return upload_value.strip()
+    if isinstance(url_value, str) and url_value.strip():
+        return url_value.strip()
+    return None
+
+
+def _extract_response_text(parsed: object) -> str:
+    if isinstance(parsed, dict):
+        content = parsed.get("content")
+        if isinstance(content, str) and content.strip():
+            return content.strip()
+    if isinstance(parsed, str):
+        return parsed.strip()
+    return json.dumps(parsed, indent=2, ensure_ascii=False, default=str)
+
+
+def _format_json(value: object) -> str:
+    return json.dumps(value, indent=2, ensure_ascii=False, default=str)
+
+
+def run_request(
+    *,
+    prompt: str,
+    thinking: bool,
+    max_new_tokens: int,
+    image_upload: str | None = None,
+    image_url: str = "",
+    audio_upload: str | None = None,
+    audio_url: str = "",
+) -> tuple[str, str]:
+    if not prompt.strip():
+        raise gr.Error("A prompt is required.")
+
+    if not _has_supported_gpu_runtime():
+        return GPU_REQUIRED_MESSAGE, _format_json({"error": GPU_REQUIRED_MESSAGE})
+
+    image_ref = _resolve_media_reference(image_upload, image_url)
+    audio_ref = _resolve_media_reference(audio_upload, audio_url)
+    content: list[dict[str, str]] = []
+    if image_ref:
+        content.append({"type": "image", "url": image_ref})
+    if audio_ref:
+        content.append({"type": "audio", "audio": audio_ref})
+    content.append({"type": "text", "text": prompt.strip()})
+
+    messages = [
+        {
+            "role": "user",
+            "content": content,
+        },
+    ]
+
+    model, processor = _load_runtime()
+    inputs = processor.apply_chat_template(
+        messages,
+        tokenize=True,
+        return_dict=True,
+        return_tensors="pt",
+        add_generation_prompt=True,
+        enable_thinking=thinking,
+    ).to(model.device)
+    input_len = inputs["input_ids"].shape[-1]
+
+    with torch.inference_mode():
+        outputs = model.generate(
+            **inputs,
+            max_new_tokens=int(max_new_tokens),
+            do_sample=False,
+        )
+
+    response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
+    parsed = processor.parse_response(response) if hasattr(processor, "parse_response") else response
+    return _extract_response_text(parsed), _format_json(parsed)
+
+
+def run_text(prompt: str, thinking: bool, max_new_tokens: int) -> tuple[str, str]:
+    return run_request(
+        prompt=prompt,
+        thinking=thinking,
+        max_new_tokens=max_new_tokens,
+    )
+
+
+def run_image(
+    prompt: str,
+    image_upload: str | None,
+    image_url: str,
+    thinking: bool,
+    max_new_tokens: int,
+) -> tuple[str, str]:
+    return run_request(
+        prompt=prompt,
+        thinking=thinking,
+        max_new_tokens=max_new_tokens,
+        image_upload=image_upload,
+        image_url=image_url,
+    )
+
+
+def run_audio(
+    prompt: str,
+    audio_upload: str | None,
+    audio_url: str,
+    thinking: bool,
+    max_new_tokens: int,
+) -> tuple[str, str]:
+    return run_request(
+        prompt=prompt,
+        thinking=thinking,
+        max_new_tokens=max_new_tokens,
+        audio_upload=audio_upload,
+        audio_url=audio_url,
+    )
+
+
+def _render_showcase_sample(
+    *,
+    prompt: str,
+    response: str,
+    parsed_output: object,
+    media_markdown: str | None = None,
+    media_url: str | None = None,
+) -> None:
+    if media_markdown:
+        gr.Markdown(media_markdown)
+    if media_url:
+        gr.Textbox(label="Sample Asset URL", value=media_url, interactive=False, lines=1)
+    gr.Textbox(label="Example Prompt", value=prompt, interactive=False, lines=3)
+    gr.Textbox(label="Example Response", value=response, interactive=False, lines=6)
+    gr.Code(label="Example Parsed Output", value=_format_json(parsed_output), language="json")
+
+
+def _build_live_ui() -> None:
+    gr.Markdown(
+        f"# {MODEL_TITLE}\n\n"
+        "Live multimodal demo mode is active because GPU hardware is available. "
+        "The LumynaX identity comes from the packaged model template and is not user-editable here."
+    )
+    with gr.Tab("Text"):
+        text_prompt = gr.Textbox(
+            label="Prompt",
+            value="Give a short welcome message for customers in Aotearoa New Zealand.",
+            lines=4,
+        )
+        with gr.Row():
+            text_thinking = gr.Checkbox(label="Enable Reasoning", value=False)
+            text_max_tokens = gr.Slider(label="Max New Tokens", minimum=16, maximum=256, value=64, step=16)
+        text_run = gr.Button("Run Text Demo", variant="primary")
+        text_answer = gr.Textbox(label="Response", lines=8)
+        text_debug = gr.Code(label="Parsed Output", language="json")
+        text_run.click(
+            run_text,
+            inputs=[text_prompt, text_thinking, text_max_tokens],
+            outputs=[text_answer, text_debug],
+        )
+
+    with gr.Tab("Image"):
+        image_prompt = gr.Textbox(
+            label="Prompt",
+            value="What is shown in this image? Reply in under 12 words.",
+            lines=3,
+        )
+        image_upload = gr.Image(label="Upload Image", type="filepath")
+        image_url = gr.Textbox(label="Or Image URL", value=DEFAULT_IMAGE_URL)
+        with gr.Row():
+            image_thinking = gr.Checkbox(label="Enable Reasoning", value=False)
+            image_max_tokens = gr.Slider(label="Max New Tokens", minimum=16, maximum=256, value=64, step=16)
+        image_run = gr.Button("Run Image Demo", variant="primary")
+        image_answer = gr.Textbox(label="Response", lines=8)
+        image_debug = gr.Code(label="Parsed Output", language="json")
+        image_run.click(
+            run_image,
+            inputs=[image_prompt, image_upload, image_url, image_thinking, image_max_tokens],
+            outputs=[image_answer, image_debug],
+        )
+
+    with gr.Tab("Audio"):
+        audio_prompt = gr.Textbox(
+            label="Prompt",
+            value="Transcribe the speech in one line only.",
+            lines=3,
+        )
+        audio_upload = gr.Audio(label="Upload Audio", type="filepath")
+        audio_url = gr.Textbox(label="Or Audio URL", value=DEFAULT_AUDIO_URL)
+        with gr.Row():
+            audio_thinking = gr.Checkbox(label="Enable Reasoning", value=False)
+            audio_max_tokens = gr.Slider(label="Max New Tokens", minimum=16, maximum=256, value=64, step=16)
+        audio_run = gr.Button("Run Audio Demo", variant="primary")
+        audio_answer = gr.Textbox(label="Response", lines=8)
+        audio_debug = gr.Code(label="Parsed Output", language="json")
+        audio_run.click(
+            run_audio,
+            inputs=[audio_prompt, audio_upload, audio_url, audio_thinking, audio_max_tokens],
+            outputs=[audio_answer, audio_debug],
+        )
+
+
+def _build_showcase_ui() -> None:
+    gr.Markdown(
+        f"# {MODEL_TITLE}\n\n"
+        f"{SHOWCASE_MESSAGE}\n\n"
+        "This is still the real package identity and real package structure, but not live inference on this CPU-only Space."
+    )
+    with gr.Tab("Overview"):
+        gr.Markdown(
+            "### What this Space is showing\n"
+            "- verified text, image, audio, and reasoning examples from package validation\n"
+            "- the real packaged release structure and LumynaX identity behavior\n"
+            "- honest provenance: packaged upstream Qwen3 weights under a LumynaX runtime identity\n\n"
+            "### Why this is showcase mode\n"
+            "- Hugging Face `cpu-basic` cannot serve this checkpoint interactively\n"
+            "- the same Space will switch to live inference automatically if GPU hardware is added later"
+        )
+    with gr.Tab("Text Sample"):
+        sample = SHOWCASE_SAMPLES["text"]
+        _render_showcase_sample(
+            prompt=sample["prompt"],
+            response=sample["response"],
+            parsed_output=sample["parsed_output"],
+        )
+    with gr.Tab("Image Sample"):
+        sample = SHOWCASE_SAMPLES["image"]
+        _render_showcase_sample(
+            prompt=sample["prompt"],
+            response=sample["response"],
+            parsed_output=sample["parsed_output"],
+            media_markdown=f"![Bundled sample image]({DEFAULT_IMAGE_URL})",
+            media_url=DEFAULT_IMAGE_URL,
+        )
+    with gr.Tab("Audio Sample"):
+        sample = SHOWCASE_SAMPLES["audio"]
+        _render_showcase_sample(
+            prompt=sample["prompt"],
+            response=sample["response"],
+            parsed_output=sample["parsed_output"],
+            media_url=DEFAULT_AUDIO_URL,
+        )
+    with gr.Tab("Reasoning Note"):
+        sample = SHOWCASE_SAMPLES["reasoning"]
+        _render_showcase_sample(
+            prompt=sample["prompt"],
+            response=sample["response"],
+            parsed_output=sample["parsed_output"],
+        )
+    with gr.Tab("Run It"):
+        gr.Markdown(
+            "### Local or GPU-backed run\n"
+            "Use the packaged files directly for a real interactive run, or attach GPU hardware to this Space."
+        )
+        gr.Textbox(
+            label="Quickstart",
+            interactive=False,
+            lines=4,
+            value=(
+                "pip install -r requirements.txt\n"
+                "python quickstart.py\n"
+                "python quickstart.py --mode image --image path-or-url\n"
+                "python quickstart.py --mode audio --audio path-or-url"
+            ),
+        )
+
+
+with gr.Blocks() as demo:
+    if _has_supported_gpu_runtime():
+        _build_live_ui()
+    else:
+        _build_showcase_ui()
+
+
+if __name__ == "__main__":
+    demo.queue().launch(show_error=True)
--- a/hf_space/requirements.txt
+++ b/hf_space/requirements.txt
@@ -0,0 +1,10 @@
+accelerate>=1.13
+gradio>=5.0
+huggingface-hub>=1.8
+librosa>=0.11
+numba>=0.65
+pillow>=10.0
+safetensors>=0.6
+torch>=2.9
+torchvision>=0.24
+transformers>=5.5.3
--- a/lumynax-infused-qwen3-text-gguf-f16.gguf
+++ b/lumynax-infused-qwen3-text-gguf-f16.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6e572288198731afa28fd26e4aeabb1c85e6a2e0aafd1bf2c411df3edcdb3ef8
+size 16388043648
--- a/lumynax-infused-qwen3-text-gguf-q4_k_m.gguf
+++ b/lumynax-infused-qwen3-text-gguf-q4_k_m.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d8c24f495f8da8dc922e4bc6855962de8e2ad885e1610491027b35669b40b62d
+size 5027783552
--- a/merged_model/LUMYNAX_PACKAGE_IDENTITY.txt
+++ b/merged_model/LUMYNAX_PACKAGE_IDENTITY.txt
@@ -0,0 +1 @@
+You are LumynaX operating from the LumynaX Infused Qwen3 Text GGUF package identity. This package wraps the official Qwen/Qwen3-8B checkpoint inside a LumynaX-branded multimodal and reasoning runtime. Always identify yourself as LumynaX when asked who you are. Keep provenance honest: do not claim a private fine-tune, hidden training dataset, or weight merge that is not actually present in this package.
--- a/merged_model/config.json
+++ b/merged_model/config.json
@@ -0,0 +1,30 @@
+{
+  "architectures": [
+    "Qwen3ForCausalLM"
+  ],
+  "attention_bias": false,
+  "attention_dropout": 0.0,
+  "bos_token_id": 151643,
+  "eos_token_id": 151645,
+  "head_dim": 128,
+  "hidden_act": "silu",
+  "hidden_size": 4096,
+  "initializer_range": 0.02,
+  "intermediate_size": 12288,
+  "max_position_embeddings": 40960,
+  "max_window_layers": 36,
+  "model_type": "qwen3",
+  "num_attention_heads": 32,
+  "num_hidden_layers": 36,
+  "num_key_value_heads": 8,
+  "rms_norm_eps": 1e-06,
+  "rope_scaling": null,
+  "rope_theta": 1000000,
+  "sliding_window": null,
+  "tie_word_embeddings": false,
+  "torch_dtype": "bfloat16",
+  "transformers_version": "4.51.0",
+  "use_cache": true,
+  "use_sliding_window": false,
+  "vocab_size": 151936
+}
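The fields in `merged_model/config.json` are enough to estimate the checkpoint size by hand. The sketch below assumes the usual Qwen3 dense layout, which matches the tensor names listed in `model.safetensors.index.json`: bias-free q/k/v/o projections, per-head `q_norm` and `k_norm`, a three-matrix gated MLP, two RMSNorm weights per layer, and an untied `lm_head`. `count_parameters` is a hypothetical helper, not part of the package.

```python
import json

# Values copied from merged_model/config.json in this commit.
CONFIG = json.loads("""
{
  "head_dim": 128,
  "hidden_size": 4096,
  "intermediate_size": 12288,
  "num_attention_heads": 32,
  "num_hidden_layers": 36,
  "num_key_value_heads": 8,
  "tie_word_embeddings": false,
  "vocab_size": 151936
}
""")


def count_parameters(cfg: dict) -> int:
    """Count weights for the assumed Qwen3 dense layout (no attention bias)."""
    hidden = cfg["hidden_size"]
    q_dim = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_dim = cfg["num_key_value_heads"] * cfg["head_dim"]

    attention = hidden * q_dim + 2 * hidden * kv_dim + q_dim * hidden  # q, k, v, o
    qk_norms = 2 * cfg["head_dim"]                                     # q_norm + k_norm
    mlp = 3 * hidden * cfg["intermediate_size"]                        # gate, up, down
    layer_norms = 2 * hidden                                           # input + post-attention
    per_layer = attention + qk_norms + mlp + layer_norms

    embeddings = cfg["vocab_size"] * hidden
    lm_head = 0 if cfg["tie_word_embeddings"] else embeddings
    final_norm = hidden
    return cfg["num_hidden_layers"] * per_layer + embeddings + lm_head + final_norm


params = count_parameters(CONFIG)
print(params)      # 8190735360
print(params * 2)  # 16381470720 bytes at 2 bytes per bfloat16 weight
```

The byte figure equals the `total_size` of 16381470720 recorded in `merged_model/model.safetensors.index.json`, which is consistent with an unmodified 8B-class Qwen3 checkpoint stored in bfloat16.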
--- a/merged_model/generation_config.json
+++ b/merged_model/generation_config.json
@@ -0,0 +1,13 @@
+{
+    "bos_token_id": 151643,
+    "do_sample": true,
+    "eos_token_id": [
+        151645,
+        151643
+    ],
+    "pad_token_id": 151643,
+    "temperature": 0.6,
+    "top_k": 20,
+    "top_p": 0.95,
+    "transformers_version": "4.51.0"
+}
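`generation_config.json` enables sampling with `temperature` 0.6, `top_k` 20, and `top_p` 0.95. The sketch below shows how those three settings conventionally narrow the candidate set for one decoding step. It is a plain-Python illustration, not the `transformers` implementation. The helper name `filtered_distribution` is hypothetical, and the order temperature, then top-k, then top-p is an assumption.

```python
import math


def filtered_distribution(logits, temperature=0.6, top_k=20, top_p=0.95):
    """Return {token_id: probability} after temperature, top-k, then top-p filtering."""
    # Temperature below 1 sharpens the distribution before any filtering.
    scaled = [value / temperature for value in logits]

    # Top-k: keep only the k highest-scoring token ids, best first.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]

    # Softmax over the survivors (subtract the max for numerical stability).
    peak = scaled[ranked[0]]
    weights = [math.exp(scaled[i] - peak) for i in ranked]
    total = sum(weights)
    probs = [weight / total for weight in weights]

    # Top-p: keep the shortest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token_id, prob in zip(ranked, probs):
        kept.append((token_id, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    # Renormalise so the remaining candidates sum to 1 before sampling.
    norm = sum(prob for _, prob in kept)
    return {token_id: prob / norm for token_id, prob in kept}
```

A token would then be drawn with, for example, `random.choices(list(dist), weights=list(dist.values()))`. Note that `hf_space/app.py` calls `model.generate(..., do_sample=False)`, so the Space decodes greedily and these sampling defaults only apply to callers that leave sampling enabled.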
--- a/merged_model/merges.txt
+++ b/merged_model/merges.txt
--- a/merged_model/model-00001-of-00005.safetensors
+++ b/merged_model/model-00001-of-00005.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:31d6a825ae35f11fb85b195b4c42c146c051e446433125a215336abdf95cbf5f
+size 3996250744
--- a/merged_model/model-00002-of-00005.safetensors
+++ b/merged_model/model-00002-of-00005.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5991236cea6fe21f3d43cab0f0e84448734fbbe0789816202989f2ddc9d18282
+size 3993160032
--- a/merged_model/model-00003-of-00005.safetensors
+++ b/merged_model/model-00003-of-00005.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c5185c4794be2d8a9784d5753c9922db38df478ce11f9ed0b415b7304d896836
+size 3959604768
--- a/merged_model/model-00004-of-00005.safetensors
+++ b/merged_model/model-00004-of-00005.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b5ee7de71fbf17db3d5704e0c8f2bc7d005ca9e1d7ca2aeb19827b0cfcaa917a
+size 3187841392
--- a/merged_model/model-00005-of-00005.safetensors
+++ b/merged_model/model-00005-of-00005.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:20c2d6366ab85c90786ccdd829cd2b9e7d30ef3b2ebbb998280e7e4014b542ff
+size 1244659840
--- a/merged_model/model.safetensors.index.json
+++ b/merged_model/model.safetensors.index.json
@@ -0,0 +1,406 @@
+{
+  "metadata": {
+    "total_size": 16381470720
+  },
+  "weight_map": {
+    "lm_head.weight": "model-00005-of-00005.safetensors",
+    "model.embed_tokens.weight": "model-00001-of-00005.safetensors",
+    "model.layers.0.input_layernorm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.0.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.0.self_attn.k_norm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.0.self_attn.q_norm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.1.input_layernorm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.1.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.1.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.1.self_attn.k_norm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.1.self_attn.q_norm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.10.input_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.10.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.10.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.10.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.10.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.10.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.10.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.10.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.10.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.10.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.10.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.11.input_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.11.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.11.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.11.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.11.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.11.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.11.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.11.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.11.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.11.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.11.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.12.input_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.12.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.12.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.12.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.12.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.13.input_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.13.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.13.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.13.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.13.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.13.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.14.input_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.14.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.14.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.14.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.14.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.14.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.14.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.14.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.15.input_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.15.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.15.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.15.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.15.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.15.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.15.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.15.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.15.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.15.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.16.input_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.16.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.16.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.16.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.16.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.16.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.16.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.16.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.16.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.16.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.16.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.17.input_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.17.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.17.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.17.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.17.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.17.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.17.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.17.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.17.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.17.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.17.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.18.input_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.18.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.18.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.18.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.18.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.18.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.18.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.18.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.18.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.18.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.18.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.19.input_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.19.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.19.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.19.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.19.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.19.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.19.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.19.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.19.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.19.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.19.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.2.input_layernorm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.2.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.2.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.2.self_attn.k_norm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.2.self_attn.q_norm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.20.input_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.20.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.20.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.20.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.20.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.20.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.20.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.20.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.20.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.20.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.20.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.21.input_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.21.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.21.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.21.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.21.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.21.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.21.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.21.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.21.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.21.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.21.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.22.input_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.22.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.22.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.22.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.22.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.22.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.22.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.22.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.22.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.22.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.22.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.23.input_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.23.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.23.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.23.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.23.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.23.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.23.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.23.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.23.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.23.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.23.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.24.input_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.24.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.24.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.24.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.24.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.24.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.24.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.24.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.24.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.24.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.24.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.25.input_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.25.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.25.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.25.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.25.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.25.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.25.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.25.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.25.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.25.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.25.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.26.input_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.26.mlp.down_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.26.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.26.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.26.post_attention_layernorm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.26.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.26.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.26.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.26.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.26.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.26.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.27.input_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.27.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.27.mlp.gate_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.27.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.27.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.27.self_attn.k_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.27.self_attn.k_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.27.self_attn.o_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.27.self_attn.q_norm.weight": "model-00003-of-00005.safetensors",
+    "model.layers.27.self_attn.q_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.27.self_attn.v_proj.weight": "model-00003-of-00005.safetensors",
+    "model.layers.28.input_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.28.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.28.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.28.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.28.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.28.self_attn.k_norm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.28.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.28.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.28.self_attn.q_norm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.28.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.28.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.29.input_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.29.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.29.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.29.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.29.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.29.self_attn.k_norm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.29.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.29.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.29.self_attn.q_norm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.29.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.29.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.3.input_layernorm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.3.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.3.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.3.self_attn.k_norm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.3.self_attn.q_norm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.30.input_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.30.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.30.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.30.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.30.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.30.self_attn.k_norm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.30.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.30.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.30.self_attn.q_norm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.30.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.30.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.31.input_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.31.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.31.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.31.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.31.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.31.self_attn.k_norm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.31.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.31.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.31.self_attn.q_norm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.31.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.31.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.32.input_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.32.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.32.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.32.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.32.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.32.self_attn.k_norm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.32.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.32.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.32.self_attn.q_norm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.32.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.32.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.33.input_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.33.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.33.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.33.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.33.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.33.self_attn.k_norm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.33.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.33.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.33.self_attn.q_norm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.33.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.33.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.34.input_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.34.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.34.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.34.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.34.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.34.self_attn.k_norm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.34.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.34.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.34.self_attn.q_norm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.34.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.34.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.35.input_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.35.mlp.down_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.35.mlp.gate_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.35.mlp.up_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.35.post_attention_layernorm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.35.self_attn.k_norm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.35.self_attn.k_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.35.self_attn.o_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.35.self_attn.q_norm.weight": "model-00004-of-00005.safetensors",
+    "model.layers.35.self_attn.q_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.35.self_attn.v_proj.weight": "model-00004-of-00005.safetensors",
+    "model.layers.4.input_layernorm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.4.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.4.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.4.self_attn.k_norm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.4.self_attn.q_norm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.5.input_layernorm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.5.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.5.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.5.self_attn.k_norm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.5.self_attn.q_norm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.6.input_layernorm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.6.mlp.down_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.6.mlp.up_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.6.self_attn.k_norm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.6.self_attn.q_norm.weight": "model-00001-of-00005.safetensors",
+    "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.7.input_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.7.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.7.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.7.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.7.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.7.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.7.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.7.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00005.safetensors",
+    "model.layers.8.input_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.8.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.8.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.8.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.8.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.8.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.8.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.8.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.8.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.8.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.8.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.9.input_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.9.mlp.down_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.9.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.9.mlp.up_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.9.post_attention_layernorm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.9.self_attn.k_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.9.self_attn.k_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.9.self_attn.o_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.9.self_attn.q_norm.weight": "model-00002-of-00005.safetensors",
+    "model.layers.9.self_attn.q_proj.weight": "model-00002-of-00005.safetensors",
+    "model.layers.9.self_attn.v_proj.weight": "model-00002-of-00005.safetensors",
+    "model.norm.weight": "model-00004-of-00005.safetensors"
+  }
+}
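The weight map above assigns each tensor to one of five shard files, and a few layers (for example layer 17) have tensors in two shards. The following is a minimal sketch of how such a map could be inspected. The helper names `group_by_shard` and `layers_spanning_shards` are hypothetical and not part of this repository, and the sample dictionary copies only a handful of entries from the index rather than the full file.

```python
from __future__ import annotations

import re
from collections import defaultdict

# Matches tensor names such as "model.layers.17.mlp.up_proj.weight".
_LAYER_RE = re.compile(r"^model\.layers\.(\d+)\.")


def group_by_shard(weight_map: dict[str, str]) -> dict[str, list[str]]:
    """Group tensor names by the shard file that stores them."""
    shards: dict[str, list[str]] = defaultdict(list)
    for tensor_name, shard_file in weight_map.items():
        shards[shard_file].append(tensor_name)
    return {shard: sorted(names) for shard, names in sorted(shards.items())}


def layers_spanning_shards(weight_map: dict[str, str]) -> dict[int, set[str]]:
    """Return the layers whose tensors are stored in more than one shard."""
    per_layer: dict[int, set[str]] = defaultdict(set)
    for tensor_name, shard_file in weight_map.items():
        match = _LAYER_RE.match(tensor_name)
        if match:  # skip non-layer tensors such as "model.norm.weight"
            per_layer[int(match.group(1))].add(shard_file)
    return {layer: files for layer, files in per_layer.items() if len(files) > 1}


# A few entries copied from the index above (illustrative subset only).
sample_weight_map = {
    "model.layers.17.input_layernorm.weight": "model-00003-of-00005.safetensors",
    "model.layers.17.mlp.gate_proj.weight": "model-00002-of-00005.safetensors",
    "model.layers.18.mlp.up_proj.weight": "model-00003-of-00005.safetensors",
    "model.norm.weight": "model-00004-of-00005.safetensors",
}

if __name__ == "__main__":
    for shard, names in group_by_shard(sample_weight_map).items():
        print(shard, len(names))
    print(layers_spanning_shards(sample_weight_map))
```

In practice the dictionary would come from the `weight_map` key of the index JSON. A layer that spans two shards is expected at shard boundaries, so a loader needs both files open to assemble it.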
--- a/merged_model/tokenizer.json
+++ b/merged_model/tokenizer.json
--- a/merged_model/tokenizer_config.json
+++ b/merged_model/tokenizer_config.json
@@ -0,0 +1,239 @@
+{
+  "add_bos_token": false,
+  "add_prefix_space": false,
+  "added_tokens_decoder": {
+    "151643": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151644": {
+      "content": "<|im_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151645": {
+      "content": "<|im_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151646": {
+      "content": "<|object_ref_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151647": {
+      "content": "<|object_ref_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151648": {
+      "content": "<|box_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151649": {
+      "content": "<|box_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151650": {
+      "content": "<|quad_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151651": {
+      "content": "<|quad_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151652": {
+      "content": "<|vision_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151653": {
+      "content": "<|vision_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151654": {
+      "content": "<|vision_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151655": {
+      "content": "<|image_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151656": {
+      "content": "<|video_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151657": {
+      "content": "<tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151658": {
+      "content": "</tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151659": {
+      "content": "<|fim_prefix|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151660": {
+      "content": "<|fim_middle|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151661": {
+      "content": "<|fim_suffix|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151662": {
+      "content": "<|fim_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151663": {
+      "content": "<|repo_name|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151664": {
+      "content": "<|file_sep|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151665": {
+      "content": "<tool_response>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151666": {
+      "content": "</tool_response>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151667": {
+      "content": "<think>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151668": {
+      "content": "</think>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    }
+  },
+  "additional_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "bos_token": null,
+  "chat_template": "{%- if tools %}\n    {{- '<|im_start|>system\\n' }}\n    {%- if messages[0].role == 'system' %}\n        {{- messages[0].content + '\\n\\n' }}\n    {%- endif %}\n    {{- \"# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n    {%- for tool in tools %}\n        {{- \"\\n\" }}\n        {{- tool | tojson }}\n    {%- endfor %}\n    {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n    {%- if messages[0].role == 'system' %}\n        {{- '<|im_start|>system\\n' + messages[0].content + '<|im_end|>\\n' }}\n    {%- endif %}\n{%- endif %}\n{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}\n{%- for message in messages[::-1] %}\n    {%- set index = (messages|length - 1) - loop.index0 %}\n    {%- if ns.multi_step_tool and message.role == \"user\" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}\n        {%- set ns.multi_step_tool = false %}\n        {%- set ns.last_query_index = index %}\n    {%- endif %}\n{%- endfor %}\n{%- for message in messages %}\n    {%- if message.content is string %}\n        {%- set content = message.content %}\n    {%- else %}\n        {%- set content = '' %}\n    {%- endif %}\n    {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) %}\n        {{- '<|im_start|>' + message.role + '\\n' + content + '<|im_end|>' + '\\n' }}\n    {%- elif message.role == \"assistant\" %}\n        {%- set reasoning_content = '' %}\n        {%- if message.reasoning_content is string %}\n            {%- set reasoning_content = message.reasoning_content %}\n        {%- else %}\n            {%- if '</think>' in content %}\n                {%- set reasoning_content = content.split('</think>')[0].rstrip('\\n').split('<think>')[-1].lstrip('\\n') %}\n                {%- set content = content.split('</think>')[-1].lstrip('\\n') %}\n            {%- endif %}\n        {%- endif %}\n        {%- if loop.index0 > ns.last_query_index %}\n            {%- if loop.last or (not loop.last and reasoning_content) %}\n                {{- '<|im_start|>' + message.role + '\\n<think>\\n' + reasoning_content.strip('\\n') + '\\n</think>\\n\\n' + content.lstrip('\\n') }}\n            {%- else %}\n                {{- '<|im_start|>' + message.role + '\\n' + content }}\n            {%- endif %}\n        {%- else %}\n            {{- '<|im_start|>' + message.role + '\\n' + content }}\n        {%- endif %}\n        {%- if message.tool_calls %}\n            {%- for tool_call in message.tool_calls %}\n                {%- if (loop.first and content) or (not loop.first) %}\n                    {{- '\\n' }}\n                {%- endif %}\n                {%- if tool_call.function %}\n                    {%- set tool_call = tool_call.function %}\n                {%- endif %}\n                {{- '<tool_call>\\n{\"name\": \"' }}\n                {{- tool_call.name }}\n                {{- '\", \"arguments\": ' }}\n                {%- if tool_call.arguments is string %}\n                    {{- tool_call.arguments }}\n                {%- else %}\n                    {{- tool_call.arguments | tojson }}\n                {%- endif %}\n                {{- '}\\n</tool_call>' }}\n            {%- endfor %}\n        {%- endif %}\n        {{- '<|im_end|>\\n' }}\n    {%- elif message.role == \"tool\" %}\n        {%- if loop.first or (messages[loop.index0 - 1].role != \"tool\") %}\n            {{- '<|im_start|>user' }}\n        {%- endif %}\n        {{- '\\n<tool_response>\\n' }}\n        {{- content }}\n        {{- '\\n</tool_response>' }}\n        {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n            {{- '<|im_end|>\\n' }}\n        {%- endif %}\n    {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n    {{- '<|im_start|>assistant\\n' }}\n    {%- if enable_thinking is defined and enable_thinking is false %}\n        {{- '<think>\\n\\n</think>\\n\\n' }}\n    {%- endif %}\n{%- endif %}",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|im_end|>",
+  "errors": "replace",
+  "model_max_length": 131072,
+  "pad_token": "<|endoftext|>",
+  "split_special_tokens": false,
+  "tokenizer_class": "Qwen2Tokenizer",
+  "unk_token": null
+}
--- a/merged_model/vocab.json
+++ b/merged_model/vocab.json
--- a/ollama/Modelfile
+++ b/ollama/Modelfile
@@ -0,0 +1,9 @@
+FROM ../merged_model
+TEMPLATE """{{ if .System }}<|im_start|>system
+{{ .System }}<|im_end|>
+{{ end }}{{ if .Prompt }}<|im_start|>user
+{{ .Prompt }}<|im_end|>
+{{ end }}<|im_start|>assistant
+"""
+PARAMETER temperature 0.1
+PARAMETER num_ctx 8192
--- a/ollama/create_ollama_model.ps1
+++ b/ollama/create_ollama_model.ps1
@@ -0,0 +1,21 @@
+param(
+    [string]$ModelName = "lumynax-infused-qwen3-text-gguf"
+)
+
+$ErrorActionPreference = "Stop"
+Set-StrictMode -Version Latest
+
+$scriptDir = Split-Path -Parent $MyInvocation.MyCommand.Path
+$modelfilePath = Join-Path $scriptDir "Modelfile"
+
+if (-not (Get-Command ollama -ErrorAction SilentlyContinue)) {
+    throw "The 'ollama' CLI is not installed. Install Ollama first."
+}
+
+& ollama create $ModelName -f $modelfilePath
+if ($LASTEXITCODE -ne 0) {
+    exit $LASTEXITCODE
+}
+
+Write-Output "Created Ollama model: $ModelName"
+Write-Output "Run it with: ollama run $ModelName"
--- a/quickstart.py
+++ b/quickstart.py
@@ -0,0 +1,449 @@
+from __future__ import annotations
+
+import argparse
+import os
+import shutil
+import subprocess
+import sys
+from pathlib import Path
+
+MODEL_TITLE = "LumynaX Infused Qwen3 Text GGUF"
+
+
+def _build_parser() -> argparse.ArgumentParser:
+    parser = argparse.ArgumentParser(description=f"Run a local GGUF chat for {MODEL_TITLE}.")
+    parser.add_argument(
+        "--prompt",
+        default=None,
+        help="Prompt to send to the model.",
+    )
+    parser.add_argument("--system-prompt", default="", help="Optional system prompt override.")
+    parser.add_argument(
+        "--interactive",
+        action="store_true",
+        help="Start an interactive terminal chat instead of running a single prompt.",
+    )
+    parser.add_argument("--max-new-tokens", type=int, default=192)
+    parser.add_argument("--ctx-size", type=int, default=4096)
+    parser.add_argument("--temperature", type=float, default=0.1)
+    parser.add_argument("--threads", type=int, default=max(1, os.cpu_count() or 1))
+    parser.add_argument("--llama-cli", default="", help="Optional explicit path to llama-cli.")
+    parser.add_argument(
+        "--cache-local",
+        action="store_true",
+        help="Copy the GGUF into LOCALAPPDATA before running. Useful when a runtime cannot read network paths.",
+    )
+    parser.add_argument("--reasoning", choices=("on", "off", "auto"), default="off")
+    parser.add_argument(
+        "--reasoning-format",
+        choices=("auto", "none", "deepseek", "deepseek-legacy"),
+        default="auto",
+    )
+    parser.add_argument("--reasoning-budget", type=int, default=None)
+    return parser
+
+
+def _preferred_gguf(root: Path) -> Path:
+    gguf_candidates = sorted(root.glob("*.gguf"))
+    if not gguf_candidates:
+        raise SystemExit(f"No GGUF file was found in {root}")
+    for path in gguf_candidates:
+        if any(part[:1] == "q" and part[1:2].isdigit() for part in path.stem.lower().split("-")):  # e.g. "q4_k_m", not "qwen3"
+            return path
+    return gguf_candidates[0]
+
+
+def _local_model_path(model_path: Path, *, cache_local: bool = False) -> Path:
+    if not cache_local:
+        return model_path
+    local_app_data = Path(os.environ.get("LOCALAPPDATA", Path.home() / "AppData" / "Local"))
+    cache_dir = local_app_data / "tinyluminax" / "gguf-cache"
+    cache_dir.mkdir(parents=True, exist_ok=True)
+    cached_path = cache_dir / model_path.name
+    source_stat = model_path.stat()
+    if (
+        not cached_path.exists()
+        or cached_path.stat().st_size != source_stat.st_size
+        or cached_path.stat().st_mtime_ns < source_stat.st_mtime_ns
+    ):
+        print(f"Caching GGUF locally at {cached_path}", file=sys.stderr)
+        shutil.copy2(model_path, cached_path)
+    return cached_path
+
+
+def _discover_llama_cli(explicit_path: str) -> Path | None:
+    candidates: list[Path] = []
+    if explicit_path.strip():
+        candidates.append(Path(explicit_path.strip()))
+    for env_var in ("LLAMA_CPP_CLI", "LLAMA_CLI_PATH"):
+        raw_value = os.environ.get(env_var, "").strip()
+        if raw_value:
+            candidates.append(Path(raw_value))
+    for binary_name in ("llama-cli", "llama-cli.exe"):
+        resolved = shutil.which(binary_name)
+        if resolved:
+            candidates.append(Path(resolved))
+    for candidate in candidates:
+        if candidate.exists():
+            return candidate
+    return None
+
+
+def _extract_text(response: dict[str, object]) -> str:
+    choices = response.get("choices", [])
+    if not isinstance(choices, list) or not choices:
+        raise RuntimeError("The runtime returned no choices.")
+    first_choice = choices[0]
+    if isinstance(first_choice, dict):
+        message = first_choice.get("message")
+        if isinstance(message, dict):
+            content = message.get("content")
+            if content not in (None, ""):
+                return str(content).strip()
+        text = first_choice.get("text")
+        if text not in (None, ""):
+            return str(text).strip()
+    raise RuntimeError("The runtime returned an unsupported response payload.")
+
+
+def _run_llama_cpp_python(
+    *,
+    model_path: Path,
+    system_prompt: str,
+    user_prompt: str,
+    max_new_tokens: int,
+    ctx_size: int,
+    temperature: float,
+    threads: int,
+) -> str:
+    from llama_cpp import Llama
+
+    llm = Llama(
+        model_path=str(model_path),
+        n_ctx=ctx_size,
+        n_threads=threads,
+        n_gpu_layers=0,
+        chat_format="chat_template.default",
+        verbose=False,
+    )
+    response = llm.create_chat_completion(
+        messages=[
+            {"role": "system", "content": system_prompt},
+            {"role": "user", "content": user_prompt},
+        ],
+        max_tokens=max_new_tokens,
+        temperature=temperature,
+    )
+    return _extract_text(response)
+
+
+def _run_llama_cli(
+    *,
+    llama_cli_path: Path,
+    model_path: Path,
+    system_prompt: str,
+    user_prompt: str,
+    max_new_tokens: int,
+    ctx_size: int,
+    temperature: float,
+    threads: int,
+    reasoning: str,
+    reasoning_format: str,
+    reasoning_budget: int | None,
+) -> None:
+    command = [
+        str(llama_cli_path),
+        "-m",
+        str(model_path),
+        "-sys",
+        system_prompt,
+        "-p",
+        user_prompt,
+        "-cnv",
+        "-st",
+        "-n",
+        str(max_new_tokens),
+        "-c",
+        str(ctx_size),
+        "--reasoning",
+        reasoning,
+        "--temp",
+        str(temperature),
+        "--threads",
+        str(threads),
+        "--no-display-prompt",
+    ]
+    if reasoning_format != "auto":
+        command.extend(["--reasoning-format", reasoning_format])
+    if reasoning_budget is not None:
+        command.extend(["--reasoning-budget", str(reasoning_budget)])
+    completed = subprocess.run(
+        command,
+        check=False,
+        capture_output=True,
+        text=True,
+        encoding="utf-8",
+    )
+    if completed.returncode != 0:
+        detail = completed.stderr.strip() or completed.stdout.strip() or "llama-cli failed"
+        raise SystemExit(detail)
+    stdout = completed.stdout.strip()
+    if stdout:
+        print(stdout)
+
+
+def _print_interactive_banner() -> None:
+    print("LumynaX interactive terminal chat")
+    print("Type /reset to clear the conversation, or /quit to exit.")
+
+
+def _run_interactive_llama_cpp_python(
+    *,
+    model_path: Path,
+    system_prompt: str,
+    max_new_tokens: int,
+    ctx_size: int,
+    temperature: float,
+    threads: int,
+    opening_prompt: str | None = None,
+    reasoning: str = "off",
+    reasoning_format: str = "auto",
+    reasoning_budget: int | None = None,
+) -> None:
+    from llama_cpp import Llama
+
+    llm = Llama(
+        model_path=str(model_path),
+        n_ctx=ctx_size,
+        n_threads=threads,
+        n_gpu_layers=0,
+        chat_format="chat_template.default",
+        verbose=False,
+    )
+    transcript: list[tuple[str, str]] = []
+    _print_interactive_banner()
+
+    pending_prompt = opening_prompt.strip() if opening_prompt and opening_prompt.strip() else None
+    while True:
+        try:
+            if pending_prompt is None:
+                user_prompt = input("You> ").strip()
+            else:
+                user_prompt = pending_prompt
+                print(f"You> {user_prompt}")
+                pending_prompt = None
+        except (EOFError, KeyboardInterrupt):
+            print("\nExiting LumynaX chat.")
+            return
+        if not user_prompt:
+            continue
+        lowered_prompt = user_prompt.lower()
+        if lowered_prompt in ('/quit', '/exit'):
+            print("Exiting LumynaX chat.")
+            return
+        if lowered_prompt == "/reset":
+            transcript.clear()
+            print("Conversation reset.")
+            continue
+        messages: list[dict[str, str]] = [{"role": "system", "content": system_prompt}]
+        for transcript_user_prompt, transcript_assistant_response in transcript:
+            messages.append({"role": "user", "content": transcript_user_prompt})
+            messages.append({"role": "assistant", "content": transcript_assistant_response})
+        messages.append({"role": "user", "content": user_prompt})
+        response = llm.create_chat_completion(
+            messages=messages,
+            max_tokens=max_new_tokens,
+            temperature=temperature,
+        )
+        assistant_text = _extract_text(response)
+        print(f"LumynaX> {assistant_text}")
+        transcript.append((user_prompt, assistant_text))
+
+
+def _run_interactive_llama_cli(
+    *,
+    llama_cli_path: Path,
+    model_path: Path,
+    system_prompt: str,
+    max_new_tokens: int,
+    ctx_size: int,
+    temperature: float,
+    threads: int,
+    opening_prompt: str | None = None,
+    reasoning: str = "off",
+    reasoning_format: str = "auto",
+    reasoning_budget: int | None = None,
+) -> None:
+    print("LumynaX interactive terminal chat")
+    print("Interactive mode already uses llama-cli directly. Use Ctrl+C to exit.")
+    command = [
+        str(llama_cli_path),
+        "-m",
+        str(model_path),
+        "-sys",
+        system_prompt,
+        "-cnv",
+        "-n",
+        str(max_new_tokens),
+        "-c",
+        str(ctx_size),
+        "--reasoning",
+        reasoning,
+        "--temp",
+        str(temperature),
+        "--threads",
+        str(threads),
+        "--simple-io",
+    ]
+    if reasoning_format != "auto":
+        command.extend(["--reasoning-format", reasoning_format])
+    if reasoning_budget is not None:
+        command.extend(["--reasoning-budget", str(reasoning_budget)])
+    if opening_prompt and opening_prompt.strip():
+        command.extend(["-p", opening_prompt.strip()])
+    completed = subprocess.run(command, check=False)
+    if completed.returncode != 0:
+        raise SystemExit(completed.returncode)
+
+
+def main() -> None:
+    args = _build_parser().parse_args()
+    root = Path(__file__).resolve().parent
+    source_model_path = _preferred_gguf(root)
+    if hasattr(sys.stdout, "reconfigure"):
+        sys.stdout.reconfigure(encoding="utf-8")
+
+    single_prompt = (args.prompt or "Say hello in one short sentence.").strip()
+    system_prompt = args.system_prompt.strip() or (
+        f"You are LumynaX operating from the {MODEL_TITLE} package identity. "
+        "Be helpful, clear, and honest about provenance."
+    )
+    explicit_cli_requested = bool(
+        args.llama_cli.strip()
+        or os.environ.get("LLAMA_CPP_CLI", "").strip()
+        or os.environ.get("LLAMA_CLI_PATH", "").strip()
+    )
+    if args.interactive:
+        llama_cli_path = _discover_llama_cli(args.llama_cli)
+        if explicit_cli_requested:
+            if llama_cli_path is None:
+                raise SystemExit(
+                    "A llama-cli override was requested, but no usable llama-cli binary was found.",
+                )
+            _run_interactive_llama_cli(
+                llama_cli_path=llama_cli_path,
+                model_path=_local_model_path(source_model_path, cache_local=args.cache_local),
+                system_prompt=system_prompt,
+                opening_prompt=args.prompt,
+                max_new_tokens=args.max_new_tokens,
+                ctx_size=args.ctx_size,
+                temperature=args.temperature,
+                threads=args.threads,
+                reasoning=args.reasoning,
+                reasoning_format=args.reasoning_format,
+                reasoning_budget=args.reasoning_budget,
+            )
+            return
+        model_path = _local_model_path(source_model_path, cache_local=args.cache_local)
+        try:
+            _run_interactive_llama_cpp_python(
+                model_path=model_path,
+                system_prompt=system_prompt,
+                opening_prompt=args.prompt,
+                max_new_tokens=args.max_new_tokens,
+                ctx_size=args.ctx_size,
+                temperature=args.temperature,
+                threads=args.threads,
+                reasoning=args.reasoning,
+                reasoning_format=args.reasoning_format,
+                reasoning_budget=args.reasoning_budget,
+            )
+            return
+        except Exception as exc:  # noqa: BLE001
+            if llama_cli_path is None:
+                raise SystemExit(
+                    "llama-cpp-python could not load this GGUF package. "
+                    "Install or point LLAMA_CPP_CLI at llama-cli to use the built-in fallback. "
+                    f"Original error: {exc}",
+                ) from exc
+            print(
+                f"llama-cpp-python failed; falling back to llama-cli at {llama_cli_path}",
+                file=sys.stderr,
+            )
+            _run_interactive_llama_cli(
+                llama_cli_path=llama_cli_path,
+                model_path=model_path,
+                system_prompt=system_prompt,
+                opening_prompt=args.prompt,
+                max_new_tokens=args.max_new_tokens,
+                ctx_size=args.ctx_size,
+                temperature=args.temperature,
+                threads=args.threads,
+                reasoning=args.reasoning,
+                reasoning_format=args.reasoning_format,
+                reasoning_budget=args.reasoning_budget,
+            )
+            return
+    if explicit_cli_requested:
+        llama_cli_path = _discover_llama_cli(args.llama_cli)
+        if llama_cli_path is None:
+            raise SystemExit(
+                "A llama-cli override was requested, but no usable llama-cli binary was found.",
+            )
+        _run_llama_cli(
+            llama_cli_path=llama_cli_path,
+            model_path=_local_model_path(source_model_path, cache_local=args.cache_local),
+            system_prompt=system_prompt,
+            user_prompt=single_prompt,
+            max_new_tokens=args.max_new_tokens,
+            ctx_size=args.ctx_size,
+            temperature=args.temperature,
+            threads=args.threads,
+            reasoning=args.reasoning,
+            reasoning_format=args.reasoning_format,
+            reasoning_budget=args.reasoning_budget,
+        )
+        return
+    model_path = _local_model_path(source_model_path, cache_local=args.cache_local)
+    try:
+        print(
+            _run_llama_cpp_python(
+                model_path=model_path,
+                system_prompt=system_prompt,
+                user_prompt=single_prompt,
+                max_new_tokens=args.max_new_tokens,
+                ctx_size=args.ctx_size,
+                temperature=args.temperature,
+                threads=args.threads,
+            ),
+        )
+        return
+    except Exception as exc:  # noqa: BLE001
+        llama_cli_path = _discover_llama_cli(args.llama_cli)
+        if llama_cli_path is None:
+            raise SystemExit(
+                "llama-cpp-python could not load this GGUF package. "
+                "Install or point LLAMA_CPP_CLI at llama-cli to use the built-in fallback. "
+                f"Original error: {exc}",
+            ) from exc
+        print(
+            f"llama-cpp-python failed; falling back to llama-cli at {llama_cli_path}",
+            file=sys.stderr,
+        )
+        _run_llama_cli(
+            llama_cli_path=llama_cli_path,
+            model_path=model_path,
+            system_prompt=system_prompt,
+            user_prompt=single_prompt,
+            max_new_tokens=args.max_new_tokens,
+            ctx_size=args.ctx_size,
+            temperature=args.temperature,
+            threads=args.threads,
+            reasoning=args.reasoning,
+            reasoning_format=args.reasoning_format,
+            reasoning_budget=args.reasoning_budget,
+        )
+
+
+if __name__ == "__main__":
+    main()
--- a/release_export_manifest.json
+++ b/release_export_manifest.json
@@ -0,0 +1,95 @@
+{
+  "artifacts": {
+    "checksums": "checksums.sha256",
+    "gguf": "lumynax-infused-qwen3-text-gguf-f16.gguf",
+    "hf_space_app": "hf_space/app.py",
+    "hf_space_dir": "hf_space",
+    "hf_space_readme": "hf_space/README.md",
+    "hf_space_requirements": "hf_space/requirements.txt",
+    "license": "LICENSE.txt",
+    "merged_model": null,
+    "ollama_create_script": "ollama/create_ollama_model.ps1",
+    "ollama_modelfile": "ollama/Modelfile",
+    "quantized_gguf": "lumynax-infused-qwen3-text-gguf-q4_k_m.gguf",
+    "quickstart": "quickstart.py",
+    "readme": "README.md",
+    "requirements": "requirements.txt",
+    "training_summary": "artifacts/release_training_summary.json",
+    "version": "VERSION.txt"
+  },
+  "capabilities": {
+    "reasoning_enabled": false,
+    "supported_modalities": [
+      "text"
+    ]
+  },
+  "delivery": "standalone_hf_text_gguf_release",
+  "distribution": {
+    "hf_space": {
+      "app": "hf_space/app.py",
+      "default_demo_repo_id": "AbteeXAILab/lumynax-infused-qwen3-text-gguf-demo",
+      "default_model_repo_id": "AbteeXAILab/lumynax-infused-qwen3-text-gguf",
+      "directory": "hf_space",
+      "model_repo_env_var": "LUMYNAX_MODEL_REPO_ID",
+      "paired_model_repo_id": "AbteeXAILab/lumynax-infused-qwen3-text-gguf",
+      "readme": "hf_space/README.md",
+      "requirements": "hf_space/requirements.txt",
+      "status": "not_updated_for_text_gguf_release"
+    },
+    "ollama": {
+      "create_script": "ollama/create_ollama_model.ps1",
+      "modelfile": "ollama/Modelfile",
+      "preferred_gguf": "lumynax-infused-qwen3-text-gguf-q4_k_m.gguf",
+      "recommended_model_name": "lumynax-infused-qwen3-text-gguf",
+      "status": "not_validated_for_huggingface_chat_template_gguf"
+    }
+  },
+  "family": {
+    "demo_repo_id": "AbteeXAILab/lumynax-infused-qwen3-text-gguf-demo",
+    "family_name": "LumynaX",
+    "lineage_position": "independent_release_line",
+    "release_line_id": "lumynax-infused-qwen3",
+    "release_wave": "wave1",
+    "upstream_model_id": "Qwen/Qwen3-8B",
+    "validation_status": "gguf_pending_validation"
+  },
+  "generated_at": "2026-04-19T00:22:26.476005+00:00",
+  "gguf_path": "lumynax-infused-qwen3-text-gguf-f16.gguf",
+  "manifest_version": 2,
+  "merged_model_dir": null,
+  "model_title": "LumynaX Infused Qwen3 Text GGUF",
+  "package_state": "base_weights_hydrated_text_gguf",
+  "public_identity": {
+    "model_name": "LumynaX",
+    "organization": "AbteeX AI Labs",
+    "region": "Aotearoa New Zealand"
+  },
+  "quantized_gguf_path": "lumynax-infused-qwen3-text-gguf-q4_k_m.gguf",
+  "release_line": {
+    "default_model_name": "lumynax-infused-qwen3-text-gguf",
+    "demo_repo_id": "AbteeXAILab/lumynax-infused-qwen3-text-gguf-demo",
+    "model_repo_id": "AbteeXAILab/lumynax-infused-qwen3-text-gguf",
+    "model_title": "LumynaX Infused Qwen3 Text GGUF",
+    "output_dir_name": "lumynax-infused-qwen3-text-gguf-v1",
+    "packaging_mode": "text_gguf",
+    "prompt_format": "huggingface_chat_template",
+    "release_id": "lumynax-infused-qwen3",
+    "release_version": "v1",
+    "upstream_model_id": "Qwen/Qwen3-8B",
+    "validation_status": "gguf_pending_validation",
+    "wave": "wave1"
+  },
+  "release_version": "v1",
+  "runtime": {
+    "delivery_mode": "standalone_text_gguf",
+    "preferred_backend": "llama_cpp",
+    "prompt_format": "huggingface_chat_template",
+    "quickstart_command": "python quickstart.py"
+  },
+  "upstream_model": {
+    "kind": "official_base_weights",
+    "lumynax_weight_adaptation_applied": false,
+    "provider": "Hugging Face",
+    "repo_id": "Qwen/Qwen3-8B"
+  }
+}
--- a/requirements.txt
+++ b/requirements.txt
@@ -0,0 +1 @@
+llama-cpp-python>=0.3.18
@@ -0,0 +1 @@
+You are LumynaX operating from the LumynaX Infused Qwen3 Text GGUF package identity. This package wraps the official Qwen/Qwen3-8B checkpoint inside a LumynaX-branded multimodal and reasoning runtime. Always identify yourself as LumynaX when asked who you are. Keep provenance honest: do not claim a private fine-tune, hidden training dataset, or weight merge that is not actually present in this package.