From e5067ace4c44e101ec4b6184ef721ccc2b6ece79 Mon Sep 17 00:00:00 2001
From: ModelHub XC
Date: Sun, 12 Apr 2026 14:14:57 +0800
Subject: [PATCH] Initialize project; model provided by the ModelHub XC
 community
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Model: reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT-GGUF
Source: Original Platform
---
 .gitattributes                                |  39 +++
 README.md                                     | 223 ++++++++++++++++++
 ....6b-distilled-30b-thinking-sft-Q4_K_M.gguf |   3 +
 ....6b-distilled-30b-thinking-sft-Q5_K_M.gguf |   3 +
 ...-0.6b-distilled-30b-thinking-sft-Q8_0.gguf |   3 +
 ...3-0.6b-distilled-30b-thinking-sft-f16.gguf |   3 +
 6 files changed, 274 insertions(+)
 create mode 100644 .gitattributes
 create mode 100644 README.md
 create mode 100644 qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf
 create mode 100644 qwen3-0.6b-distilled-30b-thinking-sft-Q5_K_M.gguf
 create mode 100644 qwen3-0.6b-distilled-30b-thinking-sft-Q8_0.gguf
 create mode 100644 qwen3-0.6b-distilled-30b-thinking-sft-f16.gguf

diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 0000000..f1d9555
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,39 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+qwen3-0.6b-distilled-30b-thinking-sft-f16.gguf filter=lfs diff=lfs merge=lfs -text
+qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+qwen3-0.6b-distilled-30b-thinking-sft-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+qwen3-0.6b-distilled-30b-thinking-sft-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..bed60db
--- /dev/null
+++ b/README.md
@@ -0,0 +1,223 @@
+---
+library_name: llama.cpp
+license: apache-2.0
+language:
+ - en
+base_model: reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT
+tags:
+ - gguf
+ - quantized
+ - distillation
+ - sft
+ - reasoning
+ - mathematics
+ - physics
+ - legal
+ - stem
+ - chain-of-thought
+ - edge
+ - mobile
+ - convergentintel
+ - knowledge-distillation
+pipeline_tag: text-generation
+---
+
+# Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT — GGUF
+
+GGUF quantizations of [reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT) for local, mobile, and edge deployment via [llama.cpp](https://github.com/ggerganov/llama.cpp) and compatible runtimes.
+
+A 30B Thinking teacher compressed roughly 50x into a model small enough to run on a smartwatch.
+
+## Available Quantizations
+
+| File | Quant | Size | Use Case |
+|---|---|---|---|
+| `qwen3-0.6b-distilled-30b-thinking-sft-f16.gguf` | F16 | ~1.5 GB | Full precision reference |
+| `qwen3-0.6b-distilled-30b-thinking-sft-Q8_0.gguf` | Q8_0 | ~805 MB | Near-lossless, desktop/laptop |
+| `qwen3-0.6b-distilled-30b-thinking-sft-Q5_K_M.gguf` | Q5_K_M | ~551 MB | Balanced, mobile |
+| `qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf` | Q4_K_M | ~484 MB | Smallest, IoT/edge/smartwatch |
+
+**Recommended:** Q5_K_M for mobile, Q4_K_M for maximum compression.
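+
+To fetch a single quantization programmatically rather than through the web UI, here is a minimal sketch using `huggingface_hub`; the repo id and file name come from the table above, and any of the four files can be swapped in:
+
+```python
+# Minimal download sketch via huggingface_hub (pip install huggingface_hub).
+# Repo id and file name are taken from the quantization table above.
+from huggingface_hub import hf_hub_download
+
+model_path = hf_hub_download(
+    repo_id="reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT-GGUF",
+    filename="qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf",
+)
+print(model_path)  # cached local path; pass it to llama-cli -m or Llama(model_path=...)
+```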
+
+## About the Model
+
+Two-stage build:
+
+**Stage 1 — Thinking Teacher Distillation:** Qwen3-0.6B distilled from Qwen3-30B-A3B-Thinking on 6,122 STEM chain-of-thought samples. The Thinking-variant teacher produces extended reasoning traces with higher-entropy distributions, transferring richer deliberation structure into the student. The loss combines proof-weighted cross-entropy (weights decaying from 2.5x to 1.5x on derivation tokens) with KL divergence at T=2.0.
+
+**Stage 2 — Legal SFT:** Supervised fine-tuning on [Alignment-Lab-AI/Lawyer-Instruct](https://huggingface.co/datasets/Alignment-Lab-AI/Lawyer-Instruct) at a conservative learning rate (5e-6) to layer legal reasoning on top of the STEM backbone without overwriting it.
+
+| Attribute | Value |
+|---|---|
+| **Base model** | [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) |
+| **Teacher model** | [Qwen/Qwen3-30B-A3B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-30B-A3B-Thinking-2507) |
+| **Compression** | 50x parameters, ~75x with Q4_K_M |
+| **Developer** | Reaperdoesntrun / [Convergent Intelligence LLC](https://convergentintel.com): Research Division |
+
+## Usage
+
+### llama.cpp CLI
+
+```bash
+./llama-cli -m qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf \
+  -p "### Instruction:\nWhat is promissory estoppel?\n\n### Response:\n" \
+  -n 512 --temp 0.0
+```
+
+### llama-cpp-python
+
+```python
+from llama_cpp import Llama
+
+llm = Llama(model_path="qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf", n_ctx=1024)
+
+output = llm(
+    "### Instruction:\nProve that the square root of 2 is irrational.\n\n### Response:\n",
+    max_tokens=512,
+    temperature=0.0,
+)
+print(output["choices"][0]["text"])
+```
+
+### Ollama
+
+```bash
+echo 'FROM ./qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf' > Modelfile
+ollama create stem-legal-tiny -f Modelfile
+ollama run stem-legal-tiny "Explain the difference between a felony and a misdemeanor."
+```
+
+### LM Studio
+
+Download any GGUF file from this repo and load it directly in [LM Studio](https://lmstudio.ai/).
+
+## Prompt Formats
+
+**STEM derivation (Stage 1):**
+
+```
+Solve the following problem carefully and show a rigorous derivation.
+
+Problem:
+[Your problem]
+
+Proof:
+```
+
+**Instruction-following (Stage 2):**
+
+```
+### Instruction:
+[Your question]
+
+### Response:
+```
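+
+Both formats are plain strings, so a small helper keeps them consistent across the runtimes above. A minimal sketch (the function names are illustrative, not part of the model or of llama.cpp):
+
+```python
+# Illustrative builders for the two prompt formats above.
+# Function names are our own; only the format strings matter.
+
+def stem_prompt(problem: str) -> str:
+    """Stage 1 STEM derivation format."""
+    return (
+        "Solve the following problem carefully and show a rigorous derivation.\n\n"
+        f"Problem:\n{problem}\n\nProof:\n"
+    )
+
+
+def instruct_prompt(question: str) -> str:
+    """Stage 2 instruction-following format."""
+    return f"### Instruction:\n{question}\n\n### Response:\n"
+
+
+# Either string can be passed to llama-cli -p or to llama_cpp.Llama(...).
+print(instruct_prompt("What is promissory estoppel?"))
+```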
+
+## Limitations
+
+0.6B is a hard capacity constraint. The model trades depth for deployability, so it will make errors that larger models avoid. Multi-step proofs beyond roughly 8 steps degrade. Legal reasoning covers general concepts but lacks nuance. Always verify critical outputs. This is not a substitute for formal proof verification, licensed legal counsel, or professional analysis.
+
+## Source Model
+
+Full training methodology, hyperparameters, and the two-stage pipeline are documented in the source model card:
+
+**[reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT)**
+
+## Mathematical Foundations
+
+This is a GGUF-quantized variant. The mathematical foundations (Discrepancy Calculus, Topological Knowledge Distillation) are documented in the source model's card. The discrepancy operator $Df(x)$ and the BV decomposition that inform the training pipeline are preserved through quantization: the structural boundaries detected by DISC during training are baked into the weights, not dependent on precision.
+
+## Related Models
+
+| Model | Description |
+|---|---|
+| [Qwen3-0.6B-STEM-Proof-Distilled-Thinking](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-STEM-Proof-Distilled-Thinking) | Stage 1 only: pure STEM backbone |
+| [Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT) | Full-precision source model |
+| [Qwen3-1.7B-Distilled-30B-A3B-SFT-GGUF](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Distilled-30B-A3B-SFT-GGUF) | Larger 1.7B variant, GGUF |
+
+## Citation
+
+```bibtex
+@misc{colca2026thinking06bgguf,
+  title={Qwen3-0.6B Distilled Thinking SFT: 50x Compression GGUF for Edge Deployment},
+  year={2026},
+  publisher={HuggingFace},
+  url={https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT-GGUF},
+  note={Convergent Intelligence LLC: Research Division}
+}
+```
+
+---
+
+*Convergent Intelligence LLC: Research Division*
+*"Where classical analysis fails to see, we begin."*
+
+---
+
+## Convergent Intelligence Portfolio
+
+*Part of the [Qwen3 0.6B Distillation Series](https://huggingface.co/reaperdoesntknow) by [Convergent Intelligence LLC: Research Division](https://huggingface.co/reaperdoesntknow)*
+
+### Related Models
+
+| Model | Downloads | Format |
+|-------|-----------|--------|
+| [Qwen3-0.6B-Distilled-30B-A3B](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B) | 36 | HF |
+| [Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT) | 33 | HF |
+
+### Top Models from Our Lab
+
+| Model | Downloads |
+|-------|-----------|
+| [Qwen3-1.7B-Thinking-Distil](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Thinking-Distil) | 501 |
+| [LFM2.5-1.2B-Distilled-SFT](https://huggingface.co/reaperdoesntknow/LFM2.5-1.2B-Distilled-SFT) | 342 |
+| [Qwen3-1.7B-Coder-Distilled-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT) | 302 |
+| [Qwen3-1.7B-Coder-Distilled-SFT-GGUF](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT-GGUF) | 194 |
+| [Qwen3-1.7B-Distilled-30B-A3B-SFT-GGUF](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Distilled-30B-A3B-SFT-GGUF) | 175 |
+
+**Total Portfolio: 41 models | 2,781 total downloads**
+
+*Last updated: 2026-03-28 12:49 UTC*
+
+## DistilQwen Collection
+
+This model is part of the **[DistilQwen](https://huggingface.co/collections/reaperdoesntknow/distilqwen-69bf40ec669117e3f069ef1c)** proof-weighted distillation series.
+Collection: **9 models** | **2,788 downloads**
+
+### Teacher Variant Comparison
+
+| Teacher | Student Size | Strength | Models |
+|---------|-------------|----------|--------|
+| Qwen3-30B-A3B (Instruct) | 1.7B | Instruction following, structured output, legal reasoning | 3 (833 DL) |
+| Qwen3-30B-A3B (Thinking) | 0.6B | Extended deliberation, higher-entropy distributions, proof derivation | 3 (779 DL) **← this model** |
+| Qwen3-30B-A3B (Coder) | 1.7B | Structured decomposition, STEM derivation, logical inference | 2 (825 DL) |
+
+### Methodology
+
+**The only BF16 collection in the portfolio.** While the broader Convergent Intelligence catalog (43 models, 12,000+ downloads) was trained on CPU at FP32 for $24 total compute, the DistilQwen series was trained on H100 at BF16 with a 30B-parameter teacher. Same methodology, premium hardware. This is what happens when you give the pipeline real compute.
+
+All models use proof-weighted knowledge distillation: 55% cross-entropy with decaying proof weights (2.5× → 1.5×), 45% KL divergence at T=2.0. The proof weight amplifies loss on reasoning-critical tokens, forcing the student to allocate capacity to structural understanding rather than surface-level pattern matching.
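+
+Read literally, that recipe is a weighted sum of two per-token objectives. A minimal PyTorch sketch, assuming teacher and student logits and a per-token proof-weight tensor are already in hand (tensor names and the weight schedule are illustrative, not the released training code):
+
+```python
+# Sketch of the proof-weighted distillation objective described above:
+# 55% proof-weighted cross-entropy + 45% KL divergence at T=2.0.
+import torch.nn.functional as F
+
+def distill_loss(student_logits, teacher_logits, targets, proof_weight, T=2.0):
+    # student_logits, teacher_logits: (batch, seq, vocab); targets: (batch, seq)
+    # proof_weight: (batch, seq), e.g. decaying from 2.5 to 1.5 on derivation
+    # tokens and 1.0 elsewhere.
+    ce = F.cross_entropy(
+        student_logits.transpose(1, 2),  # (batch, vocab, seq) as cross_entropy expects
+        targets,
+        reduction="none",
+    )  # per-token loss, shape (batch, seq)
+    weighted_ce = (proof_weight * ce).mean()
+
+    kl = F.kl_div(
+        F.log_softmax(student_logits / T, dim=-1),
+        F.softmax(teacher_logits / T, dim=-1),
+        reduction="batchmean",
+    ) * (T * T)  # standard T^2 rescaling for temperature-softened distillation
+
+    return 0.55 * weighted_ce + 0.45 * kl
+```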
+ +Full methodology: [Structure Over Scale (DOI: 10.57967/hf/8165)](https://doi.org/10.57967/hf/8165) + +### Related in this series + +- [Qwen3-0.6B-Distilled-30B-A3B](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B) (236 downloads) +- [Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT) (227 downloads) + + + +--- +Part of the [reaperdoesntknow research portfolio](https://huggingface.co/reaperdoesntknow) — 49 models, 22,598 total downloads | Last refreshed: 2026-03-30 12:05 UTC + + diff --git a/qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf b/qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf new file mode 100644 index 0000000..b92cc6c --- /dev/null +++ b/qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:608e6f5781bfcb34e3d9f7fca9b513f5e5bb058ca8843241f1abe52a1839cf32 +size 484219808 diff --git a/qwen3-0.6b-distilled-30b-thinking-sft-Q5_K_M.gguf b/qwen3-0.6b-distilled-30b-thinking-sft-Q5_K_M.gguf new file mode 100644 index 0000000..663f800 --- /dev/null +++ b/qwen3-0.6b-distilled-30b-thinking-sft-Q5_K_M.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cb3cc87de939b9ad08a406ce7434cdcbc4bec01b4b8b31c9e625ed518d92233d +size 551377824 diff --git a/qwen3-0.6b-distilled-30b-thinking-sft-Q8_0.gguf b/qwen3-0.6b-distilled-30b-thinking-sft-Q8_0.gguf new file mode 100644 index 0000000..f5fe13b --- /dev/null +++ b/qwen3-0.6b-distilled-30b-thinking-sft-Q8_0.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:24c4a94ac8ebc1f8003f1d1f35c1de4d4f6e12727ef30d7f82dd89e3c9e398f6 +size 804753312 diff --git a/qwen3-0.6b-distilled-30b-thinking-sft-f16.gguf b/qwen3-0.6b-distilled-30b-thinking-sft-f16.gguf new file mode 100644 index 0000000..91f7726 --- /dev/null +++ b/qwen3-0.6b-distilled-30b-thinking-sft-f16.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:530f4d26cda5f399d54a65903d122b832b74d7127a78468bed213d94385336de +size 1509347232