From e5067ace4c44e101ec4b6184ef721ccc2b6ece79 Mon Sep 17 00:00:00 2001
From: ModelHub XC
Date: Sun, 12 Apr 2026 14:14:57 +0800
Subject: [PATCH] Initialize project; model provided by the ModelHub XC
 community
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Model: reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT-GGUF
Source: Original Platform
---
 .gitattributes                                |  39 +++
 README.md                                     | 223 ++++++++++++++++++
 ....6b-distilled-30b-thinking-sft-Q4_K_M.gguf |   3 +
 ....6b-distilled-30b-thinking-sft-Q5_K_M.gguf |   3 +
 ...-0.6b-distilled-30b-thinking-sft-Q8_0.gguf |   3 +
 ...3-0.6b-distilled-30b-thinking-sft-f16.gguf |   3 +
 6 files changed, 274 insertions(+)
 create mode 100644 .gitattributes
 create mode 100644 README.md
 create mode 100644 qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf
 create mode 100644 qwen3-0.6b-distilled-30b-thinking-sft-Q5_K_M.gguf
 create mode 100644 qwen3-0.6b-distilled-30b-thinking-sft-Q8_0.gguf
 create mode 100644 qwen3-0.6b-distilled-30b-thinking-sft-f16.gguf

diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 0000000..f1d9555
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,39 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+qwen3-0.6b-distilled-30b-thinking-sft-f16.gguf filter=lfs diff=lfs merge=lfs -text
+qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+qwen3-0.6b-distilled-30b-thinking-sft-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+qwen3-0.6b-distilled-30b-thinking-sft-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..bed60db
--- /dev/null
+++ b/README.md
@@ -0,0 +1,223 @@
+---
+library_name: llama.cpp
+license: apache-2.0
+language:
+ - en
+base_model: reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT
+tags:
+ - gguf
+ - quantized
+ - distillation
+ - sft
+ - reasoning
+ - mathematics
+ - physics
+ - legal
+ - stem
+ - chain-of-thought
+ - edge
+ - mobile
+ - convergentintel
+ - knowledge-distillation
+pipeline_tag: text-generation
+---
+
+# Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT — GGUF
+
+GGUF quantizations of [reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT) for local, mobile, and edge deployment via [llama.cpp](https://github.com/ggerganov/llama.cpp) and compatible runtimes.
+
+A 30B Thinking teacher compressed roughly 50x into a model small enough to run on a smartwatch.
+
+## Available Quantizations
+
+| File | Quant | Size | Use Case |
+|---|---|---|---|
+| `qwen3-0.6b-distilled-30b-thinking-sft-f16.gguf` | F16 | ~1.5 GB | Full precision reference |
+| `qwen3-0.6b-distilled-30b-thinking-sft-Q8_0.gguf` | Q8_0 | ~805 MB | Near-lossless, desktop/laptop |
+| `qwen3-0.6b-distilled-30b-thinking-sft-Q5_K_M.gguf` | Q5_K_M | ~551 MB | Balanced, mobile |
+| `qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf` | Q4_K_M | ~484 MB | Smallest, IoT/edge/smartwatch |
+
+**Recommended:** Q5_K_M for mobile, Q4_K_M for maximum compression.
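+
+To fetch a single quantization programmatically rather than through the web UI, here is a minimal sketch using `huggingface_hub`; the repo id and file name come from the table above, and any of the four files can be swapped in:
+
+```python
+# Minimal download sketch via huggingface_hub (pip install huggingface_hub).
+# Repo id and file name are taken from the quantization table above.
+from huggingface_hub import hf_hub_download
+
+model_path = hf_hub_download(
+    repo_id="reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT-GGUF",
+    filename="qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf",
+)
+print(model_path)  # cached local path; pass it to llama-cli -m or Llama(model_path=...)
+```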
+
+## About the Model
+
+Two-stage build:
+
+**Stage 1 — Thinking Teacher Distillation:** Qwen3-0.6B distilled from Qwen3-30B-A3B-Thinking on 6,122 STEM chain-of-thought samples. The Thinking-variant teacher produces extended reasoning traces with higher-entropy distributions, transferring richer deliberation structure into the student. The loss combines proof-weighted cross-entropy (weights decaying from 2.5x to 1.5x on derivation tokens) with KL divergence at T=2.0.
+
+**Stage 2 — Legal SFT:** Supervised fine-tuning on [Alignment-Lab-AI/Lawyer-Instruct](https://huggingface.co/datasets/Alignment-Lab-AI/Lawyer-Instruct) at a conservative learning rate (5e-6) to layer legal reasoning on top of the STEM backbone without overwriting it.
+
+| Attribute | Value |
+|---|---|
+| **Base model** | [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) |
+| **Teacher model** | [Qwen/Qwen3-30B-A3B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-30B-A3B-Thinking-2507) |
+| **Compression** | 50x parameters, ~75x with Q4_K_M |
+| **Developer** | Reaperdoesntrun / [Convergent Intelligence LLC](https://convergentintel.com): Research Division |
+
+## Usage
+
+### llama.cpp CLI
+
+```bash
+./llama-cli -m qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf \
+  -p "### Instruction:\nWhat is promissory estoppel?\n\n### Response:\n" \
+  -n 512 --temp 0.0
+```
+
+### llama-cpp-python
+
+```python
+from llama_cpp import Llama
+
+llm = Llama(model_path="qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf", n_ctx=1024)
+
+output = llm(
+    "### Instruction:\nProve that the square root of 2 is irrational.\n\n### Response:\n",
+    max_tokens=512,
+    temperature=0.0,
+)
+print(output["choices"][0]["text"])
+```
+
+### Ollama
+
+```bash
+echo 'FROM ./qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf' > Modelfile
+ollama create stem-legal-tiny -f Modelfile
+ollama run stem-legal-tiny "Explain the difference between a felony and a misdemeanor."
+```
+
+### LM Studio
+
+Download any GGUF file from this repo and load it directly in [LM Studio](https://lmstudio.ai/).
+
+## Prompt Formats
+
+**STEM derivation (Stage 1):**
+
+```
+Solve the following problem carefully and show a rigorous derivation.
+
+Problem:
+[Your problem]
+
+Proof:
+```
+
+**Instruction-following (Stage 2):**
+
+```
+### Instruction:
+[Your question]
+
+### Response:
+```
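+
+Both formats are plain strings, so a small helper keeps them consistent across the runtimes above. A minimal sketch (the function names are illustrative, not part of the model or of llama.cpp):
+
+```python
+# Illustrative builders for the two prompt formats above.
+# Function names are our own; only the format strings matter.
+
+def stem_prompt(problem: str) -> str:
+    """Stage 1 STEM derivation format."""
+    return (
+        "Solve the following problem carefully and show a rigorous derivation.\n\n"
+        f"Problem:\n{problem}\n\nProof:\n"
+    )
+
+
+def instruct_prompt(question: str) -> str:
+    """Stage 2 instruction-following format."""
+    return f"### Instruction:\n{question}\n\n### Response:\n"
+
+
+# Either string can be passed to llama-cli -p or to llama_cpp.Llama(...).
+print(instruct_prompt("What is promissory estoppel?"))
+```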
+
+## Limitations
+
+0.6B is a hard capacity constraint. The model trades depth for deployability, so it will make errors that larger models avoid. Multi-step proofs beyond roughly 8 steps degrade. Legal reasoning covers general concepts but lacks nuance. Always verify critical outputs. This is not a substitute for formal proof verification, licensed legal counsel, or professional analysis.
+
+## Source Model
+
+Full training methodology, hyperparameters, and the two-stage pipeline are documented in the source model card:
+
+**[reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT)**
+
+## Mathematical Foundations
+
+This is a GGUF-quantized variant. The mathematical foundations (Discrepancy Calculus, Topological Knowledge Distillation) are documented in the source model's card. The discrepancy operator $Df(x)$ and the BV decomposition that inform the training pipeline are preserved through quantization: the structural boundaries detected by DISC during training are baked into the weights, not dependent on precision.
+
+## Related Models
+
+| Model | Description |
+|---|---|
+| [Qwen3-0.6B-STEM-Proof-Distilled-Thinking](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-STEM-Proof-Distilled-Thinking) | Stage 1 only: pure STEM backbone |
+| [Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT) | Full-precision source model |
+| [Qwen3-1.7B-Distilled-30B-A3B-SFT-GGUF](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Distilled-30B-A3B-SFT-GGUF) | Larger 1.7B variant, GGUF |
+
+## Citation
+
+```bibtex
+@misc{colca2026thinking06bgguf,
+  title={Qwen3-0.6B Distilled Thinking SFT: 50x Compression GGUF for Edge Deployment},
+  year={2026},
+  publisher={HuggingFace},
+  url={https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT-GGUF},
+  note={Convergent Intelligence LLC: Research Division}
+}
+```
+
+---
+
+*Convergent Intelligence LLC: Research Division*
+*"Where classical analysis fails to see, we begin."*
+
+---
+
+## Convergent Intelligence Portfolio
+
+*Part of the [Qwen3 0.6B Distillation Series](https://huggingface.co/reaperdoesntknow) by [Convergent Intelligence LLC: Research Division](https://huggingface.co/reaperdoesntknow)*
+
+### Related Models
+
+| Model | Downloads | Format |
+|-------|-----------|--------|
+| [Qwen3-0.6B-Distilled-30B-A3B](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B) | 36 | HF |
+| [Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT) | 33 | HF |
+
+### Top Models from Our Lab
+
+| Model | Downloads |
+|-------|-----------|
+| [Qwen3-1.7B-Thinking-Distil](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Thinking-Distil) | 501 |
+| [LFM2.5-1.2B-Distilled-SFT](https://huggingface.co/reaperdoesntknow/LFM2.5-1.2B-Distilled-SFT) | 342 |
+| [Qwen3-1.7B-Coder-Distilled-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT) | 302 |
+| [Qwen3-1.7B-Coder-Distilled-SFT-GGUF](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT-GGUF) | 194 |
+| [Qwen3-1.7B-Distilled-30B-A3B-SFT-GGUF](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Distilled-30B-A3B-SFT-GGUF) | 175 |
+
+**Total Portfolio: 41 models | 2,781 total downloads**
+
+*Last updated: 2026-03-28 12:49 UTC*
+
+## DistilQwen Collection
+
+This model is part of the **[DistilQwen](https://huggingface.co/collections/reaperdoesntknow/distilqwen-69bf40ec669117e3f069ef1c)** proof-weighted distillation series.
+Collection: **9 models** | **2,788 downloads**
+
+### Teacher Variant Comparison
+
+| Teacher | Student Size | Strength | Models |
+|---------|-------------|----------|--------|
+| Qwen3-30B-A3B (Instruct) | 1.7B | Instruction following, structured output, legal reasoning | 3 (833 DL) |
+| Qwen3-30B-A3B (Thinking) | 0.6B | Extended deliberation, higher-entropy distributions, proof derivation | 3 (779 DL) **← this model** |
+| Qwen3-30B-A3B (Coder) | 1.7B | Structured decomposition, STEM derivation, logical inference | 2 (825 DL) |
+
+### Methodology
+
+**The only BF16 collection in the portfolio.** While the broader Convergent Intelligence catalog (43 models, 12,000+ downloads) was trained on CPU at FP32 for $24 total compute, the DistilQwen series was trained on H100 at BF16 with a 30B-parameter teacher. Same methodology, premium hardware. This is what happens when you give the pipeline real compute.
+
+All models use proof-weighted knowledge distillation: 55% cross-entropy with decaying proof weights (2.5× → 1.5×), 45% KL divergence at T=2.0. The proof weight amplifies loss on reasoning-critical tokens, forcing the student to allocate capacity to structural understanding rather than surface-level pattern matching.
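+
+Read literally, that recipe is a weighted sum of two per-token objectives. A minimal PyTorch sketch, assuming teacher and student logits and a per-token proof-weight tensor are already in hand (tensor names and the weight schedule are illustrative, not the released training code):
+
+```python
+# Sketch of the proof-weighted distillation objective described above:
+# 55% proof-weighted cross-entropy + 45% KL divergence at T=2.0.
+import torch.nn.functional as F
+
+def distill_loss(student_logits, teacher_logits, targets, proof_weight, T=2.0):
+    # student_logits, teacher_logits: (batch, seq, vocab); targets: (batch, seq)
+    # proof_weight: (batch, seq), e.g. decaying from 2.5 to 1.5 on derivation
+    # tokens and 1.0 elsewhere.
+    ce = F.cross_entropy(
+        student_logits.transpose(1, 2),  # (batch, vocab, seq) as cross_entropy expects
+        targets,
+        reduction="none",
+    )  # per-token loss, shape (batch, seq)
+    weighted_ce = (proof_weight * ce).mean()
+
+    kl = F.kl_div(
+        F.log_softmax(student_logits / T, dim=-1),
+        F.softmax(teacher_logits / T, dim=-1),
+        reduction="batchmean",
+    ) * (T * T)  # standard T^2 rescaling for temperature-softened distillation
+
+    return 0.55 * weighted_ce + 0.45 * kl
+```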
+ +Full methodology: [Structure Over Scale (DOI: 10.57967/hf/8165)](https://doi.org/10.57967/hf/8165) + +### Related in this series + +- [Qwen3-0.6B-Distilled-30B-A3B](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B) (236 downloads) +- [Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-0.6B-Distilled-30B-A3B-Thinking-SFT) (227 downloads) + + + +--- +Part of the [reaperdoesntknow research portfolio](https://huggingface.co/reaperdoesntknow) — 49 models, 22,598 total downloads | Last refreshed: 2026-03-30 12:05 UTC + + diff --git a/qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf b/qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf new file mode 100644 index 0000000..b92cc6c --- /dev/null +++ b/qwen3-0.6b-distilled-30b-thinking-sft-Q4_K_M.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:608e6f5781bfcb34e3d9f7fca9b513f5e5bb058ca8843241f1abe52a1839cf32 +size 484219808 diff --git a/qwen3-0.6b-distilled-30b-thinking-sft-Q5_K_M.gguf b/qwen3-0.6b-distilled-30b-thinking-sft-Q5_K_M.gguf new file mode 100644 index 0000000..663f800 --- /dev/null +++ b/qwen3-0.6b-distilled-30b-thinking-sft-Q5_K_M.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cb3cc87de939b9ad08a406ce7434cdcbc4bec01b4b8b31c9e625ed518d92233d +size 551377824 diff --git a/qwen3-0.6b-distilled-30b-thinking-sft-Q8_0.gguf b/qwen3-0.6b-distilled-30b-thinking-sft-Q8_0.gguf new file mode 100644 index 0000000..f5fe13b --- /dev/null +++ b/qwen3-0.6b-distilled-30b-thinking-sft-Q8_0.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:24c4a94ac8ebc1f8003f1d1f35c1de4d4f6e12727ef30d7f82dd89e3c9e398f6 +size 804753312 diff --git a/qwen3-0.6b-distilled-30b-thinking-sft-f16.gguf b/qwen3-0.6b-distilled-30b-thinking-sft-f16.gguf new file mode 100644 index 0000000..91f7726 --- /dev/null +++ b/qwen3-0.6b-distilled-30b-thinking-sft-f16.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:530f4d26cda5f399d54a65903d122b832b74d7127a78468bed213d94385336de +size 1509347232