---
license: apache-2.0
tags:
- gguf
- text-generation
- quantized
- cpu
- gpu
- magicquant
- magic_quant
- qwen3
- qwen
- conversational
base_model:
- unsloth/Qwen3-4B-Instruct-2507
---
# MagicQuant Hybrids (v2.0) - Qwen3-4B-Instruct-2507-unsloth
MagicQuant is **not** a quantization technique by itself.
It is a search, judging, and hybrid-discovery system that learns from baseline families such as llama.cpp and from external/custom baseline sources, then uses isolated samples, rank-safe prediction, and real benchmarking to keep the practical survivors.
Sometimes a hybrid beats a pure baseline; sometimes it does not. MagicQuant hunts for nonlinear wins in the size/quality trade space: potentially better hybrids, useful sub-spaces between anchor baselines, and more.
Read more on the [MagicQuant Wiki Here](https://github.com/magiccodingman/MagicQuant-Wiki).
## Final surviving downloadable outputs
| Name | Provider | Quant Family | KLD | Size (GB) | Download |
|---|---|---|---:|---:|---|
| LM-Q8_0 | llama.cpp | Q8_0 | 0.001339 | 3.99 | [Link](./../../resolve/main/Model-LM-Q8_0.gguf?download=true) |
| MQ-Q6_K_1 | MagicQuant | Q6_K | 0.001817 | 3.58 | [Link](./../../resolve/main/Model-MQ-Q6_K_1.gguf?download=true) |
| UD-Q6_K_XL | Unsloth | UD-Q6_K_XL | 0.002111 | 3.41 | [Link](https://huggingface.co/unsloth/Qwen3-4B-Instruct-2507-GGUF) |
| LM-Q6_K | llama.cpp | Q6_K | 0.004640 | 3.08 | [Link](./../../resolve/main/Model-LM-Q6_K.gguf?download=true) |
| [<u>MQ-Q5_K_1</u>](#winner-notes "Replaced: MQ-Q5_K") | MagicQuant | Q5_K | 0.006632 | 2.88 | [Link](./../../resolve/main/Model-MQ-Q5_K_1.gguf?download=true) |
| [<u>UD-Q5_K_XL</u>](#winner-notes "Replaced: LM-Q5_K, LM-Q5_K_S") | Unsloth | UD-Q5_K_XL | 0.009839 | 2.73 | [Link](https://huggingface.co/unsloth/Qwen3-4B-Instruct-2507-GGUF) |
| [<u>MQ-Q4_K_M_1</u>](#winner-notes "Replaced: MQ-Q4_K_M, UD-Q4_K_XL, LM-Q4_K_M + 1 more") | MagicQuant | Q4_K_M | 0.020346 | 2.44 | [Link](./../../resolve/main/Model-MQ-Q4_K_M_1.gguf?download=true) |
| [<u>LM-Q4_K_S</u>](#winner-notes "Replaced: LM-IQ4_NL") | llama.cpp | Q4_K_S | 0.029803 | 2.22 | [Link](./../../resolve/main/Model-LM-Q4_K_S.gguf?download=true) |
| LM-IQ4_XS | llama.cpp | IQ4_XS | 0.031300 | 2.11 | [Link](./../../resolve/main/Model-LM-IQ4_XS.gguf?download=true) |
| UD-Q3_K_XL | Unsloth | UD-Q3_K_XL | 0.072278 | 1.98 | [Link](https://huggingface.co/unsloth/Qwen3-4B-Instruct-2507-GGUF) |
| [<u>LM-IQ3_S</u>](#winner-notes "Replaced: LM-IQ3_XS") | llama.cpp | IQ3_S | 0.091992 | 1.77 | [Link](./../../resolve/main/Model-LM-IQ3_S.gguf?download=true) |
| LM-IQ3_XXS | llama.cpp | IQ3_XXS | 0.190404 | 1.56 | [Link](./../../resolve/main/Model-LM-IQ3_XXS.gguf?download=true) |
| [<u>LM-IQ2_S</u>](#winner-notes "Replaced: LM-IQ2_XS") | llama.cpp | IQ2_S | 0.431128 | 1.32 | [Link](./../../resolve/main/Model-LM-IQ2_S.gguf?download=true) |
| LM-IQ2_XXS | llama.cpp | IQ2_XXS | 0.938021 | 1.16 | [Link](./../../resolve/main/Model-LM-IQ2_XXS.gguf?download=true) |
---
## Release metadata
- [Final survivor metrics](./../../resolve/main/magicquant.final-survivors.json?download=true) — full file names, KLD, PPL delta %, byte sizes, download targets, and replacement lineage. PPL delta % is measured against the native/reference PPL when available; negative is better and larger positive values are worse.
- [Hybrid tensor map](./../../resolve/main/magicquant.hybrid-map.json?download=true) — tensor-group assignments and effective-state details for MagicQuant hybrid GGUFs.
- [Replacement details](./../../resolve/main/magicquant.replacements.json?download=true) — structured details for baselines or anchors removed from the final download table, including reason codes, KLD deltas, PPL delta %, and size deltas.
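The survivor metrics above are plain JSON, so picking a download programmatically (e.g. the smallest file under a KLD budget) is straightforward. Below is a minimal sketch; the record shape (`name`, `kld`, `size_gb`) is an assumption for illustration, so inspect `magicquant.final-survivors.json` for the real field names before relying on it. The sample values are taken from the table above.

```python
import json  # the real file would be read with json.load(open(...))

# Hypothetical record shape; real field names in
# magicquant.final-survivors.json may differ.
sample_survivors = [
    {"name": "LM-Q8_0",     "kld": 0.001339, "size_gb": 3.99},
    {"name": "MQ-Q6_K_1",   "kld": 0.001817, "size_gb": 3.58},
    {"name": "MQ-Q4_K_M_1", "kld": 0.020346, "size_gb": 2.44},
]

def smallest_under_kld(survivors, max_kld):
    """Return the smallest survivor whose measured KLD stays under the budget."""
    ok = [s for s in survivors if s["kld"] <= max_kld]
    return min(ok, key=lambda s: s["size_gb"]) if ok else None

pick = smallest_under_kld(sample_survivors, max_kld=0.01)
print(pick["name"])  # MQ-Q6_K_1
```

With a 0.01 KLD budget, both Q8_0 and the Q6_K hybrid qualify and the smaller hybrid wins; tighten or loosen `max_kld` to trade quality against size.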
---
<details>
<summary>Replacement reason codes</summary>

- `STRICT_DOMINANCE` — the winner was no larger and had lower real KLD than the removed anchor.
- `NEAR_BASELINE_PREMIUM` — the winner used only the configured near-baseline size premium and beat the real linear KLD trade line.
- `INTERIOR_DISCOVERY` — the winner was selected as a useful interior point inside a size/KLD gap between anchors.
- `SPACING_COLLAPSE` — two candidates were too close in practical output space, so the stronger one was kept.
- `FINAL_DOMINANCE` — a later validated survivor dominated this artifact in final real benchmark comparison.

<a id="winner-notes"></a>

Underlined names in the table replaced or ultimately inherited the replacement of another artifact. Hover the name for the short replacement summary, or inspect `magicquant.replacements.json` for exact KLD/PPL/size deltas.
</details>
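The `STRICT_DOMINANCE` rule can be sketched in a few lines. This is an illustration of the stated criterion ("no larger and lower real KLD"), not MagicQuant's actual implementation; the winner's numbers come from the table above, while the anchor's numbers are hypothetical since replaced artifacts are not listed there.

```python
def strictly_dominates(winner, anchor):
    """STRICT_DOMINANCE: winner is no larger AND has strictly lower real KLD."""
    return winner["size_gb"] <= anchor["size_gb"] and winner["kld"] < anchor["kld"]

# Winner values from the survivor table; anchor values are hypothetical,
# standing in for a removed artifact (see magicquant.replacements.json).
winner = {"name": "MQ-Q4_K_M_1", "size_gb": 2.44, "kld": 0.020346}
anchor = {"name": "LM-Q4_K_M",   "size_gb": 2.50, "kld": 0.025000}

print(strictly_dominates(winner, anchor))  # True
```

Note that dominance is one-directional: a smaller file with worse KLD and a larger file with better KLD do not dominate each other, which is exactly the gap the interior-discovery and trade-line codes address.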
<details>
<summary>Provider credits</summary>

- [llama.cpp](https://github.com/ggml-org/llama.cpp) — Baseline quantization formats and llama.cpp tooling.
- [Unsloth](https://huggingface.co/unsloth/Qwen3-4B-Instruct-2507-GGUF) — External learned baseline source (UD).
</details>
<details>
<summary>Warning</summary>

External/custom baselines are normalized into MagicQuant's controlled comparison flow. MagicQuant may rebuild a learned baseline under native-source / MagicQuant-controlled conditions, including its own imatrix handling, so hybrids can be judged on a more equal footing. That does **not** mean MagicQuant proved the original upstream artifact or upstream imatrix was worse. These comparisons exist for internal hybrid-search consistency, not as a universal judgment of the original creator's exact release artifact.
</details>
---
## Support
I'm a solo developer working full-time for myself to achieve my dream, building open source on the side. If you like any of my work, buying me a coffee is always appreciated. Otherwise, I hope you enjoy it; maybe give me a star, or just send good vibes. Either way, thank you!

[Click here to see ways to support](https://sayou.biz/support) - BTC, PayPal, GitHub sponsors.