Ternary-Bonsai-8B-GGUF-llamacpp-compatible

Minarut/Ternary-Bonsai-8B-GGUF-llamacpp-compatible


ModelHub XC af6d1ad05f Initialize project; model provided by the ModelHub XC community

Model: Minarut/Ternary-Bonsai-8B-GGUF-llamacpp-compatible
Source: Original Platform

2026-06-19 07:11:18 +08:00

Files:
.gitattributes
README.md
Ternary-Bonsai-8B-Q4_0-lossless.gguf
Ternary-Bonsai-8B-TQ2_0-Q6out.gguf
Ternary-Bonsai-8B-TQ2_0.gguf

README.md

base_model: prism-ml/Ternary-Bonsai-8B-gguf
pipeline_tag: text-generation

Note

Why this repo? The original prism-ml/Ternary-Bonsai-8B-gguf requires a custom fork to run. This repository contains modified GGUF files that work out of the box with upstream llama.cpp, using its native ternary weights.
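As a minimal sketch of what "out of the box" could look like, the helper below assembles a command line for upstream llama.cpp's `llama-cli` binary using its `-m` (model path), `-p` (prompt) and `-n` (tokens to generate) options. The helper name `build_llama_cli_command`, the prompt and the token count are illustrative assumptions, not part of this repo. It also assumes `llama-cli` is on your PATH and that the GGUF file sits in the current directory.

```python
import shlex


def build_llama_cli_command(model_path, prompt, n_predict=64):
    """Return an argv list for llama.cpp's llama-cli (hypothetical helper)."""
    return [
        "llama-cli",
        "-m", model_path,      # path to the GGUF file
        "-p", prompt,          # prompt text
        "-n", str(n_predict),  # number of tokens to generate
    ]


cmd = build_llama_cli_command(
    "Ternary-Bonsai-8B-TQ2_0.gguf",
    "Explain ternary weights in one sentence.",
)

# Print a shell-safe version of the command for copy and paste.
print(shlex.join(cmd))

# To actually run it (requires llama.cpp to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list keeps the prompt as a single argument, so quotes and spaces in it need no manual escaping.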

📦 Available Files

Ternary-Bonsai-8B-TQ2_0 (2.12 GB) - ⭐ RECOMMENDED Gives great results, with an almost imperceptible difference in quality compared to the other files. Best balance of size and performance.
Ternary-Bonsai-8B-TQ2_0-Q6out (2.47 GB) Uses Q6 output tensors, which makes it slightly faster than the standard TQ2_0 because the llama.cpp ternary kernels are terrible. The difference is barely noticeable: about 12% (18 tps vs 16 tps on CPU).
Ternary-Bonsai-8B-Q4_0-lossless (4.61 GB) Practically meaningless for daily use. It is just the ternary weights packed into the standard 4-bit format, taking up 4.61 GB without any real benefit.
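The "lossless" label can be illustrated with a small sketch. It assumes ggml's Q4_0 convention, where each block stores one scale `d` plus 4-bit codes `q` that dequantize as `(q - 8) * d`, and it assumes every block of ternary weights shares a single scale. Whether the original model's scale groups actually line up with Q4_0 blocks is not stated in this README, and the function names and the scale value are hypothetical.

```python
def pack_ternary_as_q4_0(ternary, scale):
    """Encode ternary digits (-1, 0, 1) as Q4_0-style 4-bit codes.

    Assumes the Q4_0 dequantization rule: value = (q - 8) * d.
    """
    if any(t not in (-1, 0, 1) for t in ternary):
        raise ValueError("weights must be ternary")
    codes = [t + 8 for t in ternary]  # -1 -> 7, 0 -> 8, 1 -> 9
    return scale, codes


def dequantize_q4_0(d, codes):
    """Decode Q4_0-style codes back to floating-point weights."""
    return [(q - 8) * d for q in codes]


scale = 0.0371  # illustrative per-block scale, not taken from the model
ternary = [1, 0, -1, -1, 0, 1, 1, 0]
original = [t * scale for t in ternary]

d, codes = pack_ternary_as_q4_0(ternary, scale)
restored = dequantize_q4_0(d, codes)

print(codes)                 # only the codes 7, 8 and 9 are ever used
print(restored == original)  # the round trip reproduces every weight
```

Under these assumptions only 3 of the 16 possible 4-bit codes are used, which is consistent with the file being roughly twice the size of TQ2_0 without a quality gain.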

📊 Benchmarks

Note: I'm lazy, so I haven't run a full benchmark suite on every model yet; the GPQA Diamond results below are the only ones so far. If you run your own benchmarks, feel free to share them in the Community tab and I will add them!

Model	Size	GPQA Acc	Correct	Physics	Chemistry	Biology
TQ2_0	2.12 GB	34.34%	68/198	32/86	26/93	10/19
TQ2_0-Q6out	2.47 GB	34.34%	68/198	32/86	26/93	10/19
Q4_0-lossless	4.61 GB	34.34%	68/198	36/86	23/93	9/19
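
As a quick cross-check of the table, the sketch below recomputes the total correct count and accuracy from the per-subject numbers. The values are copied from the rows above; the function and variable names are only illustrative.

```python
# Questions per subject (physics, chemistry, biology), from the table.
TOTALS = (86, 93, 19)

# Correct answers per subject for each file, from the table.
RESULTS = {
    "TQ2_0": (32, 26, 10),
    "TQ2_0-Q6out": (32, 26, 10),
    "Q4_0-lossless": (36, 23, 9),
}


def gpqa_accuracy(correct_by_subject, totals=TOTALS):
    """Return (correct, total, accuracy in percent) summed over subjects."""
    correct = sum(correct_by_subject)
    total = sum(totals)
    return correct, total, 100.0 * correct / total


for name, counts in RESULTS.items():
    correct, total, acc = gpqa_accuracy(counts)
    print(f"{name}: {correct}/{total} = {acc:.2f}%")
```

All three files come out at 68/198 (34.34%), even though the per-subject split for Q4_0-lossless differs from the two TQ2_0 files.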