---
base_model: glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7
library_name: gguf
license: apache-2.0
language:
- en
- fr
tags:
- granite
- gguf
- quantized
- llama.cpp
- ollama
quantized_by: llama.cpp
pipeline_tag: text-generation
---

# granite-4.0-h-1b-DISTILL-glm-4.7-GGUF

GGUF quantized versions of granite-4.0-h-1b-DISTILL-glm-4.7.

## Available Formats

| Filename | Size | Quant Type |
|----------|------|------------|
| granite-4.0-h-1b-DISTILL-glm-4.7-f16.gguf | 2.73 GB | F16 |
| granite-4.0-h-1b-DISTILL-glm-4.7-q2_k.gguf | 0.55 GB | Q2_K |
| granite-4.0-h-1b-DISTILL-glm-4.7-q3_k_l.gguf | 0.71 GB | Q3_K_L |
| granite-4.0-h-1b-DISTILL-glm-4.7-q3_k_m.gguf | 0.68 GB | Q3_K_M |
| granite-4.0-h-1b-DISTILL-glm-4.7-q3_k_s.gguf | 0.65 GB | Q3_K_S |
| granite-4.0-h-1b-DISTILL-glm-4.7-q4_0.gguf | 0.81 GB | Q4_0 |
| granite-4.0-h-1b-DISTILL-glm-4.7-q4_1.gguf | 0.88 GB | Q4_1 |
| granite-4.0-h-1b-DISTILL-glm-4.7-q4_k_m.gguf | 0.84 GB | Q4_K_M |
| granite-4.0-h-1b-DISTILL-glm-4.7-q4_k_s.gguf | 0.81 GB | Q4_K_S |
| granite-4.0-h-1b-DISTILL-glm-4.7-q5_0.gguf | 0.96 GB | Q5_0 |
| granite-4.0-h-1b-DISTILL-glm-4.7-q5_1.gguf | 1.04 GB | Q5_1 |
| granite-4.0-h-1b-DISTILL-glm-4.7-q5_k_m.gguf | 0.98 GB | Q5_K_M |
| granite-4.0-h-1b-DISTILL-glm-4.7-q5_k_s.gguf | 0.96 GB | Q5_K_S |
| granite-4.0-h-1b-DISTILL-glm-4.7-q6_k.gguf | 1.12 GB | Q6_K |
| granite-4.0-h-1b-DISTILL-glm-4.7-q8_0.gguf | 1.45 GB | Q8_0 |

## Quick Start

### Ollama

```shell
# Use Q4_K_M (recommended)
ollama run hf.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF:Q4_K_M

# Or other quantizations
ollama run hf.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF:Q8_0
ollama run hf.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF:Q2_K
```
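If you drive the model programmatically rather than through the CLI, the same tags work over Ollama's local REST API. A minimal sketch, assuming the Ollama daemon is running on its default port 11434 and the Q4_K_M tag has already been pulled with `ollama run` or `ollama pull`:

```shell
# Build the request payload; "stream": false returns a single JSON
# response instead of a token-by-token stream.
payload='{
  "model": "hf.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF:Q4_K_M",
  "prompt": "Hello, how are you?",
  "stream": false
}'

# POST to Ollama's generate endpoint at its default local address.
curl -s http://localhost:11434/api/generate -d "$payload"
```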

### llama.cpp

```shell
# Download and run
llama-cli --hf-repo glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF --hf-file granite-4.0-h-1b-DISTILL-glm-4.7-q4_k_m.gguf -p "Hello, how are you?"

# With server
llama-server --hf-repo glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF --hf-file granite-4.0-h-1b-DISTILL-glm-4.7-q4_k_m.gguf -c 2048
```
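Once `llama-server` is up, it also exposes an OpenAI-compatible HTTP API. A minimal sketch, assuming the server is listening on its default address `127.0.0.1:8080`:

```shell
# Query llama-server's OpenAI-compatible chat completions endpoint.
# max_tokens here is an arbitrary small cap for a quick smoke test.
curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello, how are you?"}],
    "max_tokens": 64
  }'
```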

### LM Studio / GPT4All

Download the `.gguf` file of your choice and load it in your application.
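For applications that expect a local file, you can fetch a single quant directly. A sketch using Hugging Face's standard `resolve/main` download path; swap `FILE` for any other filename from the table above:

```shell
# Repo and filename as published above; FILE can be any quant from the table.
REPO="glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF"
FILE="granite-4.0-h-1b-DISTILL-glm-4.7-q4_k_m.gguf"
URL="https://huggingface.co/${REPO}/resolve/main/${FILE}"

# -L follows redirects (files are served from a CDN), -O keeps the filename.
curl -L -O "$URL"
```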

## Quantization Details

| Type | Bits | Use Case |
|------|------|----------|
| Q2_K | 2 | Extreme compression, low quality |
| Q3_K_M | 3 | Very compressed |
| Q4_K_M | 4 | **Recommended** - best size/quality balance |
| Q5_K_M | 5 | High quality |
| Q6_K | 6 | Very high quality |
| Q8_0 | 8 | Near lossless |
| F16 | 16 | Original precision |
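File size is only part of the memory story: at run time the weights plus the KV cache and runtime buffers all have to fit. A rough sizing sketch that picks the largest quant fitting a memory budget; the ~1 GB headroom figure is an assumption for illustration, not a measurement, and the sizes (in MB) come from the table above:

```shell
# Pick the largest quant whose file size plus headroom fits the budget.
budget_mb=2048    # total memory you are willing to spend
headroom_mb=1024  # assumed allowance for KV cache and runtime overhead
for entry in "q8_0:1450" "q6_k:1120" "q5_k_m:980" "q4_k_m:840" "q3_k_m:680" "q2_k:550"; do
  name=${entry%%:*}; size=${entry##*:}
  if [ $((size + headroom_mb)) -le "$budget_mb" ]; then
    echo "largest fit: $name (${size} MB)"   # → largest fit: q5_k_m (980 MB)
    break
  fi
done
```

With a 2 GB budget this lands on Q5_K_M; drop the budget to ~1.5 GB and it falls back to Q2_K, which matches the quality trade-offs in the table.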

## Original Model

These files are quantized versions of granite-4.0-h-1b-DISTILL-glm-4.7.

- **Base Model:** ibm-granite/granite-4.0-h-1b
- **Fine-tuning Dataset:** TeichAI/glm-4.7-2000x
- **Training Loss:** 0.6364