89 lines
5.2 KiB
Markdown
89 lines
5.2 KiB
Markdown
---
|
|
base_model: glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7
|
|
library_name: gguf
|
|
license: apache-2.0
|
|
language:
|
|
- en
|
|
- fr
|
|
tags:
|
|
- granite
|
|
- gguf
|
|
- quantized
|
|
- llama.cpp
|
|
- ollama
|
|
quantized_by: llama.cpp
|
|
pipeline_tag: text-generation
|
|
---
|
|
|
|
# granite-4.0-h-1b-DISTILL-glm-4.7-GGUF
|
|
|
|
GGUF quantized versions of [granite-4.0-h-1b-DISTILL-glm-4.7](https://huggingface.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7)
|
|
|
|
## Available Formats
|
|
|
|
| Filename | Size | Quant Type | Description |
|
|
|----------|------|------------|-------------|
|
|
| [granite-4.0-h-1b-DISTILL-glm-4.7-f16.gguf](https://huggingface.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF/blob/main/granite-4.0-h-1b-DISTILL-glm-4.7-f16.gguf) | 2.73 GB | GRANITE-4.0-H-1B-DISTILL-GLM-4.7-F16 | |
|
|
| [granite-4.0-h-1b-DISTILL-glm-4.7-q2_k.gguf](https://huggingface.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF/blob/main/granite-4.0-h-1b-DISTILL-glm-4.7-q2_k.gguf) | 0.55 GB | GRANITE-4.0-H-1B-DISTILL-GLM-4.7-Q2_K | |
|
|
| [granite-4.0-h-1b-DISTILL-glm-4.7-q3_k_l.gguf](https://huggingface.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF/blob/main/granite-4.0-h-1b-DISTILL-glm-4.7-q3_k_l.gguf) | 0.71 GB | GRANITE-4.0-H-1B-DISTILL-GLM-4.7-Q3_K_L | |
|
|
| [granite-4.0-h-1b-DISTILL-glm-4.7-q3_k_m.gguf](https://huggingface.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF/blob/main/granite-4.0-h-1b-DISTILL-glm-4.7-q3_k_m.gguf) | 0.68 GB | GRANITE-4.0-H-1B-DISTILL-GLM-4.7-Q3_K_M | |
|
|
| [granite-4.0-h-1b-DISTILL-glm-4.7-q3_k_s.gguf](https://huggingface.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF/blob/main/granite-4.0-h-1b-DISTILL-glm-4.7-q3_k_s.gguf) | 0.65 GB | GRANITE-4.0-H-1B-DISTILL-GLM-4.7-Q3_K_S | |
|
|
| [granite-4.0-h-1b-DISTILL-glm-4.7-q4_0.gguf](https://huggingface.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF/blob/main/granite-4.0-h-1b-DISTILL-glm-4.7-q4_0.gguf) | 0.81 GB | GRANITE-4.0-H-1B-DISTILL-GLM-4.7-Q4_0 | |
|
|
| [granite-4.0-h-1b-DISTILL-glm-4.7-q4_1.gguf](https://huggingface.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF/blob/main/granite-4.0-h-1b-DISTILL-glm-4.7-q4_1.gguf) | 0.88 GB | GRANITE-4.0-H-1B-DISTILL-GLM-4.7-Q4_1 | |
|
|
| [granite-4.0-h-1b-DISTILL-glm-4.7-q4_k_m.gguf](https://huggingface.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF/blob/main/granite-4.0-h-1b-DISTILL-glm-4.7-q4_k_m.gguf) | 0.84 GB | GRANITE-4.0-H-1B-DISTILL-GLM-4.7-Q4_K_M | |
|
|
| [granite-4.0-h-1b-DISTILL-glm-4.7-q4_k_s.gguf](https://huggingface.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF/blob/main/granite-4.0-h-1b-DISTILL-glm-4.7-q4_k_s.gguf) | 0.81 GB | GRANITE-4.0-H-1B-DISTILL-GLM-4.7-Q4_K_S | |
|
|
| [granite-4.0-h-1b-DISTILL-glm-4.7-q5_0.gguf](https://huggingface.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF/blob/main/granite-4.0-h-1b-DISTILL-glm-4.7-q5_0.gguf) | 0.96 GB | GRANITE-4.0-H-1B-DISTILL-GLM-4.7-Q5_0 | |
|
|
| [granite-4.0-h-1b-DISTILL-glm-4.7-q5_1.gguf](https://huggingface.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF/blob/main/granite-4.0-h-1b-DISTILL-glm-4.7-q5_1.gguf) | 1.04 GB | GRANITE-4.0-H-1B-DISTILL-GLM-4.7-Q5_1 | |
|
|
| [granite-4.0-h-1b-DISTILL-glm-4.7-q5_k_m.gguf](https://huggingface.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF/blob/main/granite-4.0-h-1b-DISTILL-glm-4.7-q5_k_m.gguf) | 0.98 GB | GRANITE-4.0-H-1B-DISTILL-GLM-4.7-Q5_K_M | |
|
|
| [granite-4.0-h-1b-DISTILL-glm-4.7-q5_k_s.gguf](https://huggingface.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF/blob/main/granite-4.0-h-1b-DISTILL-glm-4.7-q5_k_s.gguf) | 0.96 GB | GRANITE-4.0-H-1B-DISTILL-GLM-4.7-Q5_K_S | |
|
|
| [granite-4.0-h-1b-DISTILL-glm-4.7-q6_k.gguf](https://huggingface.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF/blob/main/granite-4.0-h-1b-DISTILL-glm-4.7-q6_k.gguf) | 1.12 GB | GRANITE-4.0-H-1B-DISTILL-GLM-4.7-Q6_K | |
|
|
| [granite-4.0-h-1b-DISTILL-glm-4.7-q8_0.gguf](https://huggingface.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF/blob/main/granite-4.0-h-1b-DISTILL-glm-4.7-q8_0.gguf) | 1.45 GB | GRANITE-4.0-H-1B-DISTILL-GLM-4.7-Q8_0 | |
|
|
|
|
|
|
## Quick Start
|
|
|
|
### Ollama
|
|
|
|
```bash
|
|
# Use Q4_K_M (recommended)
|
|
ollama run hf.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF:Q4_K_M
|
|
|
|
# Or other quantizations
|
|
ollama run hf.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF:Q8_0
|
|
ollama run hf.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF:Q2_K
|
|
```
|
|
|
|
### llama.cpp
|
|
|
|
```bash
|
|
# Download and run
|
|
llama-cli --hf-repo glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF --hf-file granite-4.0-h-1b-distill-glm-4.7-q4_k_m.gguf -p "Hello, how are you?"
|
|
|
|
# With server
|
|
llama-server --hf-repo glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7-GGUF --hf-file granite-4.0-h-1b-distill-glm-4.7-q4_k_m.gguf -c 2048
|
|
```
|
|
|
|
### LM Studio / GPT4All
|
|
|
|
Download the `.gguf` file of your choice and load it in your application.
|
|
|
|
## Quantization Details
|
|
|
|
| Type | Bits | Use Case |
|
|
|------|------|----------|
|
|
| Q2_K | 2 | Extreme compression, low quality |
|
|
| Q3_K_M | 3 | Very compressed |
|
|
| Q4_K_M | 4 | **Recommended** - Best size/quality |
|
|
| Q5_K_M | 5 | High quality |
|
|
| Q6_K | 6 | Very high quality |
|
|
| Q8_0 | 8 | Near lossless |
|
|
| F16 | 16 | Original precision |
|
|
|
|
## Original Model
|
|
|
|
This is the quantized version of [granite-4.0-h-1b-DISTILL-glm-4.7](https://huggingface.co/glogwa68/granite-4.0-h-1b-DISTILL-glm-4.7)
|
|
|
|
- **Base Model:** ibm-granite/granite-4.0-h-1b
|
|
- **Fine-tuning Dataset:** TeichAI/glm-4.7-2000x
|
|
- **Training Loss:** 0.6364
|