Model: Avtrkrb/granite-claude-h-350m-GGUF Source: Original Platform
| license | language | pipeline_tag | tags | base_model | library_name |
|---|---|---|---|---|---|
| apache-2.0 | | text-generation | | Avtrkrb/granite-claude-h-350m | gguf |
granite-claude-h-350m-GGUF
GGUF quantizations of:
Avtrkrb/granite-claude-h-350m
These files are intended for inference using:
- llama.cpp
- LM Studio
- Open WebUI
- Jan
- KoboldCpp
- GPT4All
- Ollama (after conversion/import)
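For the Ollama route, the import step can be sketched as follows. This is a minimal example, not a confirmed workflow for this repository: it assumes the Q4_K_M file has already been downloaded into the current directory under the name shown, and the local model name `granite-claude-h-350m` is a hypothetical choice made for illustration.

```shell
#!/bin/sh
# Assumed local file name; adjust to the quant you actually downloaded.
GGUF_FILE="granite-claude-h-350m-Q4_K_M.gguf"
# Hypothetical local model name chosen for this example.
MODEL_NAME="granite-claude-h-350m"

# Write a minimal Modelfile that points Ollama at the local GGUF file.
printf 'FROM ./%s\n' "$GGUF_FILE" > Modelfile

# Register and run the model only if Ollama is installed and the file exists.
if command -v ollama >/dev/null 2>&1 && [ -f "$GGUF_FILE" ]; then
  ollama create "$MODEL_NAME" -f Modelfile
  ollama run "$MODEL_NAME" "Explain quantum tunneling."
else
  echo "Skipping import: ollama or $GGUF_FILE not found." >&2
fi
```

A chat template or sampling parameters may also need to be set in the Modelfile; that depends on the model and is not covered here.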
Available Quantizations
Typical variants included:
| Quant | Use Case |
|---|---|
| Q4_K_M | Best size / quality balance |
| Q5_K_M | Higher quality |
| Q6_K | Near-lossless for most use cases |
| Q8_0 | Highest quality quantized version |
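To get a feel for the size side of the trade-off, the sketch below estimates file sizes from average bits per weight. Everything numeric here is an assumption: the parameter count is the nominal 350 million implied by the model name, and the bits-per-weight figures are rough values commonly quoted for llama.cpp quantization of larger models. Small models often deviate from them, so treat the output as an order-of-magnitude guide and check the actual file sizes in the repository.

```shell
#!/bin/sh
# Nominal parameter count implied by "350m"; the true count may differ.
PARAMS=350000000

estimate_mb() {
  # $1 = assumed average bits per weight
  # size in MB = parameters * bits / 8 bits-per-byte / 1e6 bytes-per-MB
  awk -v p="$PARAMS" -v b="$1" 'BEGIN { printf "%.0f\n", p * b / 8 / 1000000 }'
}

# name:bits-per-weight pairs (approximate, assumed values)
for entry in Q4_K_M:4.89 Q5_K_M:5.70 Q6_K:6.56 Q8_0:8.50; do
  name=${entry%%:*}
  bpw=${entry##*:}
  echo "$name ~ $(estimate_mb "$bpw") MB"
done
```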
Source Model
Merged model:
https://huggingface.co/Avtrkrb/granite-claude-h-350m
Dataset:
https://huggingface.co/datasets/Avtrkrb/combined-reasoning-claude
Example llama.cpp Usage
./llama-cli \
  -m granite-claude-h-350m-Q4_K_M.gguf \
  -p "Explain quantum tunneling."
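The same call can be wrapped in a small script that checks for the binary and the model file first and sets a few common generation options. This is a sketch under assumptions: the binary path and the option values (256 new tokens, a 2048-token context, temperature 0.7) are illustrative choices, not settings recommended by the model author.

```shell
#!/bin/sh
MODEL="granite-claude-h-350m-Q4_K_M.gguf"   # file name from the usage example
LLAMA_CLI="./llama-cli"                      # assumed path to the llama.cpp binary

run_prompt() {
  # $1 = prompt text
  if [ ! -x "$LLAMA_CLI" ] || [ ! -f "$MODEL" ]; then
    echo "llama-cli or model file not found; nothing to run." >&2
    return 1
  fi
  # -n: max tokens to generate, -c: context size, --temp: sampling temperature
  "$LLAMA_CLI" -m "$MODEL" -p "$1" -n 256 -c 2048 --temp 0.7
}

run_prompt "Explain quantum tunneling." || true
```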
Recommended Quant
For most users, Q4_K_M offers the best balance between:
- quality
- speed
- memory usage
License
This repository follows the licensing terms of the original Granite model (listed as apache-2.0 in the metadata above).