license: apache-2.0
language: en
pipeline_tag: text-generation
tags: granite, gguf, llama-cpp, reasoning, quantized, local-llm
base_model: Avtrkrb/granite-claude-h-350m
library_name: gguf

granite-claude-h-350m-GGUF

GGUF quantizations of:

Avtrkrb/granite-claude-h-350m

These files are intended for inference using:

  • llama.cpp
  • LM Studio
  • Open WebUI
  • Jan
  • KoboldCpp
  • GPT4All
  • Ollama (after importing the GGUF file via a Modelfile)
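
Ollama reads GGUF files through a Modelfile. A minimal sketch of that import, assuming the Q4_K_M file from this repository sits in the current directory. The model name `granite-claude-h-350m` passed to `ollama create` is an arbitrary local label chosen for illustration, not something this repository defines.

```shell
#!/bin/sh
# Write a one-line Modelfile pointing at the local GGUF file.
# The file name matches the Q4_K_M example used elsewhere in this README.
cat > Modelfile <<'EOF'
FROM ./granite-claude-h-350m-Q4_K_M.gguf
EOF

# Register and run the model only if Ollama is installed and the file exists.
# "granite-claude-h-350m" is a local name chosen for illustration.
if command -v ollama >/dev/null 2>&1 && [ -f granite-claude-h-350m-Q4_K_M.gguf ]; then
  ollama create granite-claude-h-350m -f Modelfile
  ollama run granite-claude-h-350m "Explain quantum tunneling."
else
  echo "Modelfile written; install Ollama and download the GGUF file to continue."
fi
```

The guard keeps the script harmless on machines without Ollama; it still leaves a usable Modelfile behind.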

Available Quantizations

Typical variants included:

Quant     Use Case
Q4_K_M    Best size / quality balance
Q5_K_M    Higher quality
Q6_K      Near-lossless for most use cases
Q8_0      Highest quality quantized version
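
To get a rough feel for how the variants differ in size, the sketch below multiplies a parameter count by the bits per weight of each base block format. Two assumptions apply: the 350 million parameter count is read off the model name rather than stated in this README, and the Q4_K_M and Q5_K_M figures use the plain Q4_K and Q5_K block sizes, so the actual files will be somewhat larger because those mixes keep some tensors at higher precision. The function name `estimate_sizes` is hypothetical.

```shell
#!/bin/sh
# Estimate GGUF file size in MB from a parameter count.
# Bits-per-weight values are those of the base block formats; the _M
# variants keep a few tensors at higher precision, so real files are larger.
estimate_sizes() {
  awk -v params="$1" 'BEGIN {
    bpw["Q4_K_M"] = 4.5      # Q4_K block: 144 bytes per 256 weights
    bpw["Q5_K_M"] = 5.5      # Q5_K block: 176 bytes per 256 weights
    bpw["Q6_K"]   = 6.5625   # Q6_K block: 210 bytes per 256 weights
    bpw["Q8_0"]   = 8.5      # Q8_0 block: 34 bytes per 32 weights
    n = split("Q4_K_M Q5_K_M Q6_K Q8_0", order, " ")
    for (i = 1; i <= n; i++) {
      q = order[i]
      printf "%-7s %6.1f MB\n", q, params * bpw[q] / 8 / 1e6
    }
  }'
}

# 350 million parameters is an assumption read off the model name.
estimate_sizes 350000000
```

These are lower bounds on tensor data only; file metadata and the tokenizer add a little on top.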

Source Model

Merged model:

https://huggingface.co/Avtrkrb/granite-claude-h-350m

Dataset:

https://huggingface.co/datasets/Avtrkrb/combined-reasoning-claude


Example llama.cpp Usage

./llama-cli \
  -m granite-claude-h-350m-Q4_K_M.gguf \
  -p "Explain quantum tunneling."
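
A slightly fuller invocation, as a sketch: it fetches the Q4_K_M file with the Hugging Face CLI if it is missing, then passes a few common `llama-cli` options. The repository ID is assumed from this model card's name, and the values for `-n`, `-c`, and `--temp` are illustrative choices, not recommendations from the model author.

```shell
#!/bin/sh
MODEL=granite-claude-h-350m-Q4_K_M.gguf
REPO=Avtrkrb/granite-claude-h-350m-GGUF   # assumed repository ID

# Download the file once if it is not already present.
if [ ! -f "$MODEL" ] && command -v huggingface-cli >/dev/null 2>&1; then
  huggingface-cli download "$REPO" "$MODEL" --local-dir .
fi

# -n caps generated tokens, -c sets the context size,
# --temp sets the sampling temperature.
if [ -f "$MODEL" ] && [ -x ./llama-cli ]; then
  ./llama-cli -m "$MODEL" -n 256 -c 2048 --temp 0.7 \
    -p "Explain quantum tunneling."
else
  echo "Skipping run: $MODEL or ./llama-cli not found."
fi
```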

For most users, Q4_K_M offers the best balance between quality, speed, and memory usage.

License

This repository follows the licensing terms of the original Granite model; the repository metadata lists the license as apache-2.0.
