14f619e6b0df3ae9c9415b22d03b77b4cf803af2
Model: Eclipse-Senpai/KeyLM-75M-Instruct-GGUF Source: Original Platform
license, language, base_model, base_model_relation, pipeline_tag, library_name, tags
| license | language | base_model | base_model_relation | pipeline_tag | library_name | tags | ||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| apache-2.0 |
|
Eclipse-Senpai/KeyLM-75M-Instruct | quantized | text-generation | gguf |
|
KeyLM-75M-Instruct-GGUF
GGUF builds of KeyLM-75M-Instruct for llama.cpp, LM Studio, Ollama, and other GGUF runtimes.
KeyLM is a 75M-parameter instruction-tuned language model trained from scratch on approximately 18 billion tokens. See the main model card for benchmarks, training details, limitations, and the transformers (safetensors) version.
Files
| File | Quant | Size | Notes |
|---|---|---|---|
KeyLM-75M-Instruct.F16.gguf |
F16 | ~144 MB | Full precision and recommended. The model is already tiny, so there is little reason to quantize further. |
Run with llama.cpp
# straight from the Hub
llama-cli -hf Eclipse-Senpai/KeyLM-75M-Instruct-GGUF -cnv
# or a local file
llama-cli -m KeyLM-75M-Instruct.F16.gguf -cnv
The chat template (User: / Assistant:, assistant turns ending with </s>) is embedded in the GGUF, so conversation mode (-cnv) applies it automatically.
LM Studio / Ollama
- LM Studio: load the
.gguf; the embedded chat template is detected automatically. - Ollama:
ollama run hf.co/Eclipse-Senpai/KeyLM-75M-Instruct-GGUF
Notes & limitations
KeyLM is a tiny model: good at simple instruction following and short chat, near random chance on knowledge/reasoning benchmarks. It is not a factual assistant. Full numbers and caveats are on the main model card.
License
Apache 2.0.
Description