---
license: apache-2.0
language:
- en
base_model: Eclipse-Senpai/KeyLM-75M-Instruct
base_model_relation: quantized
pipeline_tag: text-generation
library_name: gguf
tags:
- keylm
- gguf
- llama.cpp
- small-language-model
- instruct
---

# KeyLM-75M-Instruct-GGUF

GGUF builds of [**KeyLM-75M-Instruct**](https://huggingface.co/Eclipse-Senpai/KeyLM-75M-Instruct) for `llama.cpp`, LM Studio, Ollama, and other GGUF runtimes.

KeyLM is a 75M-parameter instruction-tuned language model trained from scratch on approximately 18 billion tokens. See the [main model card](https://huggingface.co/Eclipse-Senpai/KeyLM-75M-Instruct) for benchmarks, training details, limitations, and the `transformers` (safetensors) version.

## Files

| File | Quant | Size | Notes |
|---|---|---|---|
| `KeyLM-75M-Instruct.F16.gguf` | F16 | ~144 MB | Unquantized 16-bit weights (recommended). The model is already tiny, so there is little reason to quantize further. |

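After downloading, a quick sanity check can catch a truncated or mislabeled file. GGUF files begin with the 4-byte ASCII magic `GGUF`; the sketch below tests only that and does not validate the rest of the file. The helper name `is_gguf` is hypothetical and not part of any tool mentioned here.

```bash
# Hypothetical helper (not part of llama.cpp): succeeds if the file
# starts with the 4-byte ASCII magic "GGUF".
# This checks the header magic only, not the integrity of the whole file.
is_gguf() {
  [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ]
}
```

For example, `is_gguf KeyLM-75M-Instruct.F16.gguf && echo "header looks like GGUF"`.
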
## Run with llama.cpp

```bash
# straight from the Hub
llama-cli -hf Eclipse-Senpai/KeyLM-75M-Instruct-GGUF -cnv

# or a local file
llama-cli -m KeyLM-75M-Instruct.F16.gguf -cnv
```

The chat template (`User:` / `Assistant:`, assistant turns ending with `</s>`) is embedded in the GGUF, so conversation mode (`-cnv`) applies it automatically.

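If you bypass conversation mode and pass a raw prompt, you have to lay out the turns yourself. The following is a minimal sketch of a single-turn prompt in the `User:` / `Assistant:` format. The helper name `format_prompt` is hypothetical, and the exact whitespace (a space after `User:`, one newline between turns) is an assumption; the template embedded in the GGUF is authoritative.

```bash
# Hypothetical helper: builds a single-turn prompt in the User:/Assistant: layout.
# ASSUMPTION: a space after "User:" and a single newline between turns.
# Check the chat template embedded in the GGUF for the exact format.
format_prompt() {
  # The prompt ends at "Assistant:" so the model generates the reply,
  # which it is expected to terminate with </s>.
  printf 'User: %s\nAssistant:' "$1"
}
```

For example, `format_prompt "Write a haiku about rain."` prints the two-line prompt, which can then be supplied to a runtime as a plain completion prompt.
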
## LM Studio / Ollama

- **LM Studio:** load the `.gguf`; the embedded chat template is detected automatically.
- **Ollama:** `ollama run hf.co/Eclipse-Senpai/KeyLM-75M-Instruct-GGUF`

## Notes & limitations

KeyLM is a tiny model. It handles simple instruction following and short chat, but scores near random chance on knowledge and reasoning benchmarks. It is not a factual assistant. Full numbers and caveats are on the [main model card](https://huggingface.co/Eclipse-Senpai/KeyLM-75M-Instruct).

## License

Apache 2.0.