---
language: en
license: apache-2.0
library_name: gguf
pipeline_tag: text-generation
tags:
- text-generation
- gguf
- llama.cpp
- ollama
base_model: YoAbriel/KodaLite-1.3B
---

# KodaLite-1.3B — GGUF quantizations

GGUF quantizations of [YoAbriel/KodaLite-1.3B](https://huggingface.co/YoAbriel/KodaLite-1.3B) for llama.cpp, Ollama, and other GGUF-compatible runtimes.

## Files

| File | Quant | Size | Use case |
|---|---|---|---|
| kodalite-f16.gguf | F16 | ~2.5 GB | Unquantized 16-bit reference |
| kodalite-Q8_0.gguf | Q8_0 | ~1.3 GB | Near-lossless |
| kodalite-Q4_K_M.gguf | Q4_K_M | ~800 MB | Best size/quality tradeoff |

## Usage

### llama.cpp
```bash
llama-cli -m kodalite-Q4_K_M.gguf --reverse-prompt '<|end|>' -p '<|user|>\nHello\n<|assistant|>\n' -n 150
```
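
The prompt passed with `-p` follows a fixed single-turn layout: a `<|user|>` marker, the message, then an `<|assistant|>` marker. As a minimal sketch, the hypothetical helper `build_prompt` below (not part of llama.cpp) produces that string for an arbitrary message. It emits literal `\n` sequences, as the single-quoted prompt in the command does, and so assumes that `llama-cli` expands those escapes.

```bash
# Hypothetical helper (not part of llama.cpp): wrap one user message in the
# <|user|> / <|assistant|> markers used by the llama-cli command.
# The output contains literal backslash-n sequences, exactly like the
# single-quoted prompt in that command; this assumes llama-cli expands them.
build_prompt() {
  printf '<|user|>\\n%s\\n<|assistant|>\\n' "$1"
}
```

It could then be used as `llama-cli -m kodalite-Q4_K_M.gguf --reverse-prompt '<|end|>' -p "$(build_prompt 'Hello')" -n 150`. Backslashes inside the message itself would also be treated as escapes, so this sketch suits plain text only.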

### Ollama
```bash
cat > Modelfile << 'EOF'
FROM ./kodalite-Q4_K_M.gguf
TEMPLATE """<|user|>
{{ .Prompt }}
<|assistant|>
"""
PARAMETER stop "<|end|>"
EOF
ollama create kodalite -f Modelfile
ollama run kodalite
```
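
Once `ollama create` has registered the model, it can also be queried non-interactively through Ollama's local REST API. The sketch below assumes the Ollama server is running on its default address, `localhost:11434`, and that `curl` is installed. The function names `ollama_payload` and `ask_kodalite` are hypothetical helpers, not part of Ollama.

```bash
# Hypothetical helpers for querying the "kodalite" model through Ollama's
# local REST API (assumed to listen on the default localhost:11434).

# Build the JSON body for /api/generate. No JSON escaping is done, so the
# prompt must not contain double quotes, backslashes, or newlines.
ollama_payload() {
  printf '{"model":"kodalite","prompt":"%s","stream":false}' "$1"
}

# Send one prompt and print the raw JSON response.
ask_kodalite() {
  curl -s http://localhost:11434/api/generate -d "$(ollama_payload "$1")"
}
```

Calling `ask_kodalite 'Hello'` should return a single JSON object, because `"stream": false` asks for the whole reply at once. For prompts with quotes or newlines, a JSON-aware tool would be a safer way to build the payload.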

### LM Studio / Jan
Load the `.gguf` file directly and set the stop sequence to `<|end|>`.

## License

Apache 2.0