Model: AksaraLLM/Kiel-Pro-0.5B-v3-GGUF
Source: Original Platform
2026-06-16 17:27:18 +08:00
---
language:
- id
license: apache-2.0
library_name: gguf
pipeline_tag: text-generation
base_model: AksaraLLM/Kiel-Pro-0.5B-v3
tags:
- gguf
- llama.cpp
- ollama
- indonesian
- aksarallm
- qwen2
---
# Kiel-Pro-0.5B-v3-GGUF
GGUF quantizations of [`AksaraLLM/Kiel-Pro-0.5B-v3`](https://huggingface.co/AksaraLLM/Kiel-Pro-0.5B-v3) for inference with [llama.cpp](https://github.com/ggml-org/llama.cpp), [Ollama](https://ollama.ai), [LM Studio](https://lmstudio.ai), and other GGUF runtimes.
## Files
| File | Quant | Size | Recommended use |
|---|---|---|---|
| `Kiel-Pro-0.5B-v3.f16.gguf` | F16 | 0.99 GB | lossless from safetensors |
| `Kiel-Pro-0.5B-v3.q8_0.gguf` | Q8_0 | 0.53 GB | near-lossless, ~1.9× smaller than F16 |
| `Kiel-Pro-0.5B-v3.q6_k.gguf` | Q6_K | 0.51 GB | high quality, ~1.9× smaller than F16 |
| `Kiel-Pro-0.5B-v3.q5_k_m.gguf` | Q5_K_M | 0.42 GB | good quality, ~2.4× smaller than F16 |
| `Kiel-Pro-0.5B-v3.q4_k_m.gguf` | Q4_K_M | 0.40 GB | recommended default, ~2.5× smaller than F16 |
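
The size reduction relative to F16 follows directly from the sizes listed in the table. A minimal sketch of that arithmetic, assuming only the table's GB figures (the helper name `ratio_vs_f16` is made up for this example):

```bash
# Sizes in GB, copied from the table above.
F16_GB=0.99

# Print how many times smaller a quantized file is than the F16 file.
ratio_vs_f16() {
  awk -v f16="$F16_GB" -v size="$1" 'BEGIN { printf "%.1f", f16 / size }'
}

for entry in q8_0:0.53 q6_k:0.51 q5_k_m:0.42 q4_k_m:0.40; do
  name=${entry%%:*}   # text before the colon
  size=${entry##*:}   # text after the colon
  echo "$name: $(ratio_vs_f16 "$size")x smaller than F16"
done
```

The table sizes are rounded to two decimals, so the ratios are approximate.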
## CPU benchmark (AMD EPYC 7763, 2 threads, AVX2)
| Quant | Prompt eval (32 tok) | Generation (16 tok) |
|---|---:|---:|
| `q4_k_m` | **36.7 tok/s** | **20.1 tok/s** |
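These speeds were measured on only two threads of a server CPU, so the 494M-parameter model at q4_k_m should also run comfortably on a typical laptop CPU. Only q4_k_m was benchmarked here; larger quants (q5_k_m, q6_k, q8_0) are expected to trade some speed for better quality.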
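
To get a feel for what these throughput numbers mean in wall-clock time, here is a rough back-of-the-envelope sketch. It assumes throughput stays constant as prompt and output length grow, which is only approximately true, and the helper name `estimate_seconds` is hypothetical:

```bash
PROMPT_TPS=36.7   # prompt eval speed from the table above (tok/s)
GEN_TPS=20.1      # generation speed from the table above (tok/s)

# Usage: estimate_seconds <prompt_tokens> <generated_tokens>
estimate_seconds() {
  awk -v p="$1" -v g="$2" -v pt="$PROMPT_TPS" -v gt="$GEN_TPS" \
    'BEGIN { printf "%.2f", p / pt + g / gt }'
}

echo "32-token prompt + 64 generated tokens: ~$(estimate_seconds 32 64) s"
```

Treat the result as an order-of-magnitude guide for the benchmarked machine, not a prediction for other hardware.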
## Quick start — llama.cpp
```bash
huggingface-cli download AksaraLLM/Kiel-Pro-0.5B-v3-GGUF Kiel-Pro-0.5B-v3.q4_k_m.gguf --local-dir .
./llama-cli -m Kiel-Pro-0.5B-v3.q4_k_m.gguf -p "Indonesia adalah" -n 64
```
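
If `llama-cli` refuses to load the file, it can help to check that the download really is a GGUF file and not, say, an HTML error page. GGUF files start with the four ASCII bytes `GGUF`; the sketch below checks only that magic and nothing else about the file. The helper name `is_gguf` is made up for this example:

```bash
# Succeeds only if the file's first four bytes are the ASCII magic "GGUF".
is_gguf() {
  [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ]
}

model=Kiel-Pro-0.5B-v3.q4_k_m.gguf
if is_gguf "$model"; then
  echo "$model looks like a GGUF file"
else
  echo "$model is missing or not a GGUF file; try downloading it again" >&2
fi
```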
## Quick start — Ollama
```bash
huggingface-cli download AksaraLLM/Kiel-Pro-0.5B-v3-GGUF Kiel-Pro-0.5B-v3.q4_k_m.gguf Modelfile --local-dir .
ollama create aksara-kiel-pro-0.5b-v3 -f Modelfile
ollama run aksara-kiel-pro-0.5b-v3 "Apa ibukota Indonesia?"
```
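
The `Modelfile` downloaded above is the one to use. For readers who want to see what such a file looks like or experiment with their own settings, here is a minimal hypothetical one. The parameter values are illustrative assumptions, not the authors' settings, and it omits any chat template, so it relies on Ollama's defaults:

```bash
# Written to Modelfile.example so the Modelfile shipped in the repo is not overwritten.
cat > Modelfile.example <<'EOF'
# Path to the downloaded GGUF, relative to this file.
FROM ./Kiel-Pro-0.5B-v3.q4_k_m.gguf

# Illustrative sampling settings, not the authors' values.
PARAMETER temperature 0.7
PARAMETER num_ctx 2048
EOF

# To build from it instead: ollama create aksara-kiel-pro-0.5b-v3 -f Modelfile.example
```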
## Source model
See [`AksaraLLM/Kiel-Pro-0.5B-v3`](https://huggingface.co/AksaraLLM/Kiel-Pro-0.5B-v3) for architecture, training data, eval results, and limitations.
## Conversion provenance
- Converted with [`convert_hf_to_gguf.py`](https://github.com/ggml-org/llama.cpp/blob/master/convert_hf_to_gguf.py) from llama.cpp
- Quantized with `llama-quantize` from the same build
- Architecture detected as `qwen2`
- All files listed above are reproducible from the source HF safetensors
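
The steps above might look roughly like the following. This is a sketch, not the authors' exact commands: the directory layout (`./llama.cpp`, `build/bin/`, `./Kiel-Pro-0.5B-v3`) is assumed, and the `run` and `reproduce` helpers are made up for this example. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```bash
# Hypothetical paths; adjust to your llama.cpp checkout and source model download.
LLAMA_CPP_DIR=${LLAMA_CPP_DIR:-./llama.cpp}
SRC_DIR=${SRC_DIR:-./Kiel-Pro-0.5B-v3}
NAME=Kiel-Pro-0.5B-v3

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

reproduce() {
  # Step 1: HF safetensors -> F16 GGUF.
  run python3 "$LLAMA_CPP_DIR/convert_hf_to_gguf.py" "$SRC_DIR" \
    --outfile "$NAME.f16.gguf" --outtype f16
  # Step 2: quantize the F16 file to each listed type.
  for quant in Q8_0 Q6_K Q5_K_M Q4_K_M; do
    lower=$(printf '%s' "$quant" | tr '[:upper:]' '[:lower:]')
    run "$LLAMA_CPP_DIR/build/bin/llama-quantize" \
      "$NAME.f16.gguf" "$NAME.$lower.gguf" "$quant"
  done
}

reproduce
```

Results may differ between llama.cpp versions, so byte-identical files are only likely when using the same build as the original conversion.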