| license | language | base_model | tags | datasets |
|---|---|---|---|---|
| apache-2.0 | | TinyLlama/TinyLlama-1.1B-Chat-v1.0 | gguf, llama-cpp, quantized, tinyllama, lora, alpaca, text-generation | |
TinyLlama 1.1B — LoRA (Alpaca) — GGUF quantizations
GGUF weights for TinyLlama-1.1B-Chat fine-tuned with LoRA on Alpaca-style instructions (fused HF checkpoint → F16 GGUF → llama-quantize).
Files
| File | Quantization | ~Size |
|---|---|---|
| model-Q4_K_M.gguf | Q4_K_M | ~637 MB |
| model-Q5_K_M.gguf | Q5_K_M | ~746 MB |
| model-Q8_0.gguf | Q8_0 | ~1.1 GB |
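As a rough guide to what the listed sizes imply, the sketch below estimates average bits per weight for each file. It assumes about 1.1 billion parameters (taken from the model name) and decimal units (1 MB = 10^6 bytes). Both are assumptions, the listed sizes are approximate, and GGUF files also carry metadata and some tensors stored at other precisions, so treat the results as ballpark figures only.

```python
# Approximate sizes copied from the Files table, in bytes (decimal units assumed).
APPROX_SIZE_BYTES = {
    "model-Q4_K_M.gguf": 637e6,
    "model-Q5_K_M.gguf": 746e6,
    "model-Q8_0.gguf": 1.1e9,
}

# Assumption: ~1.1 billion parameters, inferred from the model name.
ASSUMED_PARAMS = 1.1e9


def bits_per_weight(size_bytes: float, n_params: float = ASSUMED_PARAMS) -> float:
    """Average bits stored per parameter, ignoring metadata overhead."""
    return size_bytes * 8 / n_params


for name, size in APPROX_SIZE_BYTES.items():
    print(f"{name}: ~{bits_per_weight(size):.1f} bits/weight")
```

Smaller files trade some output quality for lower memory use; which trade-off is acceptable depends on the hardware and task.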
Usage (llama.cpp)
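A minimal sketch of one way to run a file from the table with llama.cpp's `llama-cli`, assuming the binary is on your PATH and the GGUF has been downloaded to the working directory. The prompt text and token limit are illustrative, not from the original card. Flag names and defaults can differ between llama.cpp versions (newer builds may start an interactive conversation when the GGUF embeds a chat template), so check `llama-cli --help` for your build.

```python
import shlex


def llama_cli_command(model_path: str, prompt: str, n_predict: int = 128) -> list[str]:
    """Build an argument list for llama-cli: -m model, -p prompt, -n max new tokens."""
    return ["llama-cli", "-m", model_path, "-p", prompt, "-n", str(n_predict)]


# Illustrative prompt; any file name from the Files table can be substituted.
cmd = llama_cli_command(
    "model-Q4_K_M.gguf",
    "Give three tips for writing clear commit messages.",
)

# Print a copy-pasteable shell command.
print(shlex.join(cmd))

# To execute it from Python instead of copying the printed line:
#   import subprocess
#   subprocess.run(cmd, check=True)
```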
Provenance
- Base: TinyLlama/TinyLlama-1.1B-Chat-v1.0
- Conversion: llama.cpp/convert_hf_to_gguf.py (F16), then llama-quantize
- Chat template is embedded in the GGUF (TinyLlama chat format)
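
The conversion steps listed above can be sketched as a short script that prints the commands involved. This is a reconstruction under assumptions, not the exact commands used for these files: the checkpoint directory `fused-checkpoint` and the intermediate file name `model-F16.gguf` are hypothetical, and it assumes a llama.cpp checkout at `llama.cpp/` with `llama-quantize` on your PATH.

```python
import shlex

# Quantization types listed in the Files table.
QUANT_TYPES = ("Q4_K_M", "Q5_K_M", "Q8_0")


def conversion_commands(hf_dir, f16_path="model-F16.gguf", quants=QUANT_TYPES):
    """Return the HF -> F16 GGUF step followed by one llama-quantize step per type."""
    cmds = [[
        "python", "llama.cpp/convert_hf_to_gguf.py", hf_dir,
        "--outfile", f16_path, "--outtype", "f16",
    ]]
    for q in quants:
        # llama-quantize takes: input GGUF, output GGUF, quantization type.
        cmds.append(["llama-quantize", f16_path, f"model-{q}.gguf", q])
    return cmds


# "fused-checkpoint" is a hypothetical directory holding the merged HF model.
for cmd in conversion_commands("fused-checkpoint"):
    print(shlex.join(cmd))
```

Quantizing each type from the same F16 file, rather than from another quantized file, avoids compounding quantization error.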
Related