From 521db1424743b05935291fffdb38020c77af8694 Mon Sep 17 00:00:00 2001
From: ModelHub XC
Date: Sun, 12 Apr 2026 19:25:57 +0800
Subject: [PATCH] Initialize project, model provided by the ModelHub XC community
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Model: ysingh-aiml/tinyllama-alpaca-lora-gguf
Source: Original Platform
---
 .gitattributes    | 38 ++++++++++++++++++++++++++++++++++++++
 README.md         | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 model-Q4_K_M.gguf |  3 +++
 model-Q5_K_M.gguf |  3 +++
 model-Q8_0.gguf   |  3 +++
 5 files changed, 93 insertions(+)
 create mode 100644 .gitattributes
 create mode 100644 README.md
 create mode 100644 model-Q4_K_M.gguf
 create mode 100644 model-Q5_K_M.gguf
 create mode 100644 model-Q8_0.gguf

diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 0000000..ca1ec84
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,38 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+model-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+model-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+model-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..1faf59e
--- /dev/null
+++ b/README.md
@@ -0,0 +1,46 @@
+---
+license: apache-2.0
+language:
+  - en
+base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+tags:
+  - gguf
+  - llama-cpp
+  - quantized
+  - tinyllama
+  - lora
+  - alpaca
+  - text-generation
+datasets:
+  - tatsu-lab/alpaca
+---
+
+# TinyLlama 1.1B — LoRA (Alpaca) — GGUF quantizations
+
+GGUF weights for **TinyLlama-1.1B-Chat** fine-tuned with **LoRA** on Alpaca-style instructions (fused HF checkpoint → F16 GGUF → `llama-quantize`).
+
+## Files
+
+| File | Quantization | ~Size |
+|------|----------------|-------|
+| `model-Q4_K_M.gguf` | Q4_K_M | ~637 MB |
+| `model-Q5_K_M.gguf` | Q5_K_M | ~746 MB |
+| `model-Q8_0.gguf` | Q8_0 | ~1.1 GB |
+
+## Usage (llama.cpp)
+
+```bash
+llama-cli -m model-Q4_K_M.gguf -p "Hello" -n 128
+# or
+llama-server -m model-Q4_K_M.gguf --host 0.0.0.0 --port 8080
+```
+
+## Provenance
+
+- Base: [`TinyLlama/TinyLlama-1.1B-Chat-v1.0`](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0)
+- Conversion: `llama.cpp/convert_hf_to_gguf.py` (F16), then `llama-quantize`
+- Chat template is embedded in the GGUF (TinyLlama chat format)
+
+## Related
+
+- Benchmark Space: [`ysingh-aiml/tinyllama-quantization-gguf`](https://huggingface.co/spaces/ysingh-aiml/tinyllama-quantization-gguf)
diff --git a/model-Q4_K_M.gguf b/model-Q4_K_M.gguf
new file mode 100644
index 0000000..316a503
--- /dev/null
+++ b/model-Q4_K_M.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:11ecf83c290153335dc632095d5f281af53f776fd430f2b194a4740d32b42dc5
+size 667816672
diff --git a/model-Q5_K_M.gguf b/model-Q5_K_M.gguf
new file mode 100644
index 0000000..b07891d
--- /dev/null
+++ b/model-Q5_K_M.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ebaa13b8821c87a343c9d56140aa977d6ce3198f0e61622e49ba6734d7609206
+size 782045920
diff --git a/model-Q8_0.gguf b/model-Q8_0.gguf
new file mode 100644
index 0000000..c5dfe9f
--- /dev/null
+++ b/model-Q8_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:416d5c03784e59b0a26b04a31e1f627c100adffda561061485a02ec6f7ec91c0
+size 1169810144
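
Note on the LFS pointers above: each `.gguf` entry in this patch is a three-line Git LFS pointer (`version`, `oid sha256:...`, `size`), not the weights themselves. A downloaded file can be checked against its pointer roughly as sketched below. This is a minimal sketch, not part of the commit: the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local path `model-Q4_K_M.gguf` is assumed to be where the file was saved. The pointer text is copied from the Q4_K_M hunk. The printed size uses MiB (bytes / 2**20), which appears to be the unit behind the README's "~637 MB" figure.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)  # e.g. "sha256", "<hex digest>"
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path: Path, pointer: dict, chunk: int = 1 << 20) -> bool:
    """Return True if the file's byte size and hash both match the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first, avoids hashing a truncated download
    h = hashlib.new(pointer["algo"])
    with path.open("rb") as f:
        # Read in chunks so a file of about 1 GB is never fully held in memory.
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest() == pointer["digest"]


# Pointer contents copied from the model-Q4_K_M.gguf hunk in this patch.
POINTER_Q4 = """\
version https://git-lfs.github.com/spec/v1
oid sha256:11ecf83c290153335dc632095d5f281af53f776fd430f2b194a4740d32b42dc5
size 667816672
"""

if __name__ == "__main__":
    ptr = parse_lfs_pointer(POINTER_Q4)
    print(f"expected size: {ptr['size'] / 2**20:.1f} MiB")
    target = Path("model-Q4_K_M.gguf")  # assumed local download path
    if target.exists():
        print("OK" if matches_pointer(target, ptr) else "MISMATCH")
```

The size comparison runs before hashing because it is nearly free and catches interrupted downloads. The same two functions work for the Q5_K_M and Q8_0 pointers if their text is substituted.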