commit 9c30b3ff025a30ec390e7f0a71672f5bd72c211e
Author: ModelHub XC
Date:   Sat Apr 11 13:30:59 2026 +0800

    Initialize project; model provided by the ModelHub XC community

    Model: daniloreddy/DeepSeek-Coder-V2-Lite-Instruct_GGUF
    Source: Original Platform

diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 0000000..4859226
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,41 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+DeepSeek-Coder-V2-Lite-Instruct_fp16.gguf filter=lfs diff=lfs merge=lfs -text
+DeepSeek-Coder-V2-Lite-Instruct_Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+DeepSeek-Coder-V2-Lite-Instruct_Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+DeepSeek-Coder-V2-Lite-Instruct_Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+DeepSeek-Coder-V2-Lite-Instruct_Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+DeepSeek-Coder-V2-Lite-Instruct_Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
diff --git a/DeepSeek-Coder-V2-Lite-Instruct_Q4_K_M.gguf b/DeepSeek-Coder-V2-Lite-Instruct_Q4_K_M.gguf
new file mode 100644
index 0000000..4782c6a
--- /dev/null
+++ b/DeepSeek-Coder-V2-Lite-Instruct_Q4_K_M.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cc48c99234ec5f7ada31c894743ae51a7e90bfab69354ebe138faed565b76f43
+size 10367958240
diff --git a/DeepSeek-Coder-V2-Lite-Instruct_Q4_K_S.gguf b/DeepSeek-Coder-V2-Lite-Instruct_Q4_K_S.gguf
new file mode 100644
index 0000000..2f92399
--- /dev/null
+++ b/DeepSeek-Coder-V2-Lite-Instruct_Q4_K_S.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7a8913b65905410bec6299d91d3cce45f63f21f33d87e5a97d8ad1eca892b13e
+size 9537150176
diff --git a/DeepSeek-Coder-V2-Lite-Instruct_Q5_K_M.gguf b/DeepSeek-Coder-V2-Lite-Instruct_Q5_K_M.gguf
new file mode 100644
index 0000000..eb3cf44
--- /dev/null
+++ b/DeepSeek-Coder-V2-Lite-Instruct_Q5_K_M.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:38eb17e3f919f60c5b253ce7ae13023990bcc80526aeff735ce37f54a097d0cc
+size 11853085920
diff --git a/DeepSeek-Coder-V2-Lite-Instruct_Q5_K_S.gguf b/DeepSeek-Coder-V2-Lite-Instruct_Q5_K_S.gguf
new file mode 100644
index 0000000..3024274
--- /dev/null
+++ b/DeepSeek-Coder-V2-Lite-Instruct_Q5_K_S.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a2e0227c60c56f4adfa69b0a69d2701cdf35c02ffe326a604f7c12ac10c4d0d2
+size 11144830176
diff --git a/DeepSeek-Coder-V2-Lite-Instruct_Q8_0.gguf b/DeepSeek-Coder-V2-Lite-Instruct_Q8_0.gguf
new file mode 100644
index 0000000..57e9450
--- /dev/null
+++ b/DeepSeek-Coder-V2-Lite-Instruct_Q8_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:dd30bd47336ba42c2709ee962b98b99a1eaefa581904e15d7f0cebc3f9a5e7e5
+size 16702520544
diff --git a/DeepSeek-Coder-V2-Lite-Instruct_fp16.gguf b/DeepSeek-Coder-V2-Lite-Instruct_fp16.gguf
new file mode 100644
index 0000000..8ae4a21
--- /dev/null
+++ b/DeepSeek-Coder-V2-Lite-Instruct_fp16.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9d9922662e2729d4082ef075cc06d512a8278f9cdaee3fe80633f0a164cae229
+size 31424036064
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..51e9ea1
--- /dev/null
+++ b/README.md
@@ -0,0 +1,61 @@
+---
+license: apache-2.0
+base_model: deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
+tags:
+- llama.cpp
+- gguf
+- quantized
+- text-generation
+- lightweight
+- lmstudio
+- jan
+- cobalt
+- text-generation-webui
+---
+
+# DeepSeek-Coder-V2-Lite-Instruct - GGUF High-Quality Quantizations
+
+This repository provides **GGUF** quantized versions of the [deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct) model, optimized for local execution with `llama.cpp` and compatible ecosystems.
+
+## 📌 Version Notes
+All quantizations were generated from the official **FP16** weights.
+- **Target:** Efficient execution on consumer hardware, mobile/edge devices, and systems with limited memory.
+- **Performance:** Output quality (reasoning, coherence, and accuracy) is ultimately bounded by the base model's scale (a ~16B-parameter MoE with ~2.4B active parameters).
+
+## 📊 Quantization Table
+
+| File | Method | Bits | Description |
+| :--- | :--- | :--- | :--- |
+| **fp16.gguf** | FP16 | 16-bit | **Original weights.** No quantization applied. Maximum fidelity. |
+| **Q8_0.gguf** | Q8_0 | 8-bit | **Near-lossless.** Practically identical to the original model with a lower memory footprint. |
+| **Q5_K_M.gguf** | Q5_K_M | 5-bit | **High precision.** Minimizes quantization error for critical tasks. |
+| **Q5_K_S.gguf** | Q5_K_S | 5-bit | **Compact 5-bit.** Slightly smaller than Q5_K_M at a small cost in quality. |
+| **Q4_K_M.gguf** | Q4_K_M | 4-bit | **Recommended.** Best balance between speed and quality. |
+| **Q4_K_S.gguf** | Q4_K_S | 4-bit | **Fast/small.** Optimized for maximum throughput and low RAM usage. |
+
+## 🛠️ Technical Details
+- **Quantization Date:** 2026-03-13
+- **Tool used:** `llama-quantize` (llama.cpp)
+- **Method:** K-quantization (optimized for AVX2/AVX-512 and modern GPU architectures).
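+
+As a rough sketch, each file can be reproduced from the FP16 weights with `llama-quantize` (the quant type and file paths below are illustrative; adjust them to your llama.cpp build and layout):
+
+```bash
+# Requantize the FP16 GGUF to a 4-bit K-quant
+./llama-quantize DeepSeek-Coder-V2-Lite-Instruct_fp16.gguf \
+  DeepSeek-Coder-V2-Lite-Instruct_Q4_K_M.gguf Q4_K_M
+```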
Assistant:" -n 512 --temp 0.7 +``` + +### llama.cpp (SERVER) using model from HuggingFace +```bash +./llama-server -hf daniloreddy/DeepSeek-Coder-V2-Lite-Instruct_GGUF:Q4_K_M --port 8080 -c 4096 +``` + +### llama.cpp (SERVER) using downloaded model +```bash +./llama-server -m /path/to/DeepSeek-Coder-V2-Lite-Instruct_Q4_K_M.gguf --port 8080 -c 4096 +``` \ No newline at end of file