Initialize the project; model provided by the ModelHub XC community
Model: daniloreddy/DeepSeek-Coder-V2-Lite-Instruct_GGUF Source: Original Platform
41
.gitattributes
vendored
Normal file
@@ -0,0 +1,41 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
DeepSeek-Coder-V2-Lite-Instruct_fp16.gguf filter=lfs diff=lfs merge=lfs -text
DeepSeek-Coder-V2-Lite-Instruct_Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
DeepSeek-Coder-V2-Lite-Instruct_Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
DeepSeek-Coder-V2-Lite-Instruct_Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
DeepSeek-Coder-V2-Lite-Instruct_Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
DeepSeek-Coder-V2-Lite-Instruct_Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
3
DeepSeek-Coder-V2-Lite-Instruct_Q4_K_M.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cc48c99234ec5f7ada31c894743ae51a7e90bfab69354ebe138faed565b76f43
size 10367958240
3
DeepSeek-Coder-V2-Lite-Instruct_Q4_K_S.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7a8913b65905410bec6299d91d3cce45f63f21f33d87e5a97d8ad1eca892b13e
size 9537150176
3
DeepSeek-Coder-V2-Lite-Instruct_Q5_K_M.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:38eb17e3f919f60c5b253ce7ae13023990bcc80526aeff735ce37f54a097d0cc
size 11853085920
3
DeepSeek-Coder-V2-Lite-Instruct_Q5_K_S.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a2e0227c60c56f4adfa69b0a69d2701cdf35c02ffe326a604f7c12ac10c4d0d2
size 11144830176
3
DeepSeek-Coder-V2-Lite-Instruct_Q8_0.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:dd30bd47336ba42c2709ee962b98b99a1eaefa581904e15d7f0cebc3f9a5e7e5
size 16702520544
3
DeepSeek-Coder-V2-Lite-Instruct_fp16.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9d9922662e2729d4082ef075cc06d512a8278f9cdaee3fe80633f0a164cae229
size 31424036064
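Each `.gguf` entry committed here is not the model itself but a Git LFS pointer file: three key-value lines recording the pointer-spec version, the blob's SHA-256, and its byte size, with the real weights fetched on `git lfs pull`. A minimal sketch of reading such a pointer (the `parse_lfs_pointer` helper is illustrative, not part of any tool):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key-value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # oid is "sha256:<hex>"; size is a decimal byte count
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "oid_algo": algo,
        "oid": digest,
        "size_bytes": int(fields["size"]),
    }

# Example: the Q4_K_M pointer from this commit
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:cc48c99234ec5f7ada31c894743ae51a7e90bfab69354ebe138faed565b76f43
size 10367958240
"""
info = parse_lfs_pointer(pointer)
print(info["oid_algo"], info["size_bytes"])  # sha256 10367958240
```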
61
README.md
Normal file
@@ -0,0 +1,61 @@
---
license: apache-2.0
base_model: deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
tags:
- llama.cpp
- gguf
- quantized
- text-generation
- lightweight
- lmstudio
- jan
- cobalt
- text-generation-webui
---

# DeepSeek-Coder-V2-Lite-Instruct - GGUF High-Quality Quantizations

This repository provides **GGUF** quantized versions of the [deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct) model, optimized for local execution with `llama.cpp` and compatible ecosystems.

## 📌 Version Notes

All quantizations were generated from the official **FP16** weights.

- **Target:** Efficient execution on consumer hardware, mobile/edge devices, and systems with limited memory.
- **Performance:** Output quality (reasoning, coherence, and accuracy) is ultimately bounded by the base model's scale (a ~16B-parameter MoE with ~2.4B active parameters per token).

## 📊 Quantization Table

| File | Method | Bits | Description |
| :--- | :--- | :--- | :--- |
| **fp16.gguf** | FP16 | 16-bit | **Original weights.** No quantization applied; maximum fidelity. |
| **Q8_0.gguf** | Q8_0 | 8-bit | **Near-lossless.** Practically identical to the original model at a lower memory footprint. |
| **Q5_K_M.gguf** | Q5_K_M | 5-bit | **High precision.** Minimizes quantization error for critical tasks. |
| **Q5_K_S.gguf** | Q5_K_S | 5-bit | **Smaller 5-bit.** Slightly lower precision than Q5_K_M at a reduced size. |
| **Q4_K_M.gguf** | Q4_K_M | 4-bit | **Recommended.** Best balance between speed and performance. |
| **Q4_K_S.gguf** | Q4_K_S | 4-bit | **Fast/small.** Optimized for maximum throughput and low RAM usage. |
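The table pairs naturally with the file sizes committed in this repository: FP16 stores two bytes per weight, so the fp16 file size implies the total parameter count, and each quant's effective bits-per-weight falls out directly. A small sketch (sizes copied from the LFS pointers in this commit; the `effective_bpw` helper is illustrative):

```python
# Published file sizes in bytes, taken from this repo's LFS pointers.
SIZES = {
    "FP16": 31_424_036_064,
    "Q8_0": 16_702_520_544,
    "Q5_K_M": 11_853_085_920,
    "Q5_K_S": 11_144_830_176,
    "Q4_K_M": 10_367_958_240,
    "Q4_K_S": 9_537_150_176,
}

# FP16 stores 2 bytes per weight, so total parameters ~= fp16 size / 2.
n_params = SIZES["FP16"] / 2  # ~15.7 billion

def effective_bpw(method: str) -> float:
    """Effective bits per weight implied by the published file size."""
    return SIZES[method] * 8 / n_params

for m in SIZES:
    print(f"{m}: {effective_bpw(m):.2f} bpw, {SIZES[m] / 1e9:.1f} GB")
```

Note that the effective bits-per-weight of the K-quants comes out somewhat above their nominal bit width, since K-quantization mixes precisions across tensor types and keeps some tensors at higher precision.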

## 🛠️ Technical Details

- **Quantization date:** 2026-03-13
- **Tool used:** `llama-quantize` (llama.cpp)
- **Method:** K-quantization (optimized for AVX2/AVX-512 and modern GPU architectures).

## 🚀 How to Use

The CLI examples below run the model interactively; the server examples start a local OpenAI-compatible endpoint with a built-in web UI.

### llama.cpp (CLI) using model from HuggingFace

```bash
./llama-cli -hf daniloreddy/DeepSeek-Coder-V2-Lite-Instruct_GGUF:Q4_K_M -p "User: Hello! Assistant:" -n 512 --temp 0.7
```

### llama.cpp (CLI) using downloaded model

```bash
./llama-cli -m path/to/DeepSeek-Coder-V2-Lite-Instruct_Q4_K_M.gguf -p "User: Hello! Assistant:" -n 512 --temp 0.7
```

### llama.cpp (SERVER) using model from HuggingFace

```bash
./llama-server -hf daniloreddy/DeepSeek-Coder-V2-Lite-Instruct_GGUF:Q4_K_M --port 8080 -c 4096
```

### llama.cpp (SERVER) using downloaded model

```bash
./llama-server -m /path/to/DeepSeek-Coder-V2-Lite-Instruct_Q4_K_M.gguf --port 8080 -c 4096
```
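Once `llama-server` is running as above, it exposes an OpenAI-compatible REST API. A minimal stdlib-only Python sketch of a chat request, assuming the default `http://localhost:8080` endpoint (the `build_payload` and `chat` helpers are illustrative, not part of llama.cpp):

```python
import json
import urllib.request

def build_payload(prompt: str, temperature: float = 0.7) -> dict:
    """Build an OpenAI-style chat-completions request body."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def chat(prompt: str, base_url: str = "http://localhost:8080") -> str:
    """POST to the server's OpenAI-compatible endpoint and return the reply text."""
    req = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("Write a Python function that reverses a string."))
```

The same endpoint also works with any OpenAI-compatible client library pointed at the local base URL.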