---
license: apache-2.0
tags:
- text-generation
- llama.cpp
- gguf
- quantization
- merged-model
language:
- en
library_name: gguf
---

# merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B - GGUF Quantized Model
This is a collection of GGUF quantized versions of [pravdin/merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/pravdin/merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B).
## 🌳 Model Tree
This model was created by merging the following models:
```
pravdin/merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B
├── Merge Method: dare_ties
├── Gensyn/Qwen2.5-1.5B-Instruct
└── deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
    ├── density: 0.6
    └── weight: 0.5
```

**Merge Method**: DARE-TIES, a merging technique that drops and rescales redundant delta weights (DARE) and resolves sign conflicts between the remaining ones (TIES), reducing interference between the merged models.
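
As a rough illustration, a merge like this can be reproduced with [mergekit](https://github.com/arcee-ai/mergekit). The config below is a hypothetical reconstruction rather than the author's actual file: the base-model choice and `dtype` are assumptions, while `density` and `weight` come from the tree above.

```bash
# Hypothetical mergekit config; base_model and dtype are assumptions.
cat > dare_ties.yml <<'EOF'
merge_method: dare_ties
base_model: Gensyn/Qwen2.5-1.5B-Instruct
models:
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
    parameters:
      density: 0.6
      weight: 0.5
dtype: bfloat16
EOF

pip install mergekit
mergekit-yaml dare_ties.yml ./merged-model
```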
## 📊 Available Quantization Formats

This repository contains multiple quantization formats optimized for different use cases:

- **q4_k_m**: 4-bit quantization, medium quality, good balance of size and performance
- **q5_k_m**: 5-bit quantization, higher quality, slightly larger size
- **q8_0**: 8-bit quantization, highest quality, larger size but minimal quality loss
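
For reference, GGUF files like these are typically produced with llama.cpp's conversion and quantization tools. A minimal sketch, assuming a local checkout of llama.cpp (script and binary names vary by version; older builds ship `convert-hf-to-gguf.py` and `./quantize`):

```bash
# Convert the HF checkpoint to a full-precision GGUF, then quantize it.
python convert_hf_to_gguf.py ./merged-model --outtype f16 --outfile model.f16.gguf
./llama-quantize model.f16.gguf model.q4_k_m.gguf Q4_K_M
./llama-quantize model.f16.gguf model.q5_k_m.gguf Q5_K_M
./llama-quantize model.f16.gguf model.q8_0.gguf Q8_0
```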
## 🚀 Usage

### With llama.cpp

```bash
# Download a specific quantization
wget https://huggingface.co/pravdin/merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B/resolve/main/merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B.q4_k_m.gguf

# Run with llama.cpp (recent builds ship this binary as llama-cli rather than main)
./main -m merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B.q4_k_m.gguf -p "Your prompt here"
```
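
llama.cpp can also serve the model over an OpenAI-compatible HTTP API through its bundled `llama-server` binary; a minimal sketch (the port and context size are arbitrary choices):

```bash
# Start an OpenAI-compatible server with a 4096-token context
./llama-server -m merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B.q4_k_m.gguf -c 4096 --port 8080

# Query it from another shell
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello"}], "max_tokens": 64}'
```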
### With Python (llama-cpp-python)

```python
from llama_cpp import Llama

# Load the model
llm = Llama(model_path="merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B.q4_k_m.gguf")

# Generate text
output = llm("Your prompt here", max_tokens=512)
print(output['choices'][0]['text'])
```
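
Because the merged model is instruction-tuned, chat-style prompting through `create_chat_completion` may behave better than raw completion. A sketch, assuming a recent llama-cpp-python (the `n_ctx` value is an arbitrary choice; the chat template is read from the GGUF metadata when present):

```python
from llama_cpp import Llama

# Load with an explicit context window
llm = Llama(
    model_path="merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B.q4_k_m.gguf",
    n_ctx=4096,
)

# OpenAI-style chat messages
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain GGUF quantization in one sentence."},
    ],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```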
### With Ollama

```bash
# Create a Modelfile
echo 'FROM ./merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B.q4_k_m.gguf' > Modelfile

# Create and run the model
ollama create merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B -f Modelfile
ollama run merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B "Your prompt here"
```
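
The Modelfile can also pin the chat template and stop token explicitly. The sketch below assumes ChatML, Qwen2.5's usual chat format; verify it against this merged model's tokenizer config before relying on it:

```bash
# Modelfile with an explicit ChatML template (an assumption) and stop token
cat > Modelfile <<'EOF'
FROM ./merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B.q4_k_m.gguf
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER stop "<|im_end|>"
EOF
```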
## 📋 Model Details

- **Original Model**: [pravdin/merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/pravdin/merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B)
- **Quantization Tool**: [llama.cpp](https://github.com/ggerganov/llama.cpp)
- **License**: Same as the original model (apache-2.0)
- **Use Cases**: Optimized for local inference, edge deployment, and resource-constrained environments
## 🎯 Recommended Usage

- **q4_k_m**: Best for most use cases; a good quality/size trade-off
- **q5_k_m**: When you need higher quality and have more storage/memory
- **q8_0**: When you want minimal quality loss relative to the original model
## ⚡ Performance Notes

GGUF models are optimized for:

- Faster loading times
- Lower memory usage
- CPU and GPU inference
- Cross-platform compatibility
For best performance, ensure your hardware supports the quantization format you choose.
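
For example, a CUDA or Metal build of llama.cpp can offload transformer layers to the GPU with `-ngl`; a 1.5B model typically fits entirely in modest VRAM (the values below are illustrative):

```bash
# Offload all layers to the GPU (-ngl 99 exceeds the model's layer count,
# so everything offloadable is moved) with a 4096-token context
./main -m merged-Gensyn-Qwen2.5-1.5B-Instruct-deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B.q4_k_m.gguf -ngl 99 -c 4096 -p "Your prompt here"
```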
---

*This model was automatically quantized using the Lemuru LLM toolkit.*