---
language:
- fr
- en
license: apache-2.0
library_name: gguf
base_model: Qwen/Qwen2.5-3B-Instruct
tags:
- cybersecurity
- gguf
- quantized
- ollama
- llama-cpp
pipeline_tag: text-generation
---

# CyberSec-Assistant-3B-GGUF

**GGUF quantized versions** of [AYI-NEDJIMI/CyberSec-Assistant-3B](https://huggingface.co/AYI-NEDJIMI/CyberSec-Assistant-3B) for use with [Ollama](https://ollama.ai), [llama.cpp](https://github.com/ggerganov/llama.cpp), [LM Studio](https://lmstudio.ai), and other GGUF-compatible inference engines.

## Model Description

This is a fine-tuned Qwen2.5-3B-Instruct model specialized in **general cybersecurity**. It can answer questions about network security, vulnerability assessment, incident response, penetration testing, threat analysis, security architecture, and cybersecurity best practices in both French and English.

Part of the **AYI-NEDJIMI Cybersecurity AI Portfolio**:
- [AYI-NEDJIMI/CyberSec-AI-Portfolio](https://huggingface.co/collections/AYI-NEDJIMI/cybersec-ai-portfolio-6850da55c1b0578430f1f553) — Full collection

## Available Quantizations

| Filename | Quant Type | Size | Description |
|---|---|---|---|
| `cybersec-assistant-3b-Q4_K_M.gguf` | Q4_K_M | 1.80 GB | **Recommended** — Best balance of quality and size (~31% of F16) |
| `cybersec-assistant-3b-Q5_K_M.gguf` | Q5_K_M | 2.07 GB | Higher quality, slightly larger (~36% of F16) |
| `cybersec-assistant-3b-Q8_0.gguf` | Q8_0 | 3.06 GB | Near-lossless quantization (~53% of F16) |

### Quantization Format Details

- **Q4_K_M**: 4-bit k-quant, medium variant. The smallest file offered here and well suited to resource-constrained environments, with minimal quality loss for most tasks.
- **Q5_K_M**: 5-bit k-quant, medium variant. A middle ground between Q4_K_M and Q8_0 in both size and quality.
- **Q8_0**: 8-bit quantization. Near-original quality at roughly 53% of the F16 size (about a 47% reduction).
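
The size figures in the table can be turned into a quick selection rule. The sketch below is illustrative only: `pick_quant` and its `overhead_gb` allowance for context and runtime buffers are assumptions rather than part of this repository, and real memory use depends on context length and backend.

```python
from typing import Optional

# File sizes (GB) and fraction-of-F16 figures copied from the table above.
QUANTS = {
    "Q4_K_M": {"size_gb": 1.80, "frac_of_f16": 0.31},
    "Q5_K_M": {"size_gb": 2.07, "frac_of_f16": 0.36},
    "Q8_0": {"size_gb": 3.06, "frac_of_f16": 0.53},
}


def implied_f16_gb(name: str) -> float:
    """F16 size implied by a table row: file size divided by its fraction of F16."""
    q = QUANTS[name]
    return q["size_gb"] / q["frac_of_f16"]


def pick_quant(budget_gb: float, overhead_gb: float = 1.0) -> Optional[str]:
    """Hypothetical helper: largest quant whose file plus overhead fits the budget."""
    fitting = [n for n, q in QUANTS.items() if q["size_gb"] + overhead_gb <= budget_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda n: QUANTS[n]["size_gb"])


for name in QUANTS:
    print(f"{name}: implied F16 size ~ {implied_f16_gb(name):.2f} GB")
print("Suggested for a 4 GB budget:", pick_quant(4.0))
```

All three rows imply an F16 size of about 5.8 GB, so the percentages in the table are mutually consistent.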

## How to Use

### Ollama

Create a `Modelfile`:

```
FROM ./cybersec-assistant-3b-Q4_K_M.gguf

TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

SYSTEM "You are a cybersecurity expert assistant. You provide detailed, accurate guidance on network security, vulnerability assessment, incident response, penetration testing, and security best practices. You respond in the same language as the user's question."

PARAMETER temperature 0.7
PARAMETER top_p 0.8
PARAMETER top_k 20
PARAMETER stop "<|im_end|>"
```

Then run:

```bash
ollama create cybersec-assistant -f Modelfile
ollama run cybersec-assistant
```
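
Once the model has been created, it can also be queried programmatically. The following is a minimal sketch that assumes an Ollama server is running locally on its default port (11434) and that the model was created under the name `cybersec-assistant` as above; the helper names `build_payload` and `ask` are illustrative.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # assumed: default local Ollama address


def build_payload(question: str, model: str = "cybersec-assistant") -> dict:
    """Build a non-streaming request body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "stream": False,  # request one JSON object instead of a stream of chunks
    }


def ask(question: str) -> str:
    """Send the question to the local Ollama server and return the reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(question)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]


# Example (requires a running Ollama server):
# print(ask("What is the difference between an IDS and an IPS?"))
```

No system message is sent here because the `SYSTEM` line in the Modelfile already supplies one.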

### llama.cpp

```bash
# Interactive chat (-cnv enables conversation mode; --chat-template chatml forces the ChatML format)
./llama-cli -m cybersec-assistant-3b-Q4_K_M.gguf \
  -p "You are a cybersecurity expert assistant." \
  --chat-template chatml \
  -cnv

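# Server mode (exposes an OpenAI-compatible HTTP API; 0.0.0.0 listens on all interfaces, use 127.0.0.1 for local-only access)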
./llama-server -m cybersec-assistant-3b-Q4_K_M.gguf \
  --host 0.0.0.0 --port 8080
```
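
With the server running, requests can be sent to its OpenAI-compatible chat endpoint. This sketch assumes the server started by the command above is reachable at `http://localhost:8080`; `build_request` and `chat` are hypothetical helper names, and the sampling values simply mirror those used elsewhere in this README.

```python
import json
import urllib.request

SERVER_URL = "http://localhost:8080/v1/chat/completions"  # assumed local address


def build_request(question: str) -> dict:
    """Build an OpenAI-style chat completion body with the README's system prompt."""
    return {
        "messages": [
            {"role": "system", "content": "You are a cybersecurity expert assistant."},
            {"role": "user", "content": question},
        ],
        "temperature": 0.7,
        "top_p": 0.8,
    }


def chat(question: str) -> str:
    """POST the request to llama-server and return the assistant's reply text."""
    req = urllib.request.Request(
        SERVER_URL,
        data=json.dumps(build_request(question)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read())
    return body["choices"][0]["message"]["content"]


# Example (requires llama-server to be running):
# print(chat("Summarize the phases of incident response."))
```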

### LM Studio

1. Download the desired GGUF file
2. Open LM Studio and load the model from your downloads
3. Select the **ChatML** chat template
4. Set the system prompt to: "You are a cybersecurity expert assistant."
5. Start chatting!
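
Step 1 can also be scripted. The sketch below assumes the `huggingface_hub` package is installed and uses its `hf_hub_download` function; the repository ID and filenames are taken from this README, while `gguf_filename` and `download_gguf` are illustrative helper names.

```python
REPO_ID = "AYI-NEDJIMI/CyberSec-Assistant-3B-GGUF"  # from the Related Models table
AVAILABLE = ("Q4_K_M", "Q5_K_M", "Q8_0")  # quant types listed in this README


def gguf_filename(quant: str = "Q4_K_M") -> str:
    """Map a quant type to the filename pattern used in this repository."""
    if quant not in AVAILABLE:
        raise ValueError(f"unknown quant {quant!r}; expected one of {AVAILABLE}")
    return f"cybersec-assistant-3b-{quant}.gguf"


def download_gguf(quant: str = "Q4_K_M") -> str:
    """Download the file into the local Hugging Face cache and return its path."""
    # Imported lazily so the filename helper works without the package installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=gguf_filename(quant))


# Example (requires network access and `pip install huggingface_hub`):
# print(download_gguf("Q5_K_M"))
```

The returned path can then be opened in LM Studio or passed to any of the other tools described in this section.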

### Python (llama-cpp-python)

```python
from llama_cpp import Llama

llm = Llama(model_path="cybersec-assistant-3b-Q4_K_M.gguf", n_ctx=4096)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a cybersecurity expert assistant."},
        {"role": "user", "content": "Explain the MITRE ATT&CK framework and how it helps in threat detection."}
    ],
    temperature=0.7,
    top_p=0.8,
    top_k=20,
)
print(response["choices"][0]["message"]["content"])
```
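
For interactive use, `create_chat_completion` also accepts `stream=True`, in which case it yields chunks in the OpenAI streaming format instead of a single response. A minimal sketch of collecting the streamed text follows; `iter_text` is a hypothetical helper, and because it works on any iterable of chunks it is demonstrated here with hand-written sample chunks rather than a loaded model.

```python
from typing import Iterable, Iterator


def iter_text(chunks: Iterable[dict]) -> Iterator[str]:
    """Yield the text pieces carried in the 'delta' field of streamed chat chunks."""
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        piece = delta.get("content")
        if piece:  # skip role-only and final chunks, which carry no text
            yield piece


# Hand-written sample chunks in the streaming format (not real model output).
demo = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Defense "}}]},
    {"choices": [{"delta": {"content": "in depth"}}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
]
print("".join(iter_text(demo)))

# With the `llm` object from the example above:
# stream = llm.create_chat_completion(
#     messages=[{"role": "user", "content": "What is a zero-day?"}],
#     stream=True,
# )
# for piece in iter_text(stream):
#     print(piece, end="", flush=True)
```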

## Related Models

| Version | Link |
|---|---|
| Merged (SafeTensors) | [AYI-NEDJIMI/CyberSec-Assistant-3B](https://huggingface.co/AYI-NEDJIMI/CyberSec-Assistant-3B) |
| LoRA Adapter | [AYI-NEDJIMI/CyberSec-Assistant-3B-Adapter](https://huggingface.co/AYI-NEDJIMI/CyberSec-Assistant-3B-Adapter) |
| GGUF (this repo) | [AYI-NEDJIMI/CyberSec-Assistant-3B-GGUF](https://huggingface.co/AYI-NEDJIMI/CyberSec-Assistant-3B-GGUF) |
| Portfolio Collection | [AYI-NEDJIMI/CyberSec-AI-Portfolio](https://huggingface.co/collections/AYI-NEDJIMI/cybersec-ai-portfolio-6850da55c1b0578430f1f553) |

## Technical Details

- **Base Model**: Qwen/Qwen2.5-3B-Instruct
- **Fine-tuning**: QLoRA (4-bit), with the LoRA adapters merged back into the base model
- **Architecture**: Qwen2ForCausalLM
- **Context Length**: 4096 tokens
- **Chat Template**: ChatML
- **Converted with**: llama.cpp (convert_hf_to_gguf.py)
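
To make the ChatML format concrete, the following sketch renders a message list into the same layout as the Ollama `TEMPLATE` shown earlier. It is for illustration only: `render_chatml` is a hypothetical helper, and in practice the inference engine applies the template itself.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render a list of {'role', 'content'} dicts as a ChatML prompt string."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = render_chatml(
    [
        {"role": "system", "content": "You are a cybersecurity expert assistant."},
        {"role": "user", "content": "What is lateral movement?"},
    ]
)
print(prompt)
```

Generation should stop at `<|im_end|>`, which is why the Ollama Modelfile sets it as a stop sequence.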