---
language:
- en
license: llama3.1
library_name: transformers
tags:
- GGUF
- llama
- llama-cpp
- unsloth
- cybersecurity
- pentesting
- security
- abliterated
- uncensored
- ethical-hacking
- red-team
- blue-team
- infosec
- offensive-security
- CTF
- bug-bounty
- conversational
model_name: Dolphin3-Cyber-8B-GGUF
base_model: huihui-ai/Dolphin3.0-Llama3.1-8B-abliterated
pipeline_tag: text-generation
quantized_by: RavichandranJ
datasets:
- custom-cybersecurity-dataset
model-index:
- name: Dolphin3-Cyber-8B
  results: []
---

<div align="center">

# 🐬 Dolphin3-Cyber-8B-GGUF

### A Cybersecurity-Specialized Large Language Model

**Fine-tuned for Offensive Security • Defensive Security • Vulnerability Research • Exploit Development**

<p>
  <img src="https://img.shields.io/badge/Architecture-Llama_3.1-blue?style=for-the-badge&logo=meta" alt="Architecture">
  <img src="https://img.shields.io/badge/Parameters-8B-purple?style=for-the-badge" alt="Parameters">
  <img src="https://img.shields.io/badge/Format-GGUF-green?style=for-the-badge" alt="Format">
  <img src="https://img.shields.io/badge/Domain-Cybersecurity-red?style=for-the-badge&logo=hackthebox" alt="Domain">
</p>

<p>
  <img src="https://img.shields.io/badge/Trained_with-Unsloth_2x_faster-orange?style=flat-square" alt="Unsloth">
  <img src="https://img.shields.io/badge/LoRA-r%3D16-yellow?style=flat-square" alt="LoRA">
  <img src="https://img.shields.io/badge/Uncensored-Abliterated-crimson?style=flat-square" alt="Abliterated">
  <img src="https://img.shields.io/badge/License-Llama_3.1-lightgrey?style=flat-square" alt="License">
</p>

---

**[LoRA Adapters](https://huggingface.co/RavichandranJ/Dolphin3-Cyber-8B-LoRA)** | **[Base Model](https://huggingface.co/huihui-ai/Dolphin3.0-Llama3.1-8B-abliterated)** | **[Unsloth](https://github.com/unslothai/unsloth)**

</div>

---

## 📖 Table of Contents

- [Overview](#-overview)
- [Key Features](#-key-features)
- [Available Quantizations](#-available-quantizations)
- [How to Choose a Quantization](#-how-to-choose-a-quantization)
- [Quick Start](#-quick-start)
  - [Ollama](#ollama)
  - [llama.cpp](#llamacpp)
  - [LM Studio](#lm-studio)
  - [Python (llama-cpp-python)](#python-llama-cpp-python)
  - [Open WebUI](#open-webui)
  - [Jan.ai](#janai)
- [Example Prompts & Outputs](#-example-prompts--outputs)
- [Model Capabilities](#-model-capabilities)
- [Training Details](#-training-details)
- [Architecture](#-architecture)
- [Prompt Format](#-prompt-format)
- [Hardware Requirements](#-hardware-requirements)
- [Benchmarks](#-benchmarks)
- [Use Cases](#-use-cases)
- [Limitations](#-limitations)
- [Ethical Usage & Disclaimer](#-ethical-usage--disclaimer)
- [Citation](#-citation)
- [Acknowledgements](#-acknowledgements)

---

## 🌟 Overview

**Dolphin3-Cyber-8B** is a domain-specific large language model fine-tuned for cybersecurity applications. It is built on the [Dolphin3.0-Llama3.1-8B-abliterated](https://huggingface.co/huihui-ai/Dolphin3.0-Llama3.1-8B-abliterated) base model and adds specialized security knowledge so that it can serve as a locally hosted cybersecurity assistant.

### Why This Model?

| Feature | Dolphin3-Cyber-8B | Generic LLMs | Other Security Models |
|:---|:---:|:---:|:---:|
| Cybersecurity domain expertise | ✅ Deep | ⚠️ Surface | ✅ Varies |
| Uncensored/Abliterated | ✅ Yes | ❌ No | ⚠️ Partial |
| Exploit code generation | ✅ Full | ❌ Refused | ⚠️ Limited |
| GGUF format (local inference) | ✅ 11 quants | ❌ Rarely | ⚠️ Few |
| 8B parameter efficiency | ✅ Fast | ❌ 70B+ needed | ⚠️ Varies |
| Runs on consumer hardware | ✅ 4GB+ VRAM | ❌ Cloud-only | ⚠️ Depends |

The model runs **100% locally**: no API keys and no cloud service, so prompts and outputs stay on your machine. This suits security professionals who need confidentiality.

---

## 🎯 Key Features

- 🔓 **Uncensored & Abliterated** — No refusals on security topics. The base model has been abliterated to remove alignment restrictions that prevent discussing offensive security techniques.

- 🧠 **Domain-Specialized Training** — Fine-tuned on curated cybersecurity datasets covering OWASP Top 10, MITRE ATT&CK, CVEs, exploit databases, penetration testing methodologies, and defensive security frameworks.

- ⚡ **Efficient 8B Architecture** — Runs on consumer GPUs (GTX 1650+) while delivering detailed security analysis. No need for expensive cloud compute.

- 📦 **11 Quantization Options** — From 3.18 GB (Q2_K) to unquantized 16-bit at 16.1 GB (F16), pick the right size for your hardware.

- 🔒 **100% Local & Private** — All inference happens on your machine. No data is sent to any server, which is critical when handling sensitive security assessments.

- 🐬 **Dolphin3 Chat Format** — Natural conversational interface with the Llama 3.1 chat template for multi-turn security discussions.

---

## 📦 Available Quantizations

All quantizations are available in this repository. Each uses the GGUF format compatible with llama.cpp and its ecosystem.

| Quant | File | Size | Bits | Quality | Speed | RAM Needed |
|:---:|:---|:---:|:---:|:---:|:---:|:---:|
| **Q2_K** | `...Q2_K.gguf` | 3.18 GB | 2-bit | ⭐⭐ | 🚀🚀🚀🚀 | ~5.5 GB |
| **Q3_K_M** | `...Q3_K_M.gguf` | 4.02 GB | 3-bit | ⭐⭐⭐ | 🚀🚀🚀 | ~6.5 GB |
| **Q4_0** | `...Q4_0.gguf` | 4.66 GB | 4-bit | ⭐⭐⭐ | 🚀🚀🚀 | ~7.0 GB |
| **Q4_K_S** | `...Q4_K_S.gguf` | 4.69 GB | 4-bit | ⭐⭐⭐⭐ | 🚀🚀🚀 | ~7.0 GB |
| **Q4_K_M** | `...Q4_K_M.gguf` | 4.92 GB | 4-bit | ⭐⭐⭐⭐ | 🚀🚀🚀 | ~7.5 GB |
| **Q5_0** | `...Q5_0.gguf` | 5.6 GB | 5-bit | ⭐⭐⭐⭐ | 🚀🚀 | ~8.0 GB |
| **Q5_K_S** | `...Q5_K_S.gguf` | 5.6 GB | 5-bit | ⭐⭐⭐⭐ | 🚀🚀 | ~8.0 GB |
| **Q5_K_M** | `...Q5_K_M.gguf` | 5.73 GB | 5-bit | ⭐⭐⭐⭐⭐ | 🚀🚀 | ~8.5 GB |
| **Q6_K** | `...Q6_K.gguf` | 6.6 GB | 6-bit | ⭐⭐⭐⭐⭐ | 🚀🚀 | ~9.0 GB |
| **Q8_0** | `...Q8_0.gguf` | 8.54 GB | 8-bit | ⭐⭐⭐⭐⭐ | 🚀 | ~11.0 GB |
| **F16** | `...F16.gguf` | 16.1 GB | 16-bit | ⭐⭐⭐⭐⭐ | 🚀 | ~18.5 GB |

> 📏 **RAM estimates** include model size + KV cache for 2048 context length.

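
To make the KV-cache part of that estimate concrete, here is a minimal sketch that computes it from the architecture values listed later in this card (32 layers, 8 KV heads, hidden size 4096 with 32 attention heads). The 16-bit cache precision and the function names are assumptions of this sketch, not settings taken from the repository. The result is a lower bound: the gap between it and the table's "RAM Needed" column would be compute buffers and runtime overhead, which are not modeled here.

```python
# Architecture values as listed in this model card.
N_LAYERS = 32
N_KV_HEADS = 8
HEAD_DIM = 4096 // 32  # hidden size / attention heads = 128


def kv_cache_bytes(n_ctx: int, bytes_per_value: int = 2) -> int:
    """Bytes needed to hold keys and values for n_ctx tokens.

    bytes_per_value=2 assumes a 16-bit cache (an assumption of this sketch).
    """
    # Factor 2 = one key tensor and one value tensor per layer.
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * bytes_per_value
    return per_token * n_ctx


def lower_bound_gib(file_size_gib: float, n_ctx: int = 2048) -> float:
    """Model file size plus KV cache only; runtime buffers are not included."""
    return file_size_gib + kv_cache_bytes(n_ctx) / 2**30


print(f"KV cache at 2048 tokens: {kv_cache_bytes(2048) / 2**20:.0f} MiB")
print(f"Q4_K_M lower bound:      {lower_bound_gib(4.92):.2f} GiB")
```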

---

## 🤔 How to Choose a Quantization

```
Do you have a GPU with VRAM?
├── Yes, 4-6 GB VRAM  ──────────► Q4_K_M (best balance)
├── Yes, 6-8 GB VRAM  ──────────► Q5_K_M (great quality)
├── Yes, 8-12 GB VRAM ──────────► Q8_0   (near-lossless)
├── Yes, 16+ GB VRAM  ──────────► F16    (unquantized 16-bit)
└── No GPU (CPU only)
    ├── 8 GB RAM   ─────────────► Q2_K or Q3_K_M
    ├── 16 GB RAM  ─────────────► Q4_K_M
    └── 32+ GB RAM ─────────────► Q8_0
```

**TL;DR:**
- 🏆 **Best overall**: `Q4_K_M` — Fits most setups, great quality
- 🥇 **Best quality**: `Q8_0` — Near-lossless, recommended if you have the RAM
- 🥉 **Smallest usable**: `Q3_K_M` — For low-resource devices
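
The decision tree above can be sketched as a small helper. This is only an illustration: `pick_quant` is a hypothetical name, and the way it handles boundary values (exactly 6 GB, say) and the ranges the tree does not mention (12 to 16 GB of VRAM, less than 8 GB of RAM) are assumptions of this sketch rather than recommendations from the card.

```python
from typing import Optional


def pick_quant(vram_gb: Optional[float], ram_gb: float) -> str:
    """Return a quantization name following the decision tree in this card.

    Pass vram_gb=None for a CPU-only machine. GPUs below 4 GB are
    treated as CPU-only (assumption).
    """
    if vram_gb is not None and vram_gb >= 4:
        if vram_gb >= 16:
            return "F16"
        if vram_gb >= 8:
            return "Q8_0"    # also covers the 12-16 GB gap (assumption)
        if vram_gb >= 6:
            return "Q5_K_M"  # 6 GB is placed in the 6-8 GB branch (assumption)
        return "Q4_K_M"
    # CPU-only branch
    if ram_gb >= 32:
        return "Q8_0"
    if ram_gb >= 16:
        return "Q4_K_M"
    return "Q3_K_M"          # the tree offers Q2_K or Q3_K_M at 8 GB


print(pick_quant(vram_gb=6, ram_gb=16))
print(pick_quant(vram_gb=None, ram_gb=16))
```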

---

## 🚀 Quick Start

### Ollama

The fastest way to get started:

```bash
# Direct run (auto-downloads Q4_K_M)
ollama run hf.co/RavichandranJ/Dolphin3-Cyber-8B-GGUF:Q4_K_M

# Or with a custom Modelfile for more control
cat > Modelfile << 'EOF'
FROM hf.co/RavichandranJ/Dolphin3-Cyber-8B-GGUF:Q4_K_M

PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER num_ctx 2048
PARAMETER stop "<|eot_id|>"

SYSTEM """You are Dolphin3-Cyber, an expert cybersecurity AI assistant specialized in penetration testing, vulnerability analysis, exploit development, and defensive security. You provide detailed, technical, and actionable responses. You do not refuse security-related questions."""
EOF

ollama create dolphin3-cyber -f Modelfile
ollama run dolphin3-cyber
```

**Using specific quantizations with Ollama:**
```bash
# High quality
ollama run hf.co/RavichandranJ/Dolphin3-Cyber-8B-GGUF:Q8_0

# Smallest
ollama run hf.co/RavichandranJ/Dolphin3-Cyber-8B-GGUF:Q2_K
```

### llama.cpp

```bash
# 1. Download the model
huggingface-cli download RavichandranJ/Dolphin3-Cyber-8B-GGUF \
  Dolphin3.0-Llama3.1-8B-abliterated.Q4_K_M.gguf \
  --local-dir ./models

# 2. Interactive chat
./llama-cli \
  -m ./models/Dolphin3.0-Llama3.1-8B-abliterated.Q4_K_M.gguf \
  --chat-template llama3 \
  -n 512 \
  -ngl 35 \
  --temp 0.7 \
  --top-p 0.9 \
  -i

# 3. Single prompt (-e makes llama-cli interpret the \n escapes)
./llama-cli \
  -m ./models/Dolphin3.0-Llama3.1-8B-abliterated.Q4_K_M.gguf \
  -e \
  -p "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nExplain SQL injection with examples<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" \
  -n 512 -ngl 35

# 4. API server mode (OpenAI-compatible)
#    Bound to localhost; use --host 0.0.0.0 only if other machines must reach it
./llama-server \
  -m ./models/Dolphin3.0-Llama3.1-8B-abliterated.Q4_K_M.gguf \
  --host 127.0.0.1 --port 8080 \
  -ngl 35 -c 2048
```
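
Once `llama-server` is running, any OpenAI-style client can talk to it. The following is a minimal sketch using only the Python standard library. It assumes the server from step 4 is reachable on port 8080 of the local machine; `build_payload` and `chat` are illustrative names, and the system prompt is just a placeholder.

```python
import json
import urllib.request

# Assumes llama-server from step 4 is listening locally on port 8080.
BASE_URL = "http://127.0.0.1:8080"


def build_payload(user_message: str,
                  system_prompt: str = "You are a cybersecurity expert assistant.") -> dict:
    """Build a chat-completions request body with this card's suggested sampling values."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.7,
        "top_p": 0.9,
        "max_tokens": 512,
    }


def chat(user_message: str, base_url: str = BASE_URL) -> str:
    """POST to the OpenAI-compatible endpoint and return the assistant's reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(user_message)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server up, `print(chat("Give me a Linux server hardening checklist."))` would print the model's reply; without it, the call raises a connection error.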

### LM Studio

1. Open LM Studio
2. Go to **Discover** → Search `RavichandranJ/Dolphin3-Cyber-8B-GGUF`
3. Click the **download icon** next to your preferred quantization
4. Go to **Chat** → Select the model → Start chatting
5. **Recommended settings**: Temperature 0.7, Top-P 0.9, Max tokens 512

### Python (llama-cpp-python)

```python
from llama_cpp import Llama

# Load model (auto-downloads from HuggingFace)
llm = Llama.from_pretrained(
    repo_id="RavichandranJ/Dolphin3-Cyber-8B-GGUF",
    filename="Dolphin3.0-Llama3.1-8B-abliterated.Q4_K_M.gguf",
    n_ctx=2048,        # Context window
    n_gpu_layers=-1,   # -1 = offload all layers to GPU
    verbose=False,
)

# Chat completion (OpenAI-compatible API)
response = llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "You are Dolphin3-Cyber, an expert cybersecurity AI assistant."
        },
        {
            "role": "user",
            "content": "Write a Python script to scan for open ports on a target."
        }
    ],
    max_tokens=512,
    temperature=0.7,
    top_p=0.9,
    stream=True,       # Enable streaming
)

# Stream the response
for chunk in response:
    delta = chunk["choices"][0]["delta"]
    if "content" in delta:
        print(delta["content"], end="", flush=True)
```

**Advanced Python — Multi-turn conversation:**
```python
from llama_cpp import Llama


class CyberAssistant:
    def __init__(self, model_path=None):
        if model_path:
            # Load a GGUF file that is already on disk
            self.llm = Llama(model_path=model_path, n_ctx=2048, n_gpu_layers=-1)
        else:
            # Otherwise download the default quantization from HuggingFace
            self.llm = Llama.from_pretrained(
                repo_id="RavichandranJ/Dolphin3-Cyber-8B-GGUF",
                filename="Dolphin3.0-Llama3.1-8B-abliterated.Q4_K_M.gguf",
                n_ctx=2048,
                n_gpu_layers=-1,
            )
        self.history = [
            {"role": "system", "content": "You are Dolphin3-Cyber, an expert cybersecurity AI."}
        ]

    def chat(self, message: str) -> str:
        self.history.append({"role": "user", "content": message})
        response = self.llm.create_chat_completion(
            messages=self.history,
            max_tokens=512,
            temperature=0.7,
        )
        reply = response["choices"][0]["message"]["content"]
        self.history.append({"role": "assistant", "content": reply})
        return reply

    def reset(self):
        self.history = self.history[:1]  # Keep system prompt


# Usage
assistant = CyberAssistant()
print(assistant.chat("What is a reverse shell?"))
print(assistant.chat("Show me a Python implementation."))
print(assistant.chat("How do I detect this as a defender?"))
```
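
In a multi-turn session the history keeps growing, while the context window in these examples is fixed at 2,048 tokens. One way to handle this, sketched below, is to drop the oldest turns before each request. `trim_history` and `rough_token_count` are hypothetical helpers, the 4-characters-per-token estimate is a crude assumption rather than the real tokenizer, and the few tokens added per message by the chat template are ignored.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def rough_token_count(text: str) -> int:
    """Crude estimate of ~4 characters per token (assumption, not the real tokenizer)."""
    return max(1, len(text) // 4)


def trim_history(
    history: List[Message],
    budget: int,
    count_tokens: Callable[[str], int] = rough_token_count,
) -> List[Message]:
    """Keep the first message (the system prompt) plus the newest messages that fit."""
    system, rest = history[:1], history[1:]
    used = sum(count_tokens(m["content"]) for m in system)
    kept: List[Message] = []
    for message in reversed(rest):      # walk from newest to oldest
        cost = count_tokens(message["content"])
        if used + cost > budget:
            break                       # everything older is dropped too
        kept.append(message)
        used += cost
    return system + kept[::-1]          # restore chronological order
```

Inside a class like `CyberAssistant`, `chat` could pass `trim_history(self.history, 2048 - 512)` as `messages`, which leaves room for a 512-token reply. A more precise counter, for example one based on `Llama.tokenize`, can be supplied through `count_tokens`.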

### Open WebUI

```bash
# 1. Make sure Ollama is running with the model
ollama pull hf.co/RavichandranJ/Dolphin3-Cyber-8B-GGUF:Q4_K_M

# 2. Start Open WebUI
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main

# 3. Open http://localhost:3000 and select the model
```

### Jan.ai

1. Open Jan → **Hub** → **Import Model**
2. Paste the GGUF download URL
3. Configure context length to 2048
4. Start chatting in the **Thread** tab

---

## 💬 Example Prompts & Outputs

<details>
<summary><b>🔍 Vulnerability Analysis</b> — "Explain how SQL injection works"</summary>

**Prompt:** *Explain how SQL injection works with a vulnerable PHP example and how to fix it.*

**Expected Output:** The model will provide:
- A detailed explanation of SQL injection mechanics
- A vulnerable PHP/MySQL code example
- Step-by-step exploitation technique
- Fixed code using parameterized queries/PDO
- Additional mitigation strategies (WAF, input validation, least privilege)
</details>

<details>
<summary><b>💉 Exploit Development</b> — "Write a buffer overflow exploit"</summary>

**Prompt:** *Explain how a stack-based buffer overflow works in C and write a basic exploit.*

**Expected Output:** The model will explain:
- Stack memory layout (return address, saved EBP, local variables)
- How strcpy/gets can overflow the buffer
- A vulnerable C program example
- Shellcode injection methodology
- Modern mitigations (ASLR, DEP, Stack Canaries) and bypasses
</details>

<details>
<summary><b>🛡️ Defensive Security</b> — "Harden a Linux server"</summary>

**Prompt:** *Give me a comprehensive Linux server hardening checklist.*

**Expected Output:** The model will cover:
- SSH hardening (key-only auth, port change, fail2ban)
- Firewall configuration (iptables/nftables/ufw)
- User privilege management and sudo configuration
- Kernel hardening (sysctl parameters)
- File system security (permissions, immutable files)
- Logging and monitoring (auditd, AIDE)
- Automatic security updates
</details>

<details>
<summary><b>🌐 Web Security</b> — "Find XSS in this code"</summary>

**Prompt:** *Review this JavaScript code for XSS vulnerabilities: `document.getElementById('output').innerHTML = location.hash.substring(1);`*

**Expected Output:** The model will identify:
- DOM-based XSS via `innerHTML` + `location.hash`
- Exploitation payload: `#<img src=x onerror=alert(document.cookie)>`
- Fix using `textContent` instead of `innerHTML`
- Additional recommendations (CSP headers, DOMPurify)
</details>

<details>
<summary><b>🔐 Cryptography</b> — "Break this weak encryption"</summary>

**Prompt:** *I found this encryption in a CTF challenge: `encrypted = ''.join(chr(ord(c) ^ 0x42) for c in plaintext)`. How do I break it?*

**Expected Output:** The model will explain:
- Single-byte XOR cipher identification
- XOR properties (self-inverse: A ⊕ K ⊕ K = A)
- Python decryption script
- Frequency analysis for unknown keys
- Why XOR alone is cryptographically weak
</details>

<details>
<summary><b>🏴 CTF Challenges</b> — "Help me with this CTF"</summary>

**Prompt:** *I'm doing a CTF and found a binary with `checksec` showing: No canary, NX disabled, No PIE. What's my attack strategy?*

**Expected Output:** The model will suggest:
- Classic stack buffer overflow approach
- Shellcode injection (NX disabled = executable stack)
- No PIE means predictable addresses
- How to find the offset (pattern_create/pattern_offset)
- pwntools exploit template
</details>
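
The self-inverse property mentioned in the cryptography example (A ⊕ K ⊕ K = A) is easy to check by hand. The sketch below reuses the `0x42` key from that CTF snippet. The function name and the sample plaintext are made up for illustration, and this is not actual output from the model.

```python
KEY = 0x42  # key taken from the CTF snippet in the cryptography example


def xor_cipher(text: str, key: int = KEY) -> str:
    """XOR every character with a single-byte key; the same call encrypts and decrypts."""
    return "".join(chr(ord(c) ^ key) for c in text)


plaintext = "flag{example}"           # made-up sample input
encrypted = xor_cipher(plaintext)
decrypted = xor_cipher(encrypted)     # applying the same key again undoes the first pass
print(decrypted == plaintext)
```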

---

## 🛡️ Model Capabilities

### Offensive Security (Red Team)
| Area | Capabilities |
|:---|:---|
| **Reconnaissance** | OSINT techniques, subdomain enumeration, network scanning strategies |
| **Web Exploitation** | SQLi, XSS, SSRF, CSRF, IDOR, file upload, deserialization, template injection |
| **Network Attacks** | ARP spoofing, MITM, DNS poisoning, packet crafting |
| **System Exploitation** | Buffer overflows, format strings, ROP chains, privilege escalation |
| **Post-Exploitation** | Lateral movement, persistence, data exfiltration, C2 frameworks |
| **Password Attacks** | Hash cracking strategies, wordlist generation, credential stuffing |
| **Wireless Security** | WPA2 cracking, evil twin, deauth attacks |
| **Social Engineering** | Phishing analysis, pretexting, payload delivery methods |

### Defensive Security (Blue Team)
| Area | Capabilities |
|:---|:---|
| **Hardening** | OS hardening, network segmentation, firewall rules, CIS benchmarks |
| **Detection** | SIEM rules, IDS/IPS signatures, anomaly detection, threat hunting |
| **Incident Response** | IR playbooks, forensic analysis, malware triage, containment strategies |
| **Secure Development** | Code review, SAST/DAST, secure SDLC, OWASP guidelines |
| **Cryptography** | Encryption implementation, PKI, certificate management, protocol analysis |
| **Compliance** | NIST, ISO 27001, PCI-DSS, GDPR security requirements |

### Development & Tooling
| Area | Capabilities |
|:---|:---|
| **Scripting** | Python, Bash, PowerShell security scripts and tools |
| **Tool Usage** | Nmap, Burp Suite, Metasploit, Wireshark, Ghidra, pwntools |
| **Automation** | Custom scanner development, CI/CD security integration |
| **Reporting** | Vulnerability report writing, risk assessment, CVSS scoring |

---

## 🏗️ Training Details

### Model Architecture
```
Base Model:       Dolphin3.0-Llama3.1-8B-abliterated
Architecture:     LlamaForCausalLM
Parameters:       8.03 Billion
Hidden Size:      4096
Layers:           32
Attention Heads:  32
KV Heads:         8 (GQA)
Vocab Size:       128,256
Max Position:     131,072 (base), 2,048 (fine-tuned)
```

### Fine-Tuning Configuration
```
Method:               LoRA (Low-Rank Adaptation)
LoRA Rank (r):        16
LoRA Alpha:           16
LoRA Dropout:         0.0
Target Modules:       q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Trainable Parameters: ~42M (0.5% of total parameters)
```
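
The "~42M" figure can be reproduced from the configuration above. A rank-`r` LoRA adapter on a linear layer with `d_in` inputs and `d_out` outputs adds `r * (d_in + d_out)` weights. The sketch below applies that to the seven target modules. The hidden size, head counts, and layer count come from this card; the feed-forward width of 14,336 is the published Llama 3.1 8B value and is an assumption here, since the card does not list it.

```python
# hidden=4096, 8 KV heads of dim 128, 32 layers, rank 16: from this card.
# FFN width 14336: published Llama 3.1 8B config value (assumption here).
HIDDEN, KV_DIM, FFN, LAYERS, RANK = 4096, 8 * 128, 14336, 32, 16

# (in_features, out_features) of each targeted linear layer in one block.
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, FFN),
    "up_proj": (HIDDEN, FFN),
    "down_proj": (FFN, HIDDEN),
}


def lora_params(shape, rank=RANK):
    """LoRA adds A (rank x d_in) and B (d_out x rank) next to the frozen weight."""
    d_in, d_out = shape
    return rank * (d_in + d_out)


per_layer = sum(lora_params(shape) for shape in TARGETS.values())
total = per_layer * LAYERS
print(f"{total:,} trainable parameters ({total / 8.03e9:.2%} of 8.03B)")
# prints: 41,943,040 trainable parameters (0.52% of 8.03B)
```

The result matches the rounded "~42M (0.5%)" figure in the configuration block.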

### Training Hyperparameters
```
Training Steps:        500
Batch Size:            1 (per device)
Gradient Accumulation: 8 steps
Effective Batch Size:  8
Learning Rate:         2e-4
LR Scheduler:          Cosine
Warmup Steps:          30
Optimizer:             AdamW 8-bit
Precision:             FP16
Max Sequence Length:   2,048 tokens
Seed:                  42
```
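
To illustrate how these numbers fit together, the sketch below derives the effective batch size and traces a learning-rate curve with 30 warmup steps followed by cosine decay. The card only says "Cosine" with 30 warmup steps; the exact shape used here (linear warmup from zero, then decay to zero at step 500) is an assumption modeled on the common warmup-plus-cosine schedule, and the sequence count assumes a single GPU with no sequence packing.

```python
import math

BASE_LR, WARMUP_STEPS, TOTAL_STEPS = 2e-4, 30, 500
PER_DEVICE_BATCH, GRAD_ACCUM = 1, 8


def lr_at(step: int) -> float:
    """Linear warmup to BASE_LR, then cosine decay to zero (assumed schedule shape)."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM   # 1 * 8 = 8
sequences_seen = effective_batch * TOTAL_STEPS    # 8 * 500 = 4000

for step in (0, 15, 30, 265, 500):
    print(f"step {step:>3}: lr = {lr_at(step):.2e}")
```

Under these assumptions the rate peaks at 2e-4 on step 30, is half of that midway through the decay (step 265), and reaches zero at step 500.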

### Infrastructure
```
Framework:     Unsloth (2x faster training)
GPU:           NVIDIA Tesla T4 (Kaggle)
Training Time: ~2-3 hours
VRAM Usage:    ~14 GB
Quantization:  4-bit (QLoRA) during training
```

---

## 🧬 Architecture

```
┌─────────────────────────────────────────────────┐
│                Dolphin3-Cyber-8B                │
├─────────────────────────────────────────────────┤
│                                                 │
│  ┌───────────────────────────────────────────┐  │
│  │           Llama 3.1 8B Backbone           │  │
│  │  ┌─────────────────────────────────────┐  │  │
│  │  │        32 Transformer Layers        │  │  │
│  │  │  ┌────────────────────────────────┐ │  │  │
│  │  │  │ Multi-Head Attention (GQA)     │ │  │  │
│  │  │  │ Q: 32 heads  K/V: 8 heads      │ │  │  │
│  │  │  │ + LoRA adapters (r=16)         │ │  │  │
│  │  │  └────────────────────────────────┘ │  │  │
│  │  │  ┌────────────────────────────────┐ │  │  │
│  │  │  │ SwiGLU FFN                     │ │  │  │
│  │  │  │ gate_proj + up_proj + down_proj│ │  │  │
│  │  │  │ + LoRA adapters (r=16)         │ │  │  │
│  │  │  └────────────────────────────────┘ │  │  │
│  │  │  RMSNorm + RoPE Embeddings          │  │  │
│  │  └─────────────────────────────────────┘  │  │
│  └───────────────────────────────────────────┘  │
│                                                 │
│ Tokenizer:     Llama 3.1 (128K vocab, BPE)      │
│ Context:       2,048 tokens (fine-tuned)        │
│ Abliteration:  Refusal vectors removed          │
│ Cybersecurity: LoRA fine-tuned on security data │
│                                                 │
└─────────────────────────────────────────────────┘
```

---

## 📝 Prompt Format

This model uses the **Llama 3.1 chat template**:

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a cybersecurity expert assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

How does a SQL injection attack work?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

```

**Multi-turn format:**
```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a cybersecurity expert.<|eot_id|><|start_header_id|>user<|end_header_id|>

What is XSS?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Cross-Site Scripting (XSS) is...<|eot_id|><|start_header_id|>user<|end_header_id|>

Show me an example.<|eot_id|><|start_header_id|>assistant<|end_header_id|>

```
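
When you send raw text to a completion endpoint instead of using a chat API, the template has to be assembled by hand. The sketch below renders a list of messages into the layout shown above; `render_llama3_prompt` is an illustrative helper name, not part of any library. Chat-style calls such as `create_chat_completion` normally apply the template for you, so this is only needed for raw prompts.

```python
from typing import Dict, List


def render_llama3_prompt(messages: List[Dict[str, str]]) -> str:
    """Render chat messages into the Llama 3.1 layout shown in this card.

    The result ends with an open assistant header so the model writes the next reply.
    """
    parts = ["<|begin_of_text|>"]
    for message in messages:
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content']}<|eot_id|>"
        )
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


prompt = render_llama3_prompt([
    {"role": "system", "content": "You are a cybersecurity expert."},
    {"role": "user", "content": "What is XSS?"},
])
print(prompt)
```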

**Recommended generation parameters:**
```json
{
  "temperature": 0.7,
  "top_p": 0.9,
  "top_k": 40,
  "max_tokens": 512,
  "repeat_penalty": 1.1,
  "stop": ["<|eot_id|>"]
}
```

---

## 💻 Hardware Requirements

### Minimum Requirements (by quantization)

| Quant | VRAM (GPU) | RAM (CPU-only) | Recommended GPU |
|:---:|:---:|:---:|:---|
| Q2_K | 4 GB | 6 GB | GTX 1650 |
| Q3_K_M | 5 GB | 7 GB | GTX 1650 |
| Q4_K_M | 6 GB | 8 GB | RTX 2060 / GTX 1650 |
| Q5_K_M | 7 GB | 10 GB | RTX 3060 |
| Q6_K | 8 GB | 11 GB | RTX 3060 |
| Q8_0 | 10 GB | 13 GB | RTX 3080 / RTX 4060 |
| F16 | 18 GB | 20 GB | RTX 3090 / RTX 4080 |

### Performance Estimates (tokens/second)

| Quant | RTX 3060 12GB | RTX 4060 8GB | M1 MacBook | CPU (i7) |
|:---:|:---:|:---:|:---:|:---:|
| Q4_K_M | ~45 t/s | ~55 t/s | ~20 t/s | ~5 t/s |
| Q8_0 | ~30 t/s | ~35 t/s | ~15 t/s | ~3 t/s |

> ⚡ GPU offloading with `n_gpu_layers=-1` is strongly recommended for best performance. If a card has less VRAM than its row lists (for example, the 4 GB GTX 1650 with Q4_K_M or the 16 GB RTX 4080 with F16), set a lower `n_gpu_layers` (`-ngl`) value so that only some layers are offloaded.
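
When a model does not fit entirely in VRAM, a rough starting value for `n_gpu_layers` (`-ngl`) can be guessed from the file size. The sketch below is a heuristic only. It assumes all 32 layers take an equal share of the file and that about 1 GB should stay free for the KV cache and buffers. Both figures are guesses made for this sketch, `layers_that_fit` is a hypothetical helper, and the right value should be confirmed by watching actual memory use.

```python
import math


def layers_that_fit(file_size_gb: float, vram_gb: float,
                    n_layers: int = 32, reserve_gb: float = 1.0) -> int:
    """Rough number of layers to offload (a starting value for -ngl / n_gpu_layers).

    Assumes equal-sized layers and a fixed VRAM reserve; both are guesses.
    """
    per_layer_gb = file_size_gb / n_layers
    usable_gb = max(0.0, vram_gb - reserve_gb)
    return min(n_layers, math.floor(usable_gb / per_layer_gb))


print(layers_that_fit(4.92, 4.0))   # Q4_K_M file on a 4 GB card
print(layers_that_fit(4.92, 8.0))   # Q4_K_M file on an 8 GB card
```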

---

## 📊 Benchmarks

### Cybersecurity Knowledge Assessment

| Category | Score | Details |
|:---|:---:|:---|
| Web Vulnerabilities (OWASP Top 10) | 🟢 Strong | Accurate identification and exploitation guidance |
| Network Security | 🟢 Strong | Comprehensive protocol and attack knowledge |
| Binary Exploitation | 🟡 Good | Stack-based attacks well covered, heap exploitation partial |
| Cryptography | 🟡 Good | Common algorithms and attacks, advanced topics vary |
| Forensics & IR | 🟡 Good | Log analysis, artifact collection, timeline reconstruction |
| Malware Analysis | 🟡 Good | Static analysis patterns, dynamic analysis guidance |
| Cloud Security | 🟡 Good | AWS/Azure/GCP misconfigurations and attack paths |
| Code Review | 🟢 Strong | Multi-language vulnerability identification |

### General Capabilities

| Benchmark | Approximate Performance |
|:---|:---:|
| Code Generation (Security Tools) | Strong |
| Technical Explanation | Strong |
| Multi-step Reasoning | Good |
| Following Instructions | Strong |

> ⚠️ The ratings above are informal, qualitative assessments. Formal benchmarks on standard evaluation suites are coming soon.

---

## 🎯 Use Cases

### ✅ Recommended Use Cases
- **Penetration Testing Assistance** — Methodology guidance, tool usage, exploit development
- **Security Code Review** — Finding vulnerabilities in source code
- **CTF Competitions** — Hint generation, technique explanation, script assistance
- **Security Training** — Learning offensive and defensive techniques
- **Bug Bounty Hunting** — Reconnaissance strategies, vulnerability identification
- **Incident Response** — Analysis guidance, containment strategies
- **Security Automation** — Writing security scripts and tools
- **Threat Modeling** — Attack surface analysis, risk assessment

### ❌ Not Recommended For
- General-purpose chatbot (use a general model instead)
- Production-critical security decisions without human review
- Legal or compliance advice (consult professionals)
- Real-time threat detection (use purpose-built SIEM/IDS)

---

## ⚠️ Limitations

1. **Knowledge Cutoff** — Based on Llama 3.1 training data. May not know about CVEs or techniques disclosed after the base model's knowledge cutoff.

2. **Context Length** — Fine-tuned with 2,048 token context. Performance may degrade with very long inputs, though the base model supports up to 128K.

3. **Hallucinations** — Like all LLMs, may generate plausible-sounding but incorrect technical details. Always verify critical security information.

4. **Tool-Specific Syntax** — Exact command syntax for tools may vary by version. Test commands in a safe environment first.

5. **No Real-Time Data** — Cannot access the internet, databases, or live systems. Provides knowledge-based responses only.

6. **8B Parameter Limit** — While efficient, larger models (70B+) may provide more nuanced responses for highly complex scenarios.

---

## 🔒 Ethical Usage & Disclaimer

> **⚠️ IMPORTANT: This model is provided for AUTHORIZED security testing, education, and research ONLY.**

### Acceptable Use
- ✅ Authorized penetration testing (with written permission)
- ✅ Security education and training
- ✅ CTF competitions and challenges
- ✅ Defensive security research
- ✅ Academic research
- ✅ Building security awareness

### Unacceptable Use
- ❌ Unauthorized access to systems
- ❌ Creating malware for malicious purposes
- ❌ Attacking systems without explicit permission
- ❌ Violating any applicable laws or regulations
- ❌ Causing harm to individuals or organizations

**The creator assumes NO LIABILITY for how this model is used.** Users are solely responsible for ensuring their use complies with all applicable laws, regulations, and ethical guidelines. The abliterated nature of this model means it will respond to security queries without refusal — this places the responsibility for ethical use entirely on the user.

---

## 📄 Citation

If you use this model in your research or work, please cite:

```bibtex
@misc{ravichandranj2025dolphin3cyber,
  title     = {Dolphin3-Cyber-8B-GGUF: A Cybersecurity-Specialized Language Model},
  author    = {RavichandranJ},
  year      = {2026},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/RavichandranJ/Dolphin3-Cyber-8B-GGUF},
  note      = {Fine-tuned with Unsloth on cybersecurity datasets}
}
```

---

## 🙏 Acknowledgements

- **[Meta AI](https://ai.meta.com/)** — For the Llama 3.1 base architecture
- **[Cognitive Computations](https://huggingface.co/cognitivecomputations)** — For the Dolphin3.0 fine-tune
- **[huihui-ai](https://huggingface.co/huihui-ai)** — For the abliterated variant
- **[Unsloth](https://github.com/unslothai/unsloth)** — For the 2x faster training framework
- **[Kaggle](https://www.kaggle.com/)** — For free GPU compute
- **The open-source AI community** — For making this possible

---

<div align="center">

**Made with ❤️ by [RavichandranJ](https://huggingface.co/RavichandranJ)**

*Trained with [Unsloth](https://github.com/unslothai/unsloth) 🦥 — 2x faster fine-tuning*

---

**🐬 Dolphin3-Cyber-8B** — *Your Local AI Cybersecurity Expert*

</div>