---
base_model: unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- cybersecurity
- vulnerability-analysis
- exploit-code
license: apache-2.0
language:
- en
datasets:
- NVD/CVE
- exploitdb
- MITRE/CWE
metrics:
- perplexity
- rouge
- bleu
- meteor
- bertscore
---

# Qwen2.5-3B-CyberSec-Instruct

A fine-tuned version of `Qwen2.5-3B-Instruct` designed for cybersecurity analysis. The model is built to bridge the gap between high-level vulnerability descriptions and low-level exploit code.

- **Developed by:** Mohamedabul
- **License:** apache-2.0
- **Finetuned from model:** unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit
- **Architecture:** 3B parameters (4-bit QLoRA)

This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

<!-- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) -->

---

## Model Capabilities

This LLM acts as an expert cybersecurity analyst and reverse engineer. It is capable of:

1. **Vulnerability Triage**: Generating structured severity, attack vector, and mitigation reports for a given CVE.
2. **Exploit Reverse-Engineering**: Analyzing raw exploit code (C, Python, Bash) to produce a technical breakdown of *how* the exploit works and which vulnerabilities it targets.
3. **Attack Chain Reasoning**: Combining a CVE with raw exploit code to generate a step-by-step kill-chain analysis, from initial access to system compromise.

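The three capabilities above map onto three prompt shapes. The sketch below is one hypothetical way to build them. The `build_instruction` helper, the task names, and the exact wording are illustrative assumptions, not the prompt format used during fine-tuning; only the triage wording mirrors the example prompt in this card.

```python
from typing import Optional


def build_instruction(task: str, cve_id: Optional[str] = None,
                      exploit_code: Optional[str] = None) -> str:
    """Build a user instruction for one of the three capabilities.

    Task names and phrasing are hypothetical; adjust them to whatever
    wording works best with the model.
    """
    if task == "triage":
        if not cve_id:
            raise ValueError("triage needs a CVE identifier")
        return (f"Analyze this vulnerability: {cve_id}. "
                "Provide attack vectors, severity, and mitigation.")
    if task == "reverse_engineer":
        if not exploit_code:
            raise ValueError("reverse_engineer needs exploit code")
        return ("Explain how the following exploit works and which "
                f"vulnerability it targets:\n\n{exploit_code}")
    if task == "attack_chain":
        if not (cve_id and exploit_code):
            raise ValueError("attack_chain needs a CVE identifier and exploit code")
        return (f"Given {cve_id} and the exploit below, describe the attack "
                "chain step by step, from initial access to system compromise:"
                f"\n\n{exploit_code}")
    raise ValueError(f"unknown task: {task!r}")


# The returned string can be used as the "content" of a user chat message.
print(build_instruction("triage", cve_id="CVE-2021-44228"))
```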
---

## Training Data

The model was fine-tuned on a large corpus of recent vulnerabilities and exploits, used in full with no sample caps or filtering:
- **NVD CVE Database**: Vulnerabilities published between 2020 and 2025.
- **Exploit-DB**: More than 45,000 real-world exploits directly from Offensive Security.
- **MITRE CWE**: Full weakness classifications, likelihood of exploit, and abstraction levels.
- **Total Dataset Size**: ~187,700 structured instruction samples.

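The card describes the corpus as structured instruction samples but does not publish their schema. As a rough illustration only, the sketch below turns one vulnerability record into an instruction/response pair. The field names (`cve_id`, `description`, `cwe_id`, `severity`), the `to_instruction_sample` helper, and the placeholder record are all assumptions, not the actual preprocessing pipeline.

```python
import json


def to_instruction_sample(record: dict) -> dict:
    """Convert one vulnerability record into an instruction/response pair.

    The record layout is hypothetical; real NVD, Exploit-DB, and CWE
    exports use different and richer schemas.
    """
    instruction = (
        f"Analyze this vulnerability: {record['cve_id']}. "
        "Provide attack vectors, severity, and mitigation."
    )
    response = (
        f"Description: {record['description']}\n"
        f"Weakness: {record.get('cwe_id', 'unknown')}\n"
        f"Severity: {record.get('severity', 'unknown')}"
    )
    return {"instruction": instruction, "response": response}


# Placeholder record, not real vulnerability data.
example = {
    "cve_id": "CVE-0000-0000",
    "description": "Placeholder description of a vulnerability.",
    "cwe_id": "CWE-000",
    "severity": "HIGH",
}
print(json.dumps(to_instruction_sample(example), indent=2))
```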
---

## Evaluation Metrics

The fine-tuned model was evaluated on a held-out test set, unseen during training, to quantify its generation quality on cybersecurity content.

| Metric | Score | Interpretation |
|--------|-------|----------------|
| **Perplexity** | **7.61** | *Excellent.* Low per-token uncertainty on held-out security text. |
| **METEOR** | **0.4084** | *Very Good.* The model captures semantic meaning, with credit for stem and synonym matches against the references. |
| **ROUGE-1** | **0.3496** | Unigram overlap with the reference reports. |
| **ROUGE-L** | **0.2044** | Longest-common-subsequence overlap, reflecting sentence-level alignment with the reference vulnerability reports. |

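For readers less familiar with these metrics, the sketch below shows how perplexity, ROUGE-1, and ROUGE-L are defined, using plain Python and whitespace tokenization. It is not the evaluation script behind the table (the card does not say which implementation was used), and the function names and toy inputs are illustrative assumptions. As a point of reference, a perplexity of 7.61 corresponds to an average loss of ln(7.61) ≈ 2.03 nats per token.

```python
import math
from collections import Counter


def perplexity(token_nlls):
    """exp of the mean per-token negative log-likelihood (natural log)."""
    return math.exp(sum(token_nlls) / len(token_nlls))


def _f1(overlap, cand_len, ref_len):
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    if overlap == 0:
        return 0.0
    precision, recall = overlap / cand_len, overlap / ref_len
    return 2 * precision * recall / (precision + recall)


def rouge_1(candidate, reference):
    """Unigram-overlap F1 on lowercased, whitespace-split tokens."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    overlap = sum((Counter(cand) & Counter(ref)).values())
    return _f1(overlap, len(cand), len(ref))


def rouge_l(candidate, reference):
    """F1 based on the longest common subsequence (LCS) of tokens."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    # Classic dynamic-programming LCS table.
    dp = [[0] * (len(ref) + 1) for _ in range(len(cand) + 1)]
    for i, c in enumerate(cand, 1):
        for j, r in enumerate(ref, 1):
            if c == r:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return _f1(dp[-1][-1], len(cand), len(ref))


# Toy inputs, chosen only to exercise the functions.
generated = "remote code execution via jndi lookup"
reference = "jndi lookup allows remote code execution"
print(perplexity([2.0, 2.0, 2.0]))
print(rouge_1(generated, reference))
print(rouge_l(generated, reference))
```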
---

## Usage & Inference

To load the model quickly using Unsloth for 2x faster inference:

```python
from unsloth import FastLanguageModel

# Load the model directly from this Hugging Face repository
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Mohamedabul/Qwen2.5-3B-CyberSec-Instruct",  # or your exact repo name
    max_seq_length=1024,
    dtype=None,  # None lets Unsloth pick float16 or bfloat16 for the GPU
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # switch to Unsloth's inference mode

# Example Prompt
instruction = "Analyze this vulnerability: CVE-2021-44228 (Log4Shell). Provide attack vectors, severity, and mitigation."
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": instruction}], tokenize=False, add_generation_prompt=True
)

# Tokenize the prompt, generate, and decode (the decoded text includes the prompt)
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, use_cache=True)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])