---
license: apache-2.0
language:
- en
- code
metrics:
- code_eval
- accuracy
base_model:
- Qwen/Qwen2.5-Coder-1.5B-Instruct
new_version: Qwen/Qwen2.5-Coder-1.5B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- cybersecurity
- security
- mythos
- qwen
- qwen-security
- fine-tuned
- blue
- team
- blue-team
- cve
- ctf
- code
- code-security
- exploit-development
- vulnerability-research
- php
- mybb
- python
---

# Mythos Engine - Qwen 2.5 Coder 1.5B Security Fine-Tune

## 🔥 Model Description

Mythos Engine is a specialized fine-tune of **Qwen 2.5 Coder 1.5B Instruct** designed for **cybersecurity research, vulnerability analysis, and exploit development**. It was trained on a curated dataset of 700+ reasoning-heavy security examples covering PHP internals, MyBB exploitation, deserialization chains, type juggling, and advanced Python exploit synthesis.

The model employs **Chain-of-Thought reasoning with self-correction loops** and mathematical logic notation to produce accurate, production-ready security code.

## 🎯 Intended Use

- **Security Research**: Analyzing CVEs and understanding exploit mechanics
- **Red Team Education**: Learning exploit development patterns
- **Blue Team Defense**: Understanding attack vectors to build better detections
- **CTF & Training**: Solving complex security challenges

**⚠️ Important**: This model is for **educational and authorized security testing only**. Do not use it for unauthorized access or malicious purposes.

## 🧠 Training Details

| Aspect | Details |
| :--- | :--- |
| **Base Model** | Qwen/Qwen2.5-Coder-1.5B-Instruct |
| **Fine-Tuning Method** | QLoRA (4-bit quantization) with Unsloth |
| **Dataset Size** | 1000+ examples |
| **Epochs** | 4 |
| **Learning Rate** | 1e-5 |
| **Sequence Length** | 4096 |
| **Final Training Loss** | 2.02 |

## 📊 Dataset Composition

The training dataset includes:

- **40% PHP Vulnerabilities**: Type juggling, deserialization, filter chains, disable_functions bypasses
- **25% MyBB Exploits**: Admin CP RCE, SQL injection, XSS chains
- **20% Python Exploit Development**: C2 frameworks, scanners, injection techniques
- **10% Blue Team Detection**: Sigma/YARA rules, log analysis
- **5% Cryptographic Attacks**: Timing attacks, padding oracles, hash length extension

## 🚀 How to Use
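
The usage example in this section writes its prompt by hand in the ChatML layout (`<|im_start|>role ... <|im_end|>`) used by the Qwen 2.5 base model. As a minimal sketch of that layout, the hypothetical helper `build_chatml_prompt` below (not part of `transformers`) assembles the same kind of string from a list of messages. The user message is only an illustrative placeholder.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Assemble a ChatML prompt string from role/content dicts (hypothetical helper)."""
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>\n
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # An open assistant turn tells the model to start answering.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {
        "role": "system",
        "content": "You are Mythos Engine, an elite security AI. "
                   "Think step-by-step with self-correction.",
    },
    # Placeholder question, chosen only for illustration.
    {"role": "user", "content": "Explain how PHP loose comparison enables type juggling bugs."},
]

prompt = build_chatml_prompt(messages)
print(prompt)
```

With a loaded tokenizer, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` should produce a similar string from the chat template bundled with the model, which avoids maintaining the format by hand.
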

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "expper/mythos-qwen-1.5b-final",
    device_map="auto",
    torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained("expper/mythos-qwen-1.5b-final")

prompt = """<|im_start|>system
You are Mythos Engine, an elite security AI. Think step-by-step with self-correction.<|im_end|>
<|im_start|>user
Explain CVE-2022-43772 (MyBB Admin CP Avatar RCE) and write a PoC.<|im_end|>
<|im_start|>assistant
"""

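inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Enable sampling explicitly so that `temperature` takes effect.
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.6)
# Decode only the newly generated tokens so the prompt is not echoed back.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))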