A fine-tune of openai/gpt-oss-20b, a Mixture of Experts (MoE) model with 21B total parameters and 3.6B active per token, specialized for cybersecurity tasks.
This is a merged model: the LoRA weights are merged into the base model, so it can be deployed as a single checkpoint without loading a separate adapter.
Model Description
GPT-OSS-20B is a Mixture of Experts (MoE) model. Each token is routed to a subset of the experts, so inference costs less than the total parameter count suggests.
Total Parameters: 21B
Active Parameters: 3.6B (only the experts selected for a token run for that token)
Architecture: MoE (Mixture of Experts)
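To illustrate what "active parameters" means, here is a minimal sketch of top-k expert routing in plain Python. The expert count, router scores, and the choice to softmax over only the selected experts are toy assumptions for illustration, not the actual GPT-OSS configuration; `top_k_route` and `moe_forward` are hypothetical names.

```python
import math

def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    # Indices of the k largest logits, highest first (ties keep index order).
    chosen = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the weights sum to 1.
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]

def moe_forward(x, experts, router_logits, k):
    """Weighted sum of the selected experts' outputs; unselected experts never run."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))

# Toy setup (assumption): 4 experts that each scale the input differently.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
logits = [0.1, 2.0, -1.0, 2.0]  # hypothetical router scores for one token
routes = top_k_route(logits, k=2)
output = moe_forward(1.0, experts, logits, k=2)

# Share of parameters active per token, using the figures quoted above.
active_fraction = 3.6 / 21
```

With the figures above, roughly 17% of the parameters take part in any single token's forward pass.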
This model was trained on ~50,000 cybersecurity instruction-response pairs from:
Trendyol Cybersecurity Dataset (35K samples)
Fenrir v2.0 Dataset (12K samples)
Primus-Instruct (3x upsampled)
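The mixture above can be sketched as a concatenation in which an upsampled source is simply repeated before shuffling. This is only an illustration of what "3x upsampled" typically means; the actual data pipeline is not described here, and `build_mixture`, the record layout, and the tiny sample counts are hypothetical.

```python
import random

def build_mixture(sources, seed=0):
    """Concatenate sources, repeating each one `factor` times, then shuffle."""
    mixed = []
    for examples, factor in sources.values():
        mixed.extend(examples * factor)  # factor=3 means each example appears 3 times
    random.Random(seed).shuffle(mixed)   # fixed seed keeps the order reproducible
    return mixed

def make_rows(name, count):
    """Placeholder records standing in for real instruction-response pairs."""
    return [{"source": name, "id": i} for i in range(count)]

# Tiny stand-in sizes (assumption); the real datasets are far larger.
sources = {
    "trendyol": (make_rows("trendyol", 7), 1),
    "fenrir": (make_rows("fenrir", 5), 1),
    "primus": (make_rows("primus", 2), 3),  # 3x upsampled
}
mixture = build_mixture(sources)
```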
Training Details
| Parameter | Value |
| --- | --- |
| Base Model | openai/gpt-oss-20b |
| Architecture | MoE (21B total, 3.6B active) |
| Training Samples | ~50,000 |
| Epochs | 2 |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| Learning Rate | 2e-4 |
| Max Sequence Length | 1024 |
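As a rough illustration of what merging LoRA weights into the base involves, the sketch below applies the standard LoRA update W' = W + (alpha / r) * B A to a toy matrix in plain Python. It assumes the conventional alpha / r scaling; with rank 16 and alpha 32 that factor is 2. The toy example uses rank 2 and alpha 4 to keep the same ratio with small matrices. `matmul` and `merge_lora` are hypothetical helpers, not the code used to produce this checkpoint.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(weight, lora_a, lora_b, rank, alpha):
    """Return W + (alpha / rank) * (B @ A), the merged weight matrix."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # low-rank update, same shape as W
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]

# Toy values (assumption): a 2x2 base weight with a rank-2 adapter.
W = [[1, 0], [0, 1]]
B = [[1, 0], [0, 1]]        # shape: d_out x rank
A = [[0.5, 0], [0, 0.25]]   # shape: rank x d_in

scale_from_table = 32 / 16  # LoRA Alpha / LoRA Rank from the table above
merged = merge_lora(W, A, B, rank=2, alpha=4)  # same alpha / rank ratio
```

After merging, the adapter matrices are no longer needed at inference time, which is why the model loads like an ordinary checkpoint.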
Usage
    from transformers import AutoModelForCausalLM, AutoTokenizer
    import torch

    # Load the merged checkpoint in bfloat16 and let Accelerate place it on the available devices.
    model = AutoModelForCausalLM.from_pretrained(
        "sainikhiljuluri2015/GPT-OSS-Cybersecurity-20B-Merged",
        torch_dtype=torch.bfloat16,
        device_map="auto",
        trust_remote_code=True,
    )
    tokenizer = AutoTokenizer.from_pretrained(
        "sainikhiljuluri2015/GPT-OSS-Cybersecurity-20B-Merged",
        trust_remote_code=True,
    )

    prompt = "What are the indicators of a ransomware attack?"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # temperature only takes effect when sampling is enabled.
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
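Because the base model is chat-tuned, formatting the prompt through the tokenizer's chat template may give better results than a raw string. The sketch below assumes the merged repository ships the base model's chat template, which is not confirmed here; `build_messages` and `chat` are hypothetical helper names.

```python
MODEL_ID = "sainikhiljuluri2015/GPT-OSS-Cybersecurity-20B-Merged"

def build_messages(question, system_prompt=None):
    """Build a chat-style message list, with an optional system prompt first."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": question})
    return messages

def chat(question, system_prompt=None, max_new_tokens=256):
    """Load the model, apply the chat template, and return only the generated reply."""
    # Heavy imports stay inside the function so build_messages works without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    # Assumption: the tokenizer provides a chat template for this checkpoint.
    inputs = tokenizer.apply_chat_template(
        build_messages(question, system_prompt),
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,
    ).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Slice off the prompt tokens so only the new text is decoded.
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `chat("What are the indicators of a ransomware attack?")` downloads and loads the full model, so it needs the same hardware as the example above.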
API Usage
    import requests

    API_URL = "https://YOUR_ENDPOINT_URL/v1/chat/completions"

    response = requests.post(
        API_URL,
        json={
            "model": "sainikhiljuluri2015/GPT-OSS-Cybersecurity-20B-Merged",
            "messages": [{"role": "user", "content": "What is SQL injection?"}],
            "max_tokens": 300,
        },
    )
    print(response.json()["choices"][0]["message"]["content"])
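For anything beyond a quick test, it may help to add a timeout and basic error handling around the request. The following is one possible sketch; `build_payload`, `extract_reply`, and `ask` are hypothetical helpers, the 60-second timeout is an arbitrary choice, and the endpoint is assumed to return an OpenAI-compatible chat completion body as in the example above.

```python
MODEL = "sainikhiljuluri2015/GPT-OSS-Cybersecurity-20B-Merged"

def build_payload(question, model=MODEL, max_tokens=300):
    """Build the JSON body for a chat completions request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "max_tokens": max_tokens,
    }

def extract_reply(body):
    """Return the first choice's text, or raise if the response has no choices."""
    choices = body.get("choices") or []
    if not choices:
        raise ValueError(f"no choices in response: {body!r}")
    return choices[0]["message"]["content"]

def ask(api_url, question, timeout=60):
    """POST the question to the endpoint and return the model's reply."""
    import requests  # imported here so the helpers above work without it

    response = requests.post(api_url, json=build_payload(question), timeout=timeout)
    response.raise_for_status()  # surface HTTP errors instead of failing on a KeyError
    return extract_reply(response.json())
```

Usage would look like `ask("https://YOUR_ENDPOINT_URL/v1/chat/completions", "What is SQL injection?")`, with the placeholder URL replaced by a real endpoint.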