---
language:
- en
pipeline_tag: text-generation
tags:
- SafetyAlignment
---
Trained with the code at https://github.com/YuanBoXie/DeepRefusal

[1] [Beyond Surface Alignment: Rebuilding LLMs Safety Mechanism via Probabilistically Ablating Refusal Direction, EMNLP 2025](https://arxiv.org/abs/2509.15202)