---
language:
- en
pipeline_tag: text-generation
tags:
- SafetyAlignment
---

Trained by https://github.com/YuanBoXie/DeepRefusal

[1] [Beyond Surface Alignment: Rebuilding LLMs Safety Mechanism via Probabilistically Ablating Refusal Direction, EMNLP 2025](https://arxiv.org/abs/2509.15202)