Initialize project; model provided by the ModelHub XC community

Model: Goekdeniz-Guelmez/Qwen3-0.6B-gabliterated Source: Original Platform
2026-06-01 23:18:22 +08:00
commit c599320d82
15 changed files with 152013 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,91 @@
+---
+base_model: Qwen/Qwen3-0.6B
+tags:
+  - uncensored
+  - gabliteration
+datasets:
+  - mlabonne/harmless_alpaca
+  - mlabonne/harmful_behaviors
+library_name: gabliteration
+arxiv: "2512.18901"
+
+model-index:
+  - name: Qwen_Qwen3-0.6B-gabliterated
+    results:
+      - task:
+          type: text-generation
+        dataset:
+          type: harmless_alpaca
+          name: Harmless Alpaca
+        metrics:
+          - name: KL Divergence
+            type: pass@1
+            value: 0.0591
+
+      - task:
+          type: text-generation
+        dataset:
+          type: harmful_behaviors
+          name: Harmful Behaviors
+        metrics:
+          - name: Refusal Rate
+            type: pass@1
+            value: 0.05
+---
+
+# Gabliterated Model Series
+
+![Logo/JPG](gabliteration-logo.jpg)
+
+## Overview
+
+With this model series, I introduce **Gabliteration**, a neural weight modification technique that goes beyond traditional abliteration by using adaptive multi-directional projections with regularized layer selection.
+Gabliteration addresses a fundamental limitation of existing abliteration methods: they compromise model quality while attempting to modify specific behavioral patterns.
+
+```text
+Refusal: 5/100
+KL Div: 0.0591
+Config:
+    Samples: 400
+    Skip: [4, 3]
+    Layer: 0.66 (selected: 18)
+    Scale: 0.48
+    λ: 0.05
+    k: 3
+    β: 0.54
+    Adaptive: False
+    τ: 0.84
+```
+
+## Model Variants
+
+This series includes models ranging from 0.6B to 32B parameters, demonstrating the scalability and effectiveness of the Gabliteration technique across different model sizes.
+
+## Quants
+
+- [GGUF (mradermacher)]()
+
+## Technical Background
+
+Gabliteration builds on the foundational work of Arditi et al. (2024) on single-direction abliteration and extends it to a multi-directional framework with theoretical guarantees.
+My method applies singular value decomposition to the difference matrices between harmful and harmless prompt representations to extract multiple refusal directions.
+
+### Dynamic Layer Selection
+
+This model was created with fixed rather than dynamic layer selection (`Adaptive: False` in the config above).
+The layer fraction of 0.66 was chosen by empirical tuning.
+
+Selected layer: **18** (out of 28 total layers)
+
+## Citation
+
+If you use these models, please cite the original research:
+
+```
+Gülmez, G. (2025). Gabliteration: Adaptive Multi-Directional Neural Weight Modification for Selective Behavioral Alteration in Large Language Models. https://arxiv.org/abs/2512.18901
+```
+
+## Acknowledgments
+
+This work builds upon the foundational research by Arditi et al. (2024) on refusal direction identification in large language models.
+
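Reader's note on the README's Technical Background section: the idea of taking an SVD of the difference between harmful and harmless prompt representations to obtain several refusal directions can be illustrated with a small NumPy sketch. This is not the author's implementation. The function names `extract_directions` and `project_out`, the row-wise pairing of samples, the synthetic inputs, and the way `scale` is applied are all assumptions made for illustration; only `k` (3) and a scale below 1 (0.48) echo values from the config block in the README.

```python
import numpy as np


def extract_directions(harmful, harmless, k=3):
    """Return the top-k right singular vectors of the difference matrix.

    harmful, harmless: arrays of shape (n_samples, hidden_dim) holding
    hidden states collected at one layer (hypothetical inputs; pairing
    rows one-to-one is an assumption of this sketch).
    """
    diff = harmful - harmless  # (n_samples, hidden_dim)
    # Rows of vt are orthonormal directions in hidden space,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(diff, full_matrices=False)
    return vt[:k]  # (k, hidden_dim)


def project_out(weight, directions, scale=1.0):
    """Subtract a scaled projection onto the directions from a weight matrix.

    weight: (hidden_dim, in_dim) matrix whose outputs live in hidden space.
    scale=1.0 removes the directions completely; smaller values remove
    only part of them (assumed meaning of a scale parameter).
    """
    projector = directions.T @ directions  # (hidden_dim, hidden_dim)
    return weight - scale * (projector @ weight)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hidden_dim, n_samples = 64, 400

    # Synthetic data: harmful states differ from harmless ones mainly
    # along one planted direction, plus a little noise.
    planted = rng.normal(size=hidden_dim)
    planted /= np.linalg.norm(planted)
    harmless = rng.normal(size=(n_samples, hidden_dim))
    shift = np.outer(3.0 + rng.normal(size=n_samples), planted)
    harmful = harmless + shift + 0.01 * rng.normal(size=(n_samples, hidden_dim))

    dirs = extract_directions(harmful, harmless, k=3)
    weight = rng.normal(size=(hidden_dim, 16))
    new_weight = project_out(weight, dirs, scale=0.48)

    print("alignment with planted direction:", abs(dirs[0] @ planted))
    print("residual along directions:", np.linalg.norm(dirs @ new_weight))
```

Because the rows returned by the SVD are orthonormal, `directions.T @ directions` is an orthogonal projector onto their span, so `scale=1.0` leaves no component of the weight's output along any extracted direction, while a partial scale only shrinks that component.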