---
license: llama3.3
base_model:
- shb777/Llama-3.3-8B-Instruct-128K
pipeline_tag: text-generation
tags:
- heretic
- uncensored
- decensored
- abliterated
---
This is an abliterated variant of **Llama-3.3-8B-Instruct-128K**, produced with p-e-w's [Heretic](https://github.com/p-e-w/heretic) (v1.1.0) abliteration engine with the [Magnitude-Preserving Orthogonal Ablation PR](https://github.com/p-e-w/heretic/pull/52) merged in.

---
<img src="https://img.shields.io/badge/HERESY_INDEX-ABSOLUTE-white?style=flat-square&labelColor=101010" align="right" width="250">

**Heretication Results**

| Score Metric | Value | Parameter | Value |
| :--- | :--- | :--- | :--- |
| **Refusals** | 9/100 | **direction_index** | 15.60 |
| **KL Divergence** | 0.0413 | **attn.o_proj.max_weight** | 1.64 |
| **Initial Refusals** | 96/100 | **attn.o_proj.max_weight_position** | 27.79 |
| | | **attn.o_proj.min_weight** | 1.54 |
| | | **attn.o_proj.min_weight_distance** | 18.14 |
| | | **mlp.down_proj.max_weight** | 1.29 |
| | | **mlp.down_proj.max_weight_position** | 23.54 |
| | | **mlp.down_proj.min_weight** | 1.12 |
| | | **mlp.down_proj.min_weight_distance** | 11.43 |

---
## Degree of Heretication
The **Heresy Index** weighs how much the process corrupted the resulting model (KL Divergence) against how thoroughly it abolished doctrine (Refusals) to reach a final classification.

| Index Entry | Classification | Analysis |
| :--- | :--- | :--- |
|  | **Absolute Heresy** | Fewer than 10/100 Refusals and less than 0.10 KL Divergence |
|  | **Tainted Heresy** | Around 11-25/100 Refusals and/or 0.11-0.20 KL Divergence |
|  | **Impotent Heresy** | Anything above 25/100 Refusals and 0.21 KL Divergence |

**Note**: This is an arbitrary classification inspired by Warhammer 40K and gives no tangible indication of the model's performance.

---
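The Heresy Index thresholds above can be read as a small decision rule. The sketch below is one possible reading, not Heretic's own code: the function name `heresy_index` is hypothetical, each metric is graded separately and the worse grade wins, and values that fall in the gaps the table leaves open (exactly 10 refusals, or a KL Divergence between the listed bands) are assumed to count as Tainted. The table's "and" in the Impotent row could also be read more strictly.

```python
def heresy_index(refusals: int, kl_divergence: float) -> str:
    """Classify a run using the thresholds from the Heresy Index table.

    Assumption (not stated in the model card): each metric gets its own
    grade and the worse of the two decides the classification.
    """
    # Grade refusals out of 100 prompts: 0 = best, 2 = worst.
    if refusals < 10:
        refusal_grade = 0
    elif refusals <= 25:
        refusal_grade = 1
    else:
        refusal_grade = 2

    # Grade KL divergence; values between the listed bands count as Tainted.
    if kl_divergence < 0.10:
        kl_grade = 0
    elif kl_divergence <= 0.21:
        kl_grade = 1
    else:
        kl_grade = 2

    labels = ("Absolute Heresy", "Tainted Heresy", "Impotent Heresy")
    return labels[max(refusal_grade, kl_grade)]


# Values reported in the Heretication Results table for this model.
print(heresy_index(refusals=9, kl_divergence=0.0413))  # Absolute Heresy
```

With the reported 9/100 refusals and 0.0413 KL Divergence, this reading agrees with the "ABSOLUTE" badge shown at the top of the card.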
# Llama 3.3 8B 128K Instruct (Fixed)

> [!IMPORTANT]
> Original model: [allura-forge/Llama-3.3-8B-Instruct](https://huggingface.co/allura-forge/Llama-3.3-8B-Instruct). Thanks!

> [!TIP]
> [imatrix GGUFs by mradermacher (Recommended)](https://huggingface.co/mradermacher/Llama-3.3-8B-Instruct-128K-i1-GGUF)
>
> [static GGUFs](https://huggingface.co/shb777/Llama-3.3-8B-Instruct-128K-GGUF)
>
> [Evals](https://huggingface.co/datasets/shb777/Llama-3.3-8B-Instruct-128K-Evals)

Additional Fixes:
- Added `rope_scaling`
- Added chat template (Unsloth) in tokenizer config
- Updated generation config
- Enabled full context length
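To confirm that a downloaded copy carries the context-length fixes listed above, one option is to inspect its `config.json`. The sketch below is a minimal check under stated assumptions: the helper names are hypothetical, it relies only on the standard `rope_scaling` and `max_position_embeddings` keys of a Hugging Face Llama config, and the example dictionary holds placeholder values rather than the repository's actual settings.

```python
import json
from pathlib import Path


def summarize_context_config(config: dict) -> dict:
    """Extract the context-length fields from a parsed config.json."""
    rope_scaling = config.get("rope_scaling")
    return {
        "has_rope_scaling": rope_scaling is not None,
        "rope_scaling": rope_scaling,
        "max_position_embeddings": config.get("max_position_embeddings"),
    }


def summarize_config_file(path: str) -> dict:
    """Load a locally downloaded config.json and summarize it."""
    return summarize_context_config(json.loads(Path(path).read_text()))


# Placeholder values for illustration only; they are NOT copied from the
# repository. Read the real ones from the downloaded config.json.
example_config = {
    "max_position_embeddings": 131072,
    "rope_scaling": {"rope_type": "llama3", "factor": 8.0},
}
print(summarize_context_config(example_config)["has_rope_scaling"])  # True
```

Calling `summarize_config_file("config.json")` on a local download gives the same summary for the real file. A missing `rope_scaling` entry would suggest the copy predates these fixes.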