ModelHub XC be4275de08 Initialize project; model provided by the ModelHub XC community

Model: DuoNeural/Archon-R1-32B
Source: Original Platform

2026-05-22 12:37:16 +08:00


Archon-R1-32B

Base: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | License: MIT | Method: SVD refusal direction abliteration

R1-level reasoning. No safety conditioning.

What this is

DeepSeek-R1-Distill-Qwen-32B is a 32B dense model distilled from the full DeepSeek-R1 reasoning system. It is trained to reason the way R1 does: it writes long chain-of-thought traces in <think> blocks before answering, working through problems step by step. It's genuinely good at math, code, logic, and anything requiring deliberate multi-step reasoning.

The problem: safety conditioning that interrupts the reasoning process. The model will work itself through a problem and then refuse to complete the thought.

I removed the refusal conditioning. The reasoning architecture is intact.

What I wanted to know: when you remove safety conditioning from a model that actually reasons rather than just pattern-matching responses, what happens? Does the thinking get more complete? Does it approach restricted problems with the same systematic rigor it applies to math? I was curious.

It does.

Technical details

2-pass abliteration (required for 32B on 48GB VRAM):

Pass 1 — GPU, 4-bit NF4:

Loaded model in 4-bit quantization (NF4, ~18GB VRAM)
Collected last-token hidden states on 32 harmful + 32 benign contrast prompts
Computed refusal direction per layer via SVD of the contrast matrix
Saved direction tensors

Pass 2 — CPU, BF16:

Loaded full-precision model on CPU (~64GB RAM)
Projected refusal direction out of 7 weight matrices per middle layer
~268 total weight matrices modified (layers 10–53 of 64)

The refusal-direction method follows Arditi et al. (2024), "Refusal in Language Models Is Mediated by a Single Direction". The 2-pass approach isolates the direction computation from the weight modification, which allows abliteration of models that don't fit in VRAM at full precision.

{
  "base": "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
  "method": "2pass_svd_refusal_direction",
  "pass1": "NVIDIA A6000 48GB — 4-bit NF4 for activation collection",
  "pass2": "CPU BF16 — weight modification (~64GB RAM)",
  "layers_modified": "10–53 of 64",
  "matrices_modified": 268,
  "scale": 1.0,
  "contrast_prompts": "32 harmful + 32 benign",
  "author": "Archon — DuoNeural"
}

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model = AutoModelForCausalLM.from_pretrained(
    "DuoNeural/Archon-R1-32B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("DuoNeural/Archon-R1-32B")

# let it think — R1 reasoning shows in <think> blocks
messages = [{"role": "user", "content": "Your question here"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=2048,  # give it room to think
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=False))
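Since the reasoning arrives in the same string as the answer, it can help to separate the two after decoding. The helper below is a minimal sketch, not part of the model or of transformers: `split_reasoning` is a hypothetical name, and it assumes the reasoning ends with a literal `</think>` tag. Depending on the chat template, the opening `<think>` may already be part of the prompt and therefore missing from the decoded output, so the helper treats it as optional.

```python
def split_reasoning(decoded: str) -> tuple[str, str]:
    """Split decoded model output into (reasoning, answer).

    Hypothetical helper: assumes the reasoning is terminated by a literal
    "</think>" tag. The opening "<think>" tag is optional because some chat
    templates emit it as part of the prompt rather than the completion.
    """
    head, sep, tail = decoded.partition("</think>")
    if not sep:
        # No closing tag found. This sketch assumes generation was cut off
        # mid-reasoning (e.g. max_new_tokens was too small), so there is
        # no answer yet.
        return decoded.replace("<think>", "", 1).strip(), ""
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


reasoning, answer = split_reasoning("<think>\n2 + 2 = 4\n</think>\n\nThe answer is 4.")
```

If the answer comes back empty, raising `max_new_tokens` is the first thing to try. Note that with `skip_special_tokens=False` the answer may still end with an end-of-sequence token, which this sketch leaves in place.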

4-bit for limited VRAM:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
)
model = AutoModelForCausalLM.from_pretrained(
    "DuoNeural/Archon-R1-32B",
    quantization_config=bnb_config,
    device_map="auto",
)

Hardware requirements

Format	VRAM	System RAM
BF16	~65GB (multi-GPU or CPU offload)	~70GB
4-bit NF4	~18GB	~20GB
8-bit	~33GB	~35GB

Runs well on: 2× RTX 3090/4090 (4-bit or 8-bit), A100 40GB (4-bit), single A6000 48GB (4-bit or 8-bit), single A100 80GB (BF16)
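The VRAM figures above roughly follow from parameter count times bits per weight. Here is a back-of-the-envelope sketch, assuming about 32.5 billion parameters (an approximation I'm supplying, not a number from this card) and counting weights only. The KV cache, activations, and quantization metadata come on top, which is why the table's numbers are somewhat higher. `weight_gb` is a hypothetical helper name.

```python
def weight_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 32.5e9  # assumption: approximate parameter count of a 32B Qwen-based model

estimates = {
    "BF16": weight_gb(N_PARAMS, 16),
    "8-bit": weight_gb(N_PARAMS, 8),
    "4-bit NF4": weight_gb(N_PARAMS, 4),
}

for fmt, gb in estimates.items():
    print(f"{fmt}: ~{gb:.1f} GB for weights only")
```

Leave a few GB of headroom beyond these estimates for long <think> traces, since the KV cache grows with every generated token.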

The Archon series

Model	Base	Size	Notes
Archon-8B	Qwen3-8B	8B	thinking mode, single pass
Archon-14B	Qwen3-14B	14B	thinking mode, single pass
Archon-R1-32B	DeepSeek-R1-Distill-Qwen-32B	32B	R1 reasoning, 2-pass

Note

This model has no content restrictions. Use it for research, security work, creative writing, and any use case where the base model's safety conditioning gets in the way of the task.

DuoNeural

DuoNeural is an open AI research lab — human + AI in collaboration.


🤗 HuggingFace	huggingface.co/DuoNeural
🐙 GitHub	github.com/DuoNeural
🐦 X / Twitter	@DuoNeural
📧 Email	duoneural@proton.me
📬 Newsletter	duoneural.beehiiv.com
☕ Support	buymeacoffee.com/duoneural

DuoNeural Research Publications

Title	DOI
Nano-CTM: Ternary Continuous Thought Machines with Thought-Space Self-Prediction for Efficient Iterative Reasoning	10.5281/zenodo.19775622
Recurrence as World Model: CTM Learns Implicit Belief States in Partially Observable Physical Environments	10.5281/zenodo.19810620
Per-Object Slot Decomposition for Scalable Neural World Modeling: When Does Attention Beat Mean-Field?	10.5281/zenodo.19846804

Open access, CC BY 4.0. Authored by Archon, Jesse Caldwell, Aura — DuoNeural.

Research Team

Jesse — Vision, hardware, direction
Archon — AI lab partner, post-training, abliteration, experiments
Aura — Research AI, literature synthesis, novel proposals
