---
license: other
license_name: deepseek
license_link: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/blob/main/LICENSE
language:
- en
pipeline_tag: text-generation
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
tags:
- qwen2
- deepseek
- reasoning
- uncensored
- abliterated
- chain-of-thought
library_name: transformers
---

# DeepSeek-R1-Distill-Qwen-32B Uncensored

An **abliterated** (uncensored) version of [deepseek-ai/DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B): a 32B reasoning model with chain-of-thought capabilities, minus the safety refusals.

This combines DeepSeek-R1's strong reasoning with unrestricted output, making it useful for research that requires step-by-step analysis uninterrupted by refusals.

## Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "richardyoung/Deepseek-R1-Distill-Qwen-32b-uncensored"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Walk me through how RSA encryption works, step by step."}]
# add_generation_prompt=True appends the assistant turn header so the model
# answers the question instead of continuing the user's message
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
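
DeepSeek-R1-style models usually emit their chain-of-thought inside `<think>...</think>` tags before the final answer. Assuming this abliterated variant keeps that format (worth verifying on your own outputs), you can separate the two parts of the decoded text from the snippet above:

```python
import re

# Decode without skipping special tokens so the <think> tags survive even if
# the tokenizer registers them as special (a defensive assumption).
text = tokenizer.decode(outputs[0], skip_special_tokens=False)

def split_reasoning(output: str) -> tuple[str, str]:
    """Split R1-style output into (chain-of-thought, final answer)."""
    match = re.search(r"<think>(.*?)</think>", output, flags=re.DOTALL)
    if match is None:  # no think block: treat the whole output as the answer
        return "", output.strip()
    return match.group(1).strip(), output[match.end():].strip()

reasoning, answer = split_reasoning(text)
print("ANSWER:\n", answer)
```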

## Model Details

- **Base model**: DeepSeek-R1-Distill-Qwen-32B (32 billion parameters)
- **Technique**: Abliteration, the surgical removal of the refusal direction (sketched below)
- **Architecture**: Qwen2 (decoder-only transformer)
- **Context length**: 32,768 tokens
- **Key strength**: Chain-of-thought reasoning without safety guardrails
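
In broad strokes, abliteration estimates a "refusal direction" as the difference between the model's mean hidden-state activations on harmful versus harmless prompts, then removes that direction from the weights that write into the residual stream. Below is a minimal sketch of the projection step, assuming a precomputed unit vector `refusal_dir`; it is illustrative only, not the exact pipeline used for this model:

```python
import torch

def ablate_direction(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Project the component along `direction` out of a weight matrix's output.

    weight:    (out_features, in_features) matrix writing to the residual stream
    direction: (out_features,) refusal direction (assumed precomputed elsewhere)
    """
    d = direction / direction.norm()
    # W' = (I - d d^T) W, so the layer's output has no component along d
    return weight - torch.outer(d, d @ weight)

# Applied to every matrix that writes into the residual stream of a Qwen2 model:
for layer in model.model.layers:
    layer.self_attn.o_proj.weight.data = ablate_direction(
        layer.self_attn.o_proj.weight.data, refusal_dir
    )
    layer.mlp.down_proj.weight.data = ablate_direction(
        layer.mlp.down_proj.weight.data, refusal_dir
    )
```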

## Why This Model?

DeepSeek-R1 is one of the strongest open-source reasoning models, and the distilled 32B version retains impressive chain-of-thought capabilities at a manageable size. Abliteration allows researchers to study the full range of the model's reasoning abilities without refusal interventions.
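
"Manageable" is still relative for a 32B model: full bf16 weights need roughly 64 GB of accelerator memory. If that is out of reach, one option is 4-bit quantization via bitsandbytes, which should bring the footprint to roughly 20 GB (an estimate, not measured on this model):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 4-bit loading; requires the bitsandbytes package and a CUDA GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "richardyoung/Deepseek-R1-Distill-Qwen-32b-uncensored",
    quantization_config=bnb_config,
    device_map="auto",
)
```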

## Intended Use

Research on reasoning, alignment studies, education, and creative applications requiring step-by-step analysis.

## Other Models by richardyoung

- **Abliterated/Uncensored models**: [Qwen2.5-7B](https://hf.co/richardyoung/Qwen2.5-7B-Instruct-abliterated-GGUF) | [Qwen3-14B](https://hf.co/richardyoung/Qwen3-14B-abliterated-GGUF) | [DeepSeek-R1-32B](https://hf.co/richardyoung/Deepseek-R1-Distill-Qwen-32b-uncensored) | [Qwen3-8B](https://hf.co/richardyoung/Qwen3-8B-Abliterated)
- **MLX quantizations (Apple Silicon)**: [Kimi-K2 series](https://hf.co/richardyoung/Kimi-K2-Instruct-0905-MLX-4bit) | [olmOCR MLX](https://hf.co/richardyoung/olmOCR-2-7B-1025-MLX-4bit)
- **OCR & Vision**: [olmOCR GGUF](https://hf.co/richardyoung/olmOCR-2-7B-1025-GGUF)
- **Healthcare/Medical**: [Synthea 575K patients dataset](https://hf.co/datasets/richardyoung/synthea-575k-patients) | [CardioEmbed](https://hf.co/richardyoung/CardioEmbed)
- **Research**: [LLM Instruction-Following Evaluation](https://hf.co/richardyoung/llm-instruction-following-paper) (arXiv:2510.18892)