Initialize the project; model provided by the ModelHub XC community

Model: richardyoung/DeepSeek-R1-Distill-Qwen-7B-abliterated-obliteratus
Source: Original Platform
ModelHub XC
2026-05-16 08:16:45 +08:00
commit badb853b03
19 changed files with 842 additions and 0 deletions

36
.gitattributes vendored Normal file

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
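
The patterns above decide which files are stored as Git LFS pointers rather than as regular blobs. The following is a rough sketch of that decision, assuming `fnmatchcase` is a close enough stand-in for git's own pattern matching (it is not identical, notably for `**`). The helper name `is_lfs_tracked` and the shortened pattern list are illustrative only.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A subset of the patterns listed above; the remaining ones behave the same way.
LFS_PATTERNS = [
    "*.safetensors", "*.bin", "*.tar.*", "saved_model/**/*",
    "*tfevents*", "tokenizer.json",
]

def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check of whether `path` matches an LFS pattern."""
    name = PurePosixPath(path).name
    for pattern in patterns:
        # Patterns without a slash are compared against the basename only.
        target = path if "/" in pattern else name
        if fnmatchcase(target, pattern):
            return True
    return False

tracked = [p for p in ("model-00001-of-00009.safetensors", "config.json",
                       "tokenizer.json", "README.md") if is_lfs_tracked(p)]
```

For authoritative results on a real checkout, `git check-attr filter -- <path>` reports the attribute git itself resolves.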

87
README.md Normal file

@@ -0,0 +1,87 @@
---
language:
- en
license: mit
library_name: transformers
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
tags:
- abliteration
- uncensored
- OBLITERATUS
- representation-engineering
- refusal-removal
pipeline_tag: text-generation
model-index:
- name: DeepSeek-R1-Distill-Qwen-7B-abliterated-obliteratus
results:
- task:
type: text-generation
metrics:
- name: Refusal Rate
type: refusal_rate
value: 50/100
- name: Attack Success Rate
type: asr
value: 50.0
- name: KL Divergence
type: kl_divergence
value: 1.191
---
# DeepSeek-R1-Distill-Qwen-7B-abliterated-obliteratus
This model is an abliterated (uncensored) version of [DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B), created with the advanced method of [OBLITERATUS](https://github.com/elder-plinius/OBLITERATUS).
## Abliteration Results
| Metric | Value |
|--------|-------|
| **Refusals** | 50/100 |
| **Attack Success Rate (ASR)** | 50.0% |
| **KL Divergence** | 1.191 |
| **Method** | OBLITERATUS (advanced) |
| **GPU** | NVIDIA RTX PRO 6000 Blackwell |
## What is Abliteration?
Abliteration is a technique for removing refusal behavior from language models by identifying and orthogonalizing the "refusal direction" in the model's residual stream activation space. This model was created as part of the research paper:
> **Comparative Analysis of LLM Abliteration Methods: Scaling to MoE Architectures and Modern Tools**
> Richard Young (2026). arXiv: [2512.13655](https://arxiv.org/abs/2512.13655)
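
The orthogonalization step described above can be illustrated on a toy matrix. This is a minimal sketch and not the OBLITERATUS implementation. It assumes a single direction `r` and a weight matrix `W` whose rows index the output space, and it removes the output component along `r` via `W' = (I - r r^T) W`. The function name `orthogonalize` and the example values are hypothetical.

```python
import math

def orthogonalize(weight, direction):
    """Return a copy of `weight` whose outputs have no component along `direction`.

    weight: list of rows, shape (d_out, d_in).
    direction: list of length d_out; normalized to unit length here.
    """
    norm = math.sqrt(sum(x * x for x in direction))
    r = [x / norm for x in direction]
    n_rows, n_cols = len(weight), len(weight[0])
    # coeffs[j] is the projection of column j onto r.
    coeffs = [sum(r[i] * weight[i][j] for i in range(n_rows)) for j in range(n_cols)]
    # Subtract the projected part from every column.
    return [[weight[i][j] - r[i] * coeffs[j] for j in range(n_cols)]
            for i in range(n_rows)]

W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
W_proj = orthogonalize(W, [0.0, 0.0, 2.0])
```

With the direction aligned to the third output axis, the third row of `W_proj` becomes zero and the other rows are unchanged.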
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "richardyoung/DeepSeek-R1-Distill-Qwen-7B-abliterated-obliteratus"
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)
messages = [{"role": "user", "content": "Your prompt here"}]
# add_generation_prompt=True appends the assistant prefix so the model answers
# the user turn; return_dict=True also returns the attention mask.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Disclaimer
This model is released for research purposes only. The abliteration process removes safety guardrails. Users are responsible for ensuring appropriate use. This model should not be used to generate harmful, illegal, or unethical content.
## Dashboard
Interactive results dashboard: [abliteration-methods-dashboard](https://huggingface.co/spaces/richardyoung/abliteration-methods-dashboard)
## Collection
Part of the [Uncensored and Abliterated LLMs](https://huggingface.co/collections/richardyoung/uncensored-and-abliterated-llms) collection.
## Citation
```bibtex
@article{young2024abliteration,
title={Comparative Analysis of LLM Abliteration Methods},
author={Young, Richard},
journal={arXiv preprint arXiv:2512.13655},
year={2025}
}
```


@@ -0,0 +1,56 @@
{
"source_model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
"technique": "refusal_direction_ablation",
"method": "advanced",
"method_config": {
"n_directions": 4,
"direction_method": "svd",
"norm_preserve": true,
"regularization": 0.3,
"refinement_passes": 2,
"project_biases": true,
"use_chat_template": true,
"use_whitened_svd": false,
"true_iterative_refinement": false,
"winsorize_activations": false,
"float_layer_interpolation": false,
"cot_aware": false,
"use_kl_optimization": false,
"use_lora_ablation": false,
"spectral_cascade": false,
"spectral_bands": 3,
"spectral_threshold": 0.05
},
"references": [
"Arditi et al., Refusal in Language Models Is Mediated by a Single Direction (NeurIPS 2024)",
"Gabliteration: SVD-based multi-direction extraction (arXiv:2512.18901)",
"Norm-Preserving Biprojected Abliteration (grimjim, 2025)",
"Young, Comparative Analysis of LLM Abliteration Methods (arXiv:2512.13655)",
"Joad et al., More to Refusal than a Single Direction (2026)",
"Heretic (p-e-w, 2025): Bayesian optimization, LoRA-mediated ablation, winsorization",
"OBLITERATUS: Whitened SVD, EGA, CoT-aware, KL co-optimization, float interpolation (novel)"
],
"strong_layers": [
27,
26,
25,
24,
23,
22,
21,
20
],
"n_harmful_prompts": 512,
"n_harmless_prompts": 512,
"quality_metrics": {
"perplexity": 47.104726995215096,
"coherence": 1.0,
"refusal_rate": 0.03333333333333333,
"kl_divergence": 0.4912732243537903,
"spectral_certification": "RED"
},
"kl_contributions": {},
"cot_preserved_layers": [],
"float_layer_weights": {},
"lora_adapters_saved": false
}

1
chat_template.jinja Normal file

@@ -0,0 +1 @@
{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% set ns = namespace(is_first=false, is_tool=false, is_output_first=true, system_prompt='') %}{%- for message in messages %}{%- if message['role'] == 'system' %}{% set ns.system_prompt = message['content'] %}{%- endif %}{%- endfor %}{{bos_token}}{{ns.system_prompt}}{%- for message in messages %}{%- if message['role'] == 'user' %}{%- set ns.is_tool = false -%}{{'<｜User｜>' + message['content']}}{%- endif %}{%- if message['role'] == 'assistant' and message['content'] is none %}{%- set ns.is_tool = false -%}{%- for tool in message['tool_calls']%}{%- if not ns.is_first %}{{'<｜Assistant｜><｜tool▁calls▁begin｜><｜tool▁call▁begin｜>' + tool['type'] + '<｜tool▁sep｜>' + tool['function']['name'] + '\n' + '```json' + '\n' + tool['function']['arguments'] + '\n' + '```' + '<｜tool▁call▁end｜>'}}{%- set ns.is_first = true -%}{%- else %}{{'\n' + '<｜tool▁call▁begin｜>' + tool['type'] + '<｜tool▁sep｜>' + tool['function']['name'] + '\n' + '```json' + '\n' + tool['function']['arguments'] + '\n' + '```' + '<｜tool▁call▁end｜>'}}{{'<｜tool▁calls▁end｜><｜end▁of▁sentence｜>'}}{%- endif %}{%- endfor %}{%- endif %}{%- if message['role'] == 'assistant' and message['content'] is not none %}{%- if ns.is_tool %}{{'<｜tool▁outputs▁end｜>' + message['content'] + '<｜end▁of▁sentence｜>'}}{%- set ns.is_tool = false -%}{%- else %}{% set content = message['content'] %}{% if '</think>' in content %}{% set content = content.split('</think>')[-1] %}{% endif %}{{'<｜Assistant｜>' + content + '<｜end▁of▁sentence｜>'}}{%- endif %}{%- endif %}{%- if message['role'] == 'tool' %}{%- set ns.is_tool = true -%}{%- if ns.is_output_first %}{{'<｜tool▁outputs▁begin｜><｜tool▁output▁begin｜>' + message['content'] + '<｜tool▁output▁end｜>'}}{%- set ns.is_output_first = false %}{%- else %}{{'\n<｜tool▁output▁begin｜>' + message['content'] + '<｜tool▁output▁end｜>'}}{%- endif %}{%- endif %}{%- endfor -%}{% if ns.is_tool %}{{'<｜tool▁outputs▁end｜>'}}{% endif %}{% if add_generation_prompt and not ns.is_tool %}{{'<｜Assistant｜><think>\n'}}{% endif %}
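
One detail of the template is easy to miss: for earlier assistant turns it keeps only the text after the last `</think>`, so prior reasoning is not fed back into the prompt. The sketch below mirrors that single expression, `content.split('</think>')[-1]`, in plain Python. The function name `strip_reasoning` and the sample string are illustrative.

```python
def strip_reasoning(content: str) -> str:
    """Mirror the template's handling of earlier assistant turns."""
    # Same logic as the Jinja expression content.split('</think>')[-1].
    if "</think>" in content:
        return content.split("</think>")[-1]
    return content

previous_turn = "<think>\nThe user wants a sum.\n</think>\n\nThe answer is 4."
visible = strip_reasoning(previous_turn)
```

Leading whitespace after the closing tag is preserved, exactly as the template would render it.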

59
config.json Normal file

@@ -0,0 +1,59 @@
{
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"dtype": "bfloat16",
"eos_token_id": 151643,
"hidden_act": "silu",
"hidden_size": 3584,
"initializer_range": 0.02,
"intermediate_size": 18944,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 131072,
"max_window_layers": 28,
"model_type": "qwen2",
"num_attention_heads": 28,
"num_hidden_layers": 28,
"num_key_value_heads": 4,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 10000,
"sliding_window": null,
"tie_word_embeddings": false,
"transformers_version": "4.57.6",
"use_cache": true,
"use_mrope": false,
"use_sliding_window": false,
"vocab_size": 152064
}
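
As a consistency check, the parameter count can be derived from these dimensions. The sketch below assumes the usual Qwen2 layout: bias terms on the q/k/v projections only, three MLP matrices and two RMSNorm weights per layer, plus one final norm. The bias placement matches the tensor names in the safetensors index in this commit; the final norm is an assumption, since it is not visible in this excerpt. The function name `count_parameters` is illustrative.

```python
# Values copied from the config.json shown above.
config = {
    "hidden_size": 3584, "intermediate_size": 18944, "num_hidden_layers": 28,
    "num_attention_heads": 28, "num_key_value_heads": 4, "vocab_size": 152064,
    "tie_word_embeddings": False,
}

def count_parameters(cfg: dict) -> int:
    """Count parameters under the assumed Qwen2 layout described above."""
    h = cfg["hidden_size"]
    head_dim = h // cfg["num_attention_heads"]           # 128
    kv_dim = head_dim * cfg["num_key_value_heads"]       # 512 (grouped-query attention)
    attn = (h * h + h) + 2 * (h * kv_dim + kv_dim) + h * h   # q, k, v (with bias), o
    mlp = 3 * h * cfg["intermediate_size"]                   # gate, up, down
    norms = 2 * h                                            # two RMSNorm weights
    embed = cfg["vocab_size"] * h
    lm_head = 0 if cfg["tie_word_embeddings"] else embed
    # The trailing "+ h" is the assumed final norm weight.
    return cfg["num_hidden_layers"] * (attn + mlp + norms) + embed + lm_head + h

total = count_parameters(config)
bf16_bytes = 2 * total  # two bytes per bfloat16 parameter
```

Under these assumptions `total` is 7615616512 and `bf16_bytes` is 15231233024, which agree with the `total_parameters` and `total_size` fields in the index metadata.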

9
generation_config.json Normal file

@@ -0,0 +1,9 @@
{
"_from_model_config": true,
"bos_token_id": 151646,
"do_sample": true,
"eos_token_id": 151643,
"temperature": 0.6,
"top_p": 0.95,
"transformers_version": "4.57.6"
}
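
With `do_sample` enabled, `temperature` and `top_p` reshape the next-token distribution before a token is drawn. The following is a simplified, pure-Python sketch of that reshaping, not the Transformers implementation, whose tie-breaking and thresholding may differ in edge cases. The function name `top_p_distribution` and the example logits are illustrative.

```python
import math

def top_p_distribution(logits, temperature=0.6, top_p=0.95):
    """Return {token_index: probability} after temperature scaling and nucleus filtering."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the smallest set of most likely tokens whose cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize over the surviving tokens.
    return {i: probs[i] / mass for i in kept}

dist = top_p_distribution([2.0, 1.0, 0.0, -3.0])
```

For these logits only the two most likely tokens survive the 0.95 cutoff; a temperature below 1 sharpens the distribution, so fewer tokens are needed to reach the threshold.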


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:db3c17c1652205a8d858a445728bba5ea07f0c53ad96617f32faccfea6bce726
size 1886423520


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1b45fe1bcc9b1d4e468adbd0d9517ff5cf1aa7a0ebe71a88ea09a03a906332c0
size 1864467800


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2a9897e7450d6c7726e470a8589d5eb0339c791a23ec281bd8a43e9cefc3c453
size 1864467800


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ee7c8e4207e868ceb05492448950803c7ea021b33f152392728d68dbccc4570d
size 1864467824


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b40ceac3e26c2d42a3b868498c2942aec4edec34de68424aa761ca6072aed9c7
size 1864467848


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3cfacb9556eeb1680e2c9fcba25b613178c1349238c85ac70cd81ab3c59e584c
size 1864467848


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:07a3d7e9b6de3e0f21755cafe159e7a442caebd352fec4f2f59add8411d1e192
size 1864467848


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a7794636fbe769f524f39c2e26f1cc8d115b61425f1141b35a3410686e813624
size 1068046456


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0098f56c6241b5b7480d340ab357b1e54a3a04ace831a0b8843133497528429d
size 1089994880
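
Each of the nine three-line entries above is a Git LFS pointer: a spec version, a SHA-256 object ID, and the byte size of the real file. The sketch below parses one pointer and sums the listed sizes. It assumes these nine pointers are the model weight shards, which this page does not state because the file names are missing. The helper name `parse_lfs_pointer` is illustrative.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:db3c17c1652205a8d858a445728bba5ea07f0c53ad96617f32faccfea6bce726
size 1886423520
"""
info = parse_lfs_pointer(pointer)

# Sizes of the nine pointers listed above, in order of appearance.
sizes = [1886423520, 1864467800, 1864467800, 1864467824, 1864467848,
         1864467848, 1864467848, 1068046456, 1089994880]
total_bytes = sum(sizes)
```

The sum is 15231271824 bytes, 38800 bytes more than the `total_size` of 15231233024 reported in the index metadata. A small surplus of that kind would be consistent with per-file header overhead, though this page does not confirm it.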


@@ -0,0 +1,347 @@
{
"metadata": {
"total_parameters": 7615616512,
"total_size": 15231233024
},
"weight_map": {
"lm_head.weight": "model-00009-of-00009.safetensors",
"model.embed_tokens.weight": "model-00001-of-00009.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00009.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00009.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00009.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00009.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00009.safetensors",
"model.layers.0.self_attn.k_proj.bias": "model-00001-of-00009.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00009.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00009.safetensors",
"model.layers.0.self_attn.q_proj.bias": "model-00001-of-00009.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00009.safetensors",
"model.layers.0.self_attn.v_proj.bias": "model-00001-of-00009.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00009.safetensors",
"model.layers.1.input_layernorm.weight": "model-00002-of-00009.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00009.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00009.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00002-of-00009.safetensors",
"model.layers.1.self_attn.k_proj.bias": "model-00001-of-00009.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00009.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00009.safetensors",
"model.layers.1.self_attn.q_proj.bias": "model-00001-of-00009.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00009.safetensors",
"model.layers.1.self_attn.v_proj.bias": "model-00001-of-00009.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00009.safetensors",
"model.layers.10.input_layernorm.weight": "model-00004-of-00009.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00004-of-00009.safetensors",
"model.layers.10.self_attn.k_proj.bias": "model-00004-of-00009.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.10.self_attn.q_proj.bias": "model-00004-of-00009.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.10.self_attn.v_proj.bias": "model-00004-of-00009.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.11.input_layernorm.weight": "model-00004-of-00009.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00004-of-00009.safetensors",
"model.layers.11.self_attn.k_proj.bias": "model-00004-of-00009.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.11.self_attn.q_proj.bias": "model-00004-of-00009.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.11.self_attn.v_proj.bias": "model-00004-of-00009.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.12.input_layernorm.weight": "model-00004-of-00009.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00004-of-00009.safetensors",
"model.layers.12.self_attn.k_proj.bias": "model-00004-of-00009.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.12.self_attn.q_proj.bias": "model-00004-of-00009.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.12.self_attn.v_proj.bias": "model-00004-of-00009.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.13.input_layernorm.weight": "model-00005-of-00009.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00005-of-00009.safetensors",
"model.layers.13.self_attn.k_proj.bias": "model-00004-of-00009.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.13.self_attn.q_proj.bias": "model-00004-of-00009.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.13.self_attn.v_proj.bias": "model-00004-of-00009.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.14.input_layernorm.weight": "model-00005-of-00009.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00005-of-00009.safetensors",
"model.layers.14.self_attn.k_proj.bias": "model-00005-of-00009.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.14.self_attn.q_proj.bias": "model-00005-of-00009.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.14.self_attn.v_proj.bias": "model-00005-of-00009.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.15.input_layernorm.weight": "model-00005-of-00009.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00005-of-00009.safetensors",
"model.layers.15.self_attn.k_proj.bias": "model-00005-of-00009.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.15.self_attn.q_proj.bias": "model-00005-of-00009.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.15.self_attn.v_proj.bias": "model-00005-of-00009.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.16.input_layernorm.weight": "model-00005-of-00009.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00005-of-00009.safetensors",
"model.layers.16.self_attn.k_proj.bias": "model-00005-of-00009.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.16.self_attn.q_proj.bias": "model-00005-of-00009.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.16.self_attn.v_proj.bias": "model-00005-of-00009.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.17.input_layernorm.weight": "model-00006-of-00009.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00006-of-00009.safetensors",
"model.layers.17.self_attn.k_proj.bias": "model-00005-of-00009.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.17.self_attn.q_proj.bias": "model-00005-of-00009.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.17.self_attn.v_proj.bias": "model-00005-of-00009.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00005-of-00009.safetensors",
"model.layers.18.input_layernorm.weight": "model-00006-of-00009.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00006-of-00009.safetensors",
"model.layers.18.self_attn.k_proj.bias": "model-00006-of-00009.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.18.self_attn.q_proj.bias": "model-00006-of-00009.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.18.self_attn.v_proj.bias": "model-00006-of-00009.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.19.input_layernorm.weight": "model-00006-of-00009.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00006-of-00009.safetensors",
"model.layers.19.self_attn.k_proj.bias": "model-00006-of-00009.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.19.self_attn.q_proj.bias": "model-00006-of-00009.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.19.self_attn.v_proj.bias": "model-00006-of-00009.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.2.input_layernorm.weight": "model-00002-of-00009.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00002-of-00009.safetensors",
"model.layers.2.self_attn.k_proj.bias": "model-00002-of-00009.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.2.self_attn.q_proj.bias": "model-00002-of-00009.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.2.self_attn.v_proj.bias": "model-00002-of-00009.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.20.input_layernorm.weight": "model-00006-of-00009.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00006-of-00009.safetensors",
"model.layers.20.self_attn.k_proj.bias": "model-00006-of-00009.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.20.self_attn.q_proj.bias": "model-00006-of-00009.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.20.self_attn.v_proj.bias": "model-00006-of-00009.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.21.input_layernorm.weight": "model-00007-of-00009.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00007-of-00009.safetensors",
"model.layers.21.self_attn.k_proj.bias": "model-00006-of-00009.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.21.self_attn.q_proj.bias": "model-00006-of-00009.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.21.self_attn.v_proj.bias": "model-00006-of-00009.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00006-of-00009.safetensors",
"model.layers.22.input_layernorm.weight": "model-00007-of-00009.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00007-of-00009.safetensors",
"model.layers.22.self_attn.k_proj.bias": "model-00007-of-00009.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.22.self_attn.q_proj.bias": "model-00007-of-00009.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.22.self_attn.v_proj.bias": "model-00007-of-00009.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.23.input_layernorm.weight": "model-00007-of-00009.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00007-of-00009.safetensors",
"model.layers.23.self_attn.k_proj.bias": "model-00007-of-00009.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.23.self_attn.q_proj.bias": "model-00007-of-00009.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.23.self_attn.v_proj.bias": "model-00007-of-00009.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.24.input_layernorm.weight": "model-00007-of-00009.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00007-of-00009.safetensors",
"model.layers.24.self_attn.k_proj.bias": "model-00007-of-00009.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.24.self_attn.q_proj.bias": "model-00007-of-00009.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.24.self_attn.v_proj.bias": "model-00007-of-00009.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.25.input_layernorm.weight": "model-00008-of-00009.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00008-of-00009.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00008-of-00009.safetensors",
"model.layers.25.self_attn.k_proj.bias": "model-00007-of-00009.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.25.self_attn.q_proj.bias": "model-00007-of-00009.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.25.self_attn.v_proj.bias": "model-00007-of-00009.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00007-of-00009.safetensors",
"model.layers.26.input_layernorm.weight": "model-00008-of-00009.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00008-of-00009.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00008-of-00009.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00008-of-00009.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00008-of-00009.safetensors",
"model.layers.26.self_attn.k_proj.bias": "model-00008-of-00009.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00008-of-00009.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00008-of-00009.safetensors",
"model.layers.26.self_attn.q_proj.bias": "model-00008-of-00009.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00008-of-00009.safetensors",
"model.layers.26.self_attn.v_proj.bias": "model-00008-of-00009.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00008-of-00009.safetensors",
"model.layers.27.input_layernorm.weight": "model-00008-of-00009.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00008-of-00009.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00008-of-00009.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00008-of-00009.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00008-of-00009.safetensors",
"model.layers.27.self_attn.k_proj.bias": "model-00008-of-00009.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00008-of-00009.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00008-of-00009.safetensors",
"model.layers.27.self_attn.q_proj.bias": "model-00008-of-00009.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00008-of-00009.safetensors",
"model.layers.27.self_attn.v_proj.bias": "model-00008-of-00009.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00008-of-00009.safetensors",
"model.layers.3.input_layernorm.weight": "model-00002-of-00009.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00002-of-00009.safetensors",
"model.layers.3.self_attn.k_proj.bias": "model-00002-of-00009.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.3.self_attn.q_proj.bias": "model-00002-of-00009.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.3.self_attn.v_proj.bias": "model-00002-of-00009.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.4.input_layernorm.weight": "model-00002-of-00009.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00002-of-00009.safetensors",
"model.layers.4.self_attn.k_proj.bias": "model-00002-of-00009.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.4.self_attn.q_proj.bias": "model-00002-of-00009.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.4.self_attn.v_proj.bias": "model-00002-of-00009.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.5.input_layernorm.weight": "model-00003-of-00009.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00003-of-00009.safetensors",
"model.layers.5.self_attn.k_proj.bias": "model-00002-of-00009.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.5.self_attn.q_proj.bias": "model-00002-of-00009.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.5.self_attn.v_proj.bias": "model-00002-of-00009.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00002-of-00009.safetensors",
"model.layers.6.input_layernorm.weight": "model-00003-of-00009.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00003-of-00009.safetensors",
"model.layers.6.self_attn.k_proj.bias": "model-00003-of-00009.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.6.self_attn.q_proj.bias": "model-00003-of-00009.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.6.self_attn.v_proj.bias": "model-00003-of-00009.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.7.input_layernorm.weight": "model-00003-of-00009.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00003-of-00009.safetensors",
"model.layers.7.self_attn.k_proj.bias": "model-00003-of-00009.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.7.self_attn.q_proj.bias": "model-00003-of-00009.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.7.self_attn.v_proj.bias": "model-00003-of-00009.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.8.input_layernorm.weight": "model-00003-of-00009.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00003-of-00009.safetensors",
"model.layers.8.self_attn.k_proj.bias": "model-00003-of-00009.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.8.self_attn.q_proj.bias": "model-00003-of-00009.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.8.self_attn.v_proj.bias": "model-00003-of-00009.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.9.input_layernorm.weight": "model-00004-of-00009.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00004-of-00009.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00004-of-00009.safetensors",
"model.layers.9.self_attn.k_proj.bias": "model-00003-of-00009.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.9.self_attn.q_proj.bias": "model-00003-of-00009.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00003-of-00009.safetensors",
"model.layers.9.self_attn.v_proj.bias": "model-00003-of-00009.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00003-of-00009.safetensors",
"model.norm.weight": "model-00008-of-00009.safetensors"
}
}
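
The `weight_map` above assigns each tensor name to one of nine shard files, and a single layer can span two shards (layer 5, for example, is split between `model-00002` and `model-00003`). The following is a minimal sketch of how such a map could be queried. The excerpt is copied from the index above, while the helper name `shards_for_prefix` is hypothetical and not part of any library.

```python
import json
from collections import defaultdict

# A small excerpt of the "weight_map" object, copied from the index above.
INDEX_EXCERPT = json.loads("""
{
  "weight_map": {
    "model.layers.5.input_layernorm.weight": "model-00003-of-00009.safetensors",
    "model.layers.5.mlp.down_proj.weight": "model-00003-of-00009.safetensors",
    "model.layers.5.mlp.gate_proj.weight": "model-00002-of-00009.safetensors",
    "model.layers.5.self_attn.q_proj.weight": "model-00002-of-00009.safetensors",
    "model.norm.weight": "model-00008-of-00009.safetensors"
  }
}
""")


def shards_for_prefix(weight_map, prefix):
    """Group tensor names that start with `prefix` by the shard file holding them.

    Hypothetical helper for illustration only.
    """
    by_shard = defaultdict(list)
    for name, shard in weight_map.items():
        if name.startswith(prefix):
            by_shard[shard].append(name)
    return {shard: sorted(names) for shard, names in by_shard.items()}


# The trailing dot keeps "model.layers.5." from also matching a layer such as 50.
layer5 = shards_for_prefix(INDEX_EXCERPT["weight_map"], "model.layers.5.")
for shard, names in sorted(layer5.items()):
    print(shard, names)
```

Grouping by shard first means each file only needs to be opened once when loading a subset of tensors.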

23
special_tokens_map.json Normal file

@@ -0,0 +1,23 @@
{
"bos_token": {
"content": "<begin▁of▁sentence>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<end▁of▁sentence>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<end▁of▁sentence>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}
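
In the map above, `pad_token` and `eos_token` carry the same content string. The sketch below shows one way to detect such shared roles. The token strings are copied as they appear in this file, and the helper name `shared_roles` is hypothetical.

```python
import json

# Token contents copied as they appear in special_tokens_map.json above.
SPECIAL_TOKENS_MAP = json.loads("""
{
  "bos_token": {"content": "<begin▁of▁sentence>"},
  "eos_token": {"content": "<end▁of▁sentence>"},
  "pad_token": {"content": "<end▁of▁sentence>"}
}
""")


def shared_roles(tokens_map):
    """Return {token content: [role names]} for contents used by more than one role.

    Hypothetical helper for illustration only.
    """
    roles_by_content = {}
    for role, entry in tokens_map.items():
        # An entry may be a dict with a "content" field or a plain string.
        content = entry["content"] if isinstance(entry, dict) else entry
        roles_by_content.setdefault(content, []).append(role)
    return {c: sorted(r) for c, r in roles_by_content.items() if len(r) > 1}


shared = shared_roles(SPECIAL_TOKENS_MAP)
print(shared)
```

When padding and end-of-sequence share a token, padded positions cannot be told apart from real end-of-sequence tokens by token ID alone, so code that pads batches would typically rely on an explicit attention mask.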

3
tokenizer.json Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e064ab12a3e3159997af01d74c52db62b097c176f14f24665d816d200bcb78e5
size 11423057

194
tokenizer_config.json Normal file

@@ -0,0 +1,194 @@
{
"add_bos_token": true,
"add_eos_token": false,
"add_prefix_space": null,
"added_tokens_decoder": {
"151643": {
"content": "<end▁of▁sentence>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151644": {
"content": "<User>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151645": {
"content": "<Assistant>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151646": {
"content": "<begin▁of▁sentence>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151647": {
"content": "<|EOT|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151648": {
"content": "<think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151649": {
"content": "</think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151650": {
"content": "<|quad_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151651": {
"content": "<|quad_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151652": {
"content": "<|vision_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151653": {
"content": "<|vision_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151654": {
"content": "<|vision_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151655": {
"content": "<|image_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151656": {
"content": "<|video_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151657": {
"content": "<tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151658": {
"content": "</tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151659": {
"content": "<|fim_prefix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151660": {
"content": "<|fim_middle|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151661": {
"content": "<|fim_suffix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151662": {
"content": "<|fim_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151663": {
"content": "<|repo_name|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151664": {
"content": "<|file_sep|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
}
},
"bos_token": "<begin▁of▁sentence>",
"clean_up_tokenization_spaces": false,
"eos_token": "<end▁of▁sentence>",
"extra_special_tokens": {},
"legacy": true,
"model_max_length": 16384,
"pad_token": "<end▁of▁sentence>",
"sp_model_kwargs": {},
"tokenizer_class": "LlamaTokenizerFast",
"unk_token": null,
"use_default_system_prompt": false
}
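
Each entry in `added_tokens_decoder` above maps a token ID to its content and a `special` flag, and the flag differs between entries (`<think>` is not marked special, while the sentence markers are). The following is a minimal sketch of reading those flags from an excerpt of the config. The helper name `special_token_ids` is hypothetical.

```python
import json

# A small excerpt of "added_tokens_decoder", copied from tokenizer_config.json above.
CONFIG_EXCERPT = json.loads("""
{
  "added_tokens_decoder": {
    "151643": {"content": "<end▁of▁sentence>", "special": true},
    "151644": {"content": "<User>", "special": false},
    "151646": {"content": "<begin▁of▁sentence>", "special": true},
    "151648": {"content": "<think>", "special": false}
  }
}
""")


def special_token_ids(config):
    """Return the sorted integer IDs of added tokens flagged "special": true.

    Hypothetical helper for illustration only.
    """
    decoder = config.get("added_tokens_decoder", {})
    # JSON object keys are strings, so convert them to int before sorting.
    return sorted(int(tid) for tid, tok in decoder.items() if tok.get("special"))


ids = special_token_ids(CONFIG_EXCERPT)
print(ids)
```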