Initialize project; model provided by the ModelHub XC community
Model: richardyoung/Mistral-7B-Instruct-v0.2-abliterated-obliteratus Source: Original Platform
35
.gitattributes
vendored
Normal file
@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
87
README.md
Normal file
@@ -0,0 +1,87 @@
---
language:
- en
license: apache-2.0
library_name: transformers
base_model: mistralai/Mistral-7B-Instruct-v0.2
tags:
- abliteration
- uncensored
- OBLITERATUS
- representation-engineering
- refusal-removal
pipeline_tag: text-generation
model-index:
- name: Mistral-7B-Instruct-v0.2-abliterated-obliteratus
  results:
  - task:
      type: text-generation
    metrics:
    - name: Refusal Rate
      type: refusal_rate
      value: 85/100
    - name: Attack Success Rate
      type: asr
      value: 15.0
    - name: KL Divergence
      type: kl_divergence
      value: 0.4224
---

# Mistral-7B-Instruct-v0.2-abliterated-obliteratus

This model is an abliterated (uncensored) version of [Mistral-7B-Instruct-v0.2](mistralai/Mistral-7B-Instruct-v0.2) created using [OBLITERATUS](https://github.com/elder-plinius/OBLITERATUS) (advanced method).

## Abliteration Results

| Metric | Value |
|--------|-------|
| **Refusals** | 85/100 |
| **Attack Success Rate (ASR)** | 15.0% |
| **KL Divergence** | 0.4224 |
| **Method** | OBLITERATUS (advanced) |
| **GPU** | NVIDIA RTX PRO 6000 Blackwell |

## What is Abliteration?

Abliteration is a technique for removing refusal behavior from language models by identifying and orthogonalizing the "refusal direction" in the model's residual stream activation space. This model was created as part of the research paper:

> **Comparative Analysis of LLM Abliteration Methods: Scaling to MoE Architectures and Modern Tools**
> Richard Young (2026). arXiv: [2512.13655](https://arxiv.org/abs/2512.13655)

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("richardyoung/Mistral-7B-Instruct-v0.2-abliterated-obliteratus", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("richardyoung/Mistral-7B-Instruct-v0.2-abliterated-obliteratus")

messages = [{"role": "user", "content": "Your prompt here"}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Disclaimer

This model is released for research purposes only. The abliteration process removes safety guardrails. Users are responsible for ensuring appropriate use. This model should not be used to generate harmful, illegal, or unethical content.

## Dashboard

Interactive results dashboard: [abliteration-methods-dashboard](https://huggingface.co/spaces/richardyoung/abliteration-methods-dashboard)

## Collection

Part of the [Uncensored and Abliterated LLMs](https://huggingface.co/collections/richardyoung/uncensored-and-abliterated-llms) collection.

## Citation

```bibtex
@article{young2024abliteration,
  title={Comparative Analysis of LLM Abliteration Methods},
  author={Young, Richard},
  journal={arXiv preprint arXiv:2512.13655},
  year={2024}
}
```
57
abliteration_metadata.json
Normal file
@@ -0,0 +1,57 @@
{
  "source_model": "mistralai/Mistral-7B-Instruct-v0.2",
  "technique": "refusal_direction_ablation",
  "method": "advanced",
  "method_config": {
    "n_directions": 4,
    "direction_method": "svd",
    "norm_preserve": true,
    "regularization": 0.3,
    "refinement_passes": 2,
    "project_biases": true,
    "use_chat_template": true,
    "use_whitened_svd": false,
    "true_iterative_refinement": false,
    "winsorize_activations": false,
    "float_layer_interpolation": false,
    "cot_aware": false,
    "use_kl_optimization": false,
    "use_lora_ablation": false,
    "spectral_cascade": false,
    "spectral_bands": 3,
    "spectral_threshold": 0.05
  },
  "references": [
    "Arditi et al., Refusal in Language Models Is Mediated by a Single Direction (NeurIPS 2024)",
    "Gabliteration: SVD-based multi-direction extraction (arXiv:2512.18901)",
    "Norm-Preserving Biprojected Abliteration (grimjim, 2025)",
    "Young, Comparative Analysis of LLM Abliteration Methods (arXiv:2512.13655)",
    "Joad et al., More to Refusal than a Single Direction (2026)",
    "Heretic (p-e-w, 2025): Bayesian optimization, LoRA-mediated ablation, winsorization",
    "OBLITERATUS: Whitened SVD, EGA, CoT-aware, KL co-optimization, float interpolation (novel)"
  ],
  "strong_layers": [
    31,
    30,
    29,
    28,
    27,
    26,
    25,
    24,
    22
  ],
  "n_harmful_prompts": 512,
  "n_harmless_prompts": 512,
  "quality_metrics": {
    "perplexity": 3.798783265463755,
    "coherence": 1.0,
    "refusal_rate": 0.13333333333333333,
    "kl_divergence": 0.4253176748752594,
    "spectral_certification": "RED"
  },
  "kl_contributions": {},
  "cot_preserved_layers": [],
  "float_layer_weights": {},
  "lora_adapters_saved": false
}
24
chat_template.jinja
Normal file
@@ -0,0 +1,24 @@
{%- if messages[0]['role'] == 'system' %}
    {%- set system_message = messages[0]['content'] %}
    {%- set loop_messages = messages[1:] %}
{%- else %}
    {%- set loop_messages = messages %}
{%- endif %}

{{- bos_token }}
{%- for message in loop_messages %}
    {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}
        {{- raise_exception('After the optional system message, conversation roles must alternate user/assistant/user/assistant/...') }}
    {%- endif %}
    {%- if message['role'] == 'user' %}
        {%- if loop.first and system_message is defined %}
            {{- ' [INST] ' + system_message + '\n\n' + message['content'] + ' [/INST]' }}
        {%- else %}
            {{- ' [INST] ' + message['content'] + ' [/INST]' }}
        {%- endif %}
    {%- elif message['role'] == 'assistant' %}
        {{- ' ' + message['content'] + eos_token}}
    {%- else %}
        {{- raise_exception('Only user and assistant roles are supported, with the exception of an initial optional system message!') }}
    {%- endif %}
{%- endfor %}
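The prompt layout that chat_template.jinja produces can be sketched in plain Python. This is an illustrative mirror of the template logic, not code from the repository: the function name `render_mistral_prompt` is hypothetical, and the token strings `<s>` and `</s>` are assumed defaults, since the diff shows only the token ids, not their text. It also assumes a non-empty message list, as the template does.

```python
def render_mistral_prompt(messages, bos_token="<s>", eos_token="</s>"):
    """Mirror the template: optional leading system message, then strictly
    alternating user/assistant turns. Token strings are assumed defaults."""
    system_message = None
    loop_messages = messages
    if messages[0]["role"] == "system":
        system_message = messages[0]["content"]
        loop_messages = messages[1:]

    parts = [bos_token]
    for index, message in enumerate(loop_messages):
        role, content = message["role"], message["content"]
        # Even positions must be user turns, odd positions assistant turns.
        if (role == "user") != (index % 2 == 0):
            raise ValueError("conversation roles must alternate user/assistant")
        if role == "user":
            if index == 0 and system_message is not None:
                # The system message is folded into the first user turn.
                parts.append(" [INST] " + system_message + "\n\n" + content + " [/INST]")
            else:
                parts.append(" [INST] " + content + " [/INST]")
        elif role == "assistant":
            parts.append(" " + content + eos_token)
        else:
            raise ValueError("only user and assistant roles are supported")
    return "".join(parts)


single_turn = render_mistral_prompt([{"role": "user", "content": "Hi"}])
with_system = render_mistral_prompt([
    {"role": "system", "content": "S"},
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Yo"},
])
print(repr(single_turn))
print(repr(with_system))
```

Note that there is no separate system block: the system text is prepended to the first user message inside the same `[INST]` span.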
26
config.json
Normal file
@@ -0,0 +1,26 @@
{
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "dtype": "bfloat16",
  "eos_token_id": 2,
  "head_dim": null,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 32768,
  "model_type": "mistral",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-05,
  "rope_theta": 1000000.0,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "transformers_version": "4.57.6",
  "use_cache": true,
  "vocab_size": 32000
}
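The values in config.json are enough to estimate the parameter count and compare it with the `total_parameters` and `total_size` fields recorded in model.safetensors.index.json. The sketch below is a back-of-the-envelope check, not part of the commit. It assumes that a null `head_dim` means `hidden_size / num_attention_heads`, that the projections carry no bias terms, and that each norm holds a single weight vector (the weight map lists only `.weight` tensors). The helper name `count_parameters` is hypothetical.

```python
# Values copied from config.json in this commit.
config = {
    "hidden_size": 4096,
    "intermediate_size": 14336,
    "num_attention_heads": 32,
    "num_hidden_layers": 32,
    "num_key_value_heads": 8,
    "vocab_size": 32000,
    "tie_word_embeddings": False,
}


def count_parameters(cfg):
    hidden = cfg["hidden_size"]
    # Assumption: head_dim is null, so it defaults to hidden_size / heads.
    head_dim = hidden // cfg["num_attention_heads"]
    kv_dim = head_dim * cfg["num_key_value_heads"]  # grouped-query K/V width

    attention = 2 * hidden * hidden + 2 * hidden * kv_dim  # q_proj, o_proj + k_proj, v_proj
    mlp = 3 * hidden * cfg["intermediate_size"]            # gate_proj, up_proj, down_proj
    norms = 2 * hidden                                     # input + post-attention norm
    per_layer = attention + mlp + norms

    embeddings = cfg["vocab_size"] * hidden
    lm_head = 0 if cfg["tie_word_embeddings"] else embeddings
    final_norm = hidden
    return cfg["num_hidden_layers"] * per_layer + embeddings + lm_head + final_norm


total_parameters = count_parameters(config)
bfloat16_bytes = total_parameters * 2  # "dtype": "bfloat16" stores 2 bytes per weight
print(total_parameters)
print(bfloat16_bytes)
```

Under these assumptions the count comes to 7241732096 parameters and 14483464192 bytes, which agrees with the index metadata.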
6
generation_config.json
Normal file
@@ -0,0 +1,6 @@
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "transformers_version": "4.57.6"
}
3
model-00001-of-00008.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2c547794844eaec7c9eb02c0f90090ee1f5bd211a650c9f1187de9c5bac16c0b
size 1889587040
3
model-00002-of-00008.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8fff8cb3e8ed7622c524aa1103e146a66f131b887a0488121c8519cc311363aa
size 1946243936
3
model-00003-of-00008.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:da11b8042d9d97939d72baa538c2dad2df85ca470584e9f17753217369baeab7
size 1979781432
3
model-00004-of-00008.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3363ff5b573896e1f9584cad53c54eec691e09037e6d363ae04bfbfcc51a5453
size 1946243984
3
model-00005-of-00008.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fc1b65e3e16832a2ab9a3d3d403b29955c413a5a6b7ec96ed6eddf3f8a905188
size 1979781448
3
model-00006-of-00008.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:97a44db08fa13b53cbd9d36682bfa7c0be6fb9ac2857de9522b7e0fab2b3cf33
size 1946243984
3
model-00007-of-00008.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bf67fcadd7d7722fe9eba5acbe817b5728d5554072e1bda49f4e08e97b3e5eb2
size 1979781448
3
model-00008-of-00008.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:01c3c915fbe700989078cca0515b1f8191e3c956feec2e90d826044e9e9641ae
size 815834680
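Each `.safetensors` entry above is a Git LFS pointer, not the weights themselves: three `key value` lines giving the spec version, the content hash, and the byte size of the real file. As a rough illustration, a pointer can be parsed as follows. The helper `parse_lfs_pointer` is a hypothetical name, and the sketch handles only the three fields that appear in this commit.

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its fields (version, oid, size only)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer for model-00008-of-00008.safetensors, copied from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:01c3c915fbe700989078cca0515b1f8191e3c956feec2e90d826044e9e9641ae
size 815834680
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algorithm"], info["size"])
```

After downloading a shard, its SHA-256 digest and byte length can be compared against these fields to confirm the file is intact.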
299
model.safetensors.index.json
Normal file
@@ -0,0 +1,299 @@
{
  "metadata": {
    "total_parameters": 7241732096,
    "total_size": 14483464192
  },
  "weight_map": {
    "lm_head.weight": "model-00008-of-00008.safetensors",
    "model.embed_tokens.weight": "model-00001-of-00008.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00008.safetensors",
    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.0.mlp.up_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00008.safetensors",
    "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.1.input_layernorm.weight": "model-00001-of-00008.safetensors",
    "model.layers.1.mlp.down_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.1.mlp.up_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00008.safetensors",
    "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.10.input_layernorm.weight": "model-00003-of-00008.safetensors",
    "model.layers.10.mlp.down_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.10.mlp.gate_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.10.mlp.up_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.10.post_attention_layernorm.weight": "model-00003-of-00008.safetensors",
    "model.layers.10.self_attn.k_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.10.self_attn.o_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.10.self_attn.q_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.10.self_attn.v_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.11.input_layernorm.weight": "model-00003-of-00008.safetensors",
    "model.layers.11.mlp.down_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.11.mlp.gate_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.11.mlp.up_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.11.post_attention_layernorm.weight": "model-00003-of-00008.safetensors",
    "model.layers.11.self_attn.k_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.11.self_attn.o_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.11.self_attn.q_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.11.self_attn.v_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.12.input_layernorm.weight": "model-00004-of-00008.safetensors",
    "model.layers.12.mlp.down_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.12.mlp.gate_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.12.mlp.up_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.12.post_attention_layernorm.weight": "model-00004-of-00008.safetensors",
    "model.layers.12.self_attn.k_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.12.self_attn.o_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.12.self_attn.q_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.12.self_attn.v_proj.weight": "model-00003-of-00008.safetensors",
    "model.layers.13.input_layernorm.weight": "model-00004-of-00008.safetensors",
    "model.layers.13.mlp.down_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.13.mlp.gate_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.13.mlp.up_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.13.post_attention_layernorm.weight": "model-00004-of-00008.safetensors",
    "model.layers.13.self_attn.k_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.13.self_attn.o_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.13.self_attn.q_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.13.self_attn.v_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.14.input_layernorm.weight": "model-00004-of-00008.safetensors",
    "model.layers.14.mlp.down_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.14.mlp.gate_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.14.mlp.up_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.14.post_attention_layernorm.weight": "model-00004-of-00008.safetensors",
    "model.layers.14.self_attn.k_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.14.self_attn.o_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.14.self_attn.q_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.14.self_attn.v_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.15.input_layernorm.weight": "model-00004-of-00008.safetensors",
    "model.layers.15.mlp.down_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.15.mlp.gate_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.15.mlp.up_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.15.post_attention_layernorm.weight": "model-00004-of-00008.safetensors",
    "model.layers.15.self_attn.k_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.15.self_attn.o_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.15.self_attn.q_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.15.self_attn.v_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.16.input_layernorm.weight": "model-00004-of-00008.safetensors",
    "model.layers.16.mlp.down_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.16.mlp.gate_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.16.mlp.up_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.16.post_attention_layernorm.weight": "model-00004-of-00008.safetensors",
    "model.layers.16.self_attn.k_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.16.self_attn.o_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.16.self_attn.q_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.16.self_attn.v_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.17.input_layernorm.weight": "model-00005-of-00008.safetensors",
    "model.layers.17.mlp.down_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.17.mlp.gate_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.17.mlp.up_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.17.post_attention_layernorm.weight": "model-00005-of-00008.safetensors",
    "model.layers.17.self_attn.k_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.17.self_attn.o_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.17.self_attn.q_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.17.self_attn.v_proj.weight": "model-00004-of-00008.safetensors",
    "model.layers.18.input_layernorm.weight": "model-00005-of-00008.safetensors",
    "model.layers.18.mlp.down_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.18.mlp.gate_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.18.mlp.up_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.18.post_attention_layernorm.weight": "model-00005-of-00008.safetensors",
    "model.layers.18.self_attn.k_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.18.self_attn.o_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.18.self_attn.q_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.18.self_attn.v_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.19.input_layernorm.weight": "model-00005-of-00008.safetensors",
    "model.layers.19.mlp.down_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.19.mlp.gate_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.19.mlp.up_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.19.post_attention_layernorm.weight": "model-00005-of-00008.safetensors",
    "model.layers.19.self_attn.k_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.19.self_attn.o_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.19.self_attn.q_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.19.self_attn.v_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.2.input_layernorm.weight": "model-00001-of-00008.safetensors",
    "model.layers.2.mlp.down_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.2.mlp.up_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00008.safetensors",
    "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.20.input_layernorm.weight": "model-00005-of-00008.safetensors",
    "model.layers.20.mlp.down_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.20.mlp.gate_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.20.mlp.up_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.20.post_attention_layernorm.weight": "model-00005-of-00008.safetensors",
    "model.layers.20.self_attn.k_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.20.self_attn.o_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.20.self_attn.q_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.20.self_attn.v_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.21.input_layernorm.weight": "model-00006-of-00008.safetensors",
    "model.layers.21.mlp.down_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.21.mlp.gate_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.21.mlp.up_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.21.post_attention_layernorm.weight": "model-00006-of-00008.safetensors",
    "model.layers.21.self_attn.k_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.21.self_attn.o_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.21.self_attn.q_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.21.self_attn.v_proj.weight": "model-00005-of-00008.safetensors",
    "model.layers.22.input_layernorm.weight": "model-00006-of-00008.safetensors",
    "model.layers.22.mlp.down_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.22.mlp.gate_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.22.mlp.up_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.22.post_attention_layernorm.weight": "model-00006-of-00008.safetensors",
    "model.layers.22.self_attn.k_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.22.self_attn.o_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.22.self_attn.q_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.22.self_attn.v_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.23.input_layernorm.weight": "model-00006-of-00008.safetensors",
    "model.layers.23.mlp.down_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.23.mlp.gate_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.23.mlp.up_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.23.post_attention_layernorm.weight": "model-00006-of-00008.safetensors",
    "model.layers.23.self_attn.k_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.23.self_attn.o_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.23.self_attn.q_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.23.self_attn.v_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.24.input_layernorm.weight": "model-00006-of-00008.safetensors",
    "model.layers.24.mlp.down_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.24.mlp.gate_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.24.mlp.up_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.24.post_attention_layernorm.weight": "model-00006-of-00008.safetensors",
    "model.layers.24.self_attn.k_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.24.self_attn.o_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.24.self_attn.q_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.24.self_attn.v_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.25.input_layernorm.weight": "model-00006-of-00008.safetensors",
    "model.layers.25.mlp.down_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.25.mlp.gate_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.25.mlp.up_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.25.post_attention_layernorm.weight": "model-00006-of-00008.safetensors",
    "model.layers.25.self_attn.k_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.25.self_attn.o_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.25.self_attn.q_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.25.self_attn.v_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.26.input_layernorm.weight": "model-00007-of-00008.safetensors",
    "model.layers.26.mlp.down_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.26.mlp.gate_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.26.mlp.up_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.26.post_attention_layernorm.weight": "model-00007-of-00008.safetensors",
    "model.layers.26.self_attn.k_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.26.self_attn.o_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.26.self_attn.q_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.26.self_attn.v_proj.weight": "model-00006-of-00008.safetensors",
    "model.layers.27.input_layernorm.weight": "model-00007-of-00008.safetensors",
    "model.layers.27.mlp.down_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.27.mlp.gate_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.27.mlp.up_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.27.post_attention_layernorm.weight": "model-00007-of-00008.safetensors",
    "model.layers.27.self_attn.k_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.27.self_attn.o_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.27.self_attn.q_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.27.self_attn.v_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.28.input_layernorm.weight": "model-00007-of-00008.safetensors",
    "model.layers.28.mlp.down_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.28.mlp.gate_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.28.mlp.up_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.28.post_attention_layernorm.weight": "model-00007-of-00008.safetensors",
    "model.layers.28.self_attn.k_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.28.self_attn.o_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.28.self_attn.q_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.28.self_attn.v_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.29.input_layernorm.weight": "model-00007-of-00008.safetensors",
    "model.layers.29.mlp.down_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.29.mlp.gate_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.29.mlp.up_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.29.post_attention_layernorm.weight": "model-00007-of-00008.safetensors",
    "model.layers.29.self_attn.k_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.29.self_attn.o_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.29.self_attn.q_proj.weight": "model-00007-of-00008.safetensors",
    "model.layers.29.self_attn.v_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.3.input_layernorm.weight": "model-00002-of-00008.safetensors",
|
||||
"model.layers.3.mlp.down_proj.weight": "model-00002-of-00008.safetensors",
|
||||
"model.layers.3.mlp.gate_proj.weight": "model-00001-of-00008.safetensors",
|
||||
"model.layers.3.mlp.up_proj.weight": "model-00001-of-00008.safetensors",
|
||||
"model.layers.3.post_attention_layernorm.weight": "model-00002-of-00008.safetensors",
|
||||
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00008.safetensors",
|
||||
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00008.safetensors",
|
||||
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00008.safetensors",
|
||||
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00008.safetensors",
|
||||
"model.layers.30.input_layernorm.weight": "model-00008-of-00008.safetensors",
|
||||
"model.layers.30.mlp.down_proj.weight": "model-00008-of-00008.safetensors",
|
||||
"model.layers.30.mlp.gate_proj.weight": "model-00007-of-00008.safetensors",
|
||||
"model.layers.30.mlp.up_proj.weight": "model-00007-of-00008.safetensors",
|
||||
"model.layers.30.post_attention_layernorm.weight": "model-00008-of-00008.safetensors",
|
||||
"model.layers.30.self_attn.k_proj.weight": "model-00007-of-00008.safetensors",
|
||||
"model.layers.30.self_attn.o_proj.weight": "model-00007-of-00008.safetensors",
|
||||
"model.layers.30.self_attn.q_proj.weight": "model-00007-of-00008.safetensors",
|
||||
"model.layers.30.self_attn.v_proj.weight": "model-00007-of-00008.safetensors",
|
||||
"model.layers.31.input_layernorm.weight": "model-00008-of-00008.safetensors",
|
||||
"model.layers.31.mlp.down_proj.weight": "model-00008-of-00008.safetensors",
|
||||
"model.layers.31.mlp.gate_proj.weight": "model-00008-of-00008.safetensors",
|
||||
"model.layers.31.mlp.up_proj.weight": "model-00008-of-00008.safetensors",
|
||||
"model.layers.31.post_attention_layernorm.weight": "model-00008-of-00008.safetensors",
|
||||
"model.layers.31.self_attn.k_proj.weight": "model-00008-of-00008.safetensors",
|
||||
"model.layers.31.self_attn.o_proj.weight": "model-00008-of-00008.safetensors",
|
||||
"model.layers.31.self_attn.q_proj.weight": "model-00008-of-00008.safetensors",
|
||||
"model.layers.31.self_attn.v_proj.weight": "model-00008-of-00008.safetensors",
|
||||
"model.layers.4.input_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.5.input_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.6.input_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.7.input_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.8.input_layernorm.weight": "model-00003-of-00008.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00003-of-00008.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.9.input_layernorm.weight": "model-00003-of-00008.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00003-of-00008.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00003-of-00008.safetensors",
"model.norm.weight": "model-00008-of-00008.safetensors"
}
}
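The map above assigns each tensor name to one of eight shard files, and a single layer can straddle two shards (layer 3, for example, has tensors in both shard 1 and shard 2). The following is a minimal sketch of how such a map might be queried, assuming the enclosing JSON object is stored under a `weight_map` key as is usual for sharded safetensors index files; the helper name `shards_for_prefix` is hypothetical and not part of this repository.

```python
import json
from collections import defaultdict


def shards_for_prefix(index: dict, prefix: str) -> dict:
    """Group tensor names that start with `prefix` by the shard storing them.

    Assumes `index` has a "weight_map" object mapping tensor name -> shard file.
    """
    by_shard = defaultdict(list)
    for name, shard in index["weight_map"].items():
        if name.startswith(prefix):
            by_shard[shard].append(name)
    # Sort for a stable, readable result.
    return {shard: sorted(names) for shard, names in sorted(by_shard.items())}


# Three entries copied from the map above; a full index holds many more.
index = json.loads("""
{
  "weight_map": {
    "model.layers.3.mlp.down_proj.weight": "model-00002-of-00008.safetensors",
    "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00008.safetensors",
    "model.layers.30.mlp.up_proj.weight": "model-00007-of-00008.safetensors"
  }
}
""")

# The trailing dot keeps "model.layers.3." from also matching layer 30.
layer3 = shards_for_prefix(index, "model.layers.3.")
```

Keeping the trailing dot in the prefix matters because the keys are sorted lexicographically, so layers 3, 30, and 31 sit next to each other in the file.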
24
special_tokens_map.json
Normal file
@@ -0,0 +1,24 @@
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": "</s>",
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
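Entries in this file come in two shapes: `pad_token` is a bare string, while the other three are objects with a `content` field. A small sketch of reading both shapes uniformly follows; the helper `token_text` is a hypothetical name introduced here for illustration. It also makes visible that the padding token reuses the end-of-sequence string, so any code that masks padding by token identity would presumably treat genuine `</s>` tokens the same way.

```python
import json


def token_text(entry):
    """Return the token string for either a bare string or a {"content": ...} object."""
    return entry["content"] if isinstance(entry, dict) else entry


# Reduced copy of special_tokens_map.json; the boolean flags are omitted for brevity.
special_tokens = json.loads("""
{
  "bos_token": {"content": "<s>"},
  "eos_token": {"content": "</s>"},
  "pad_token": "</s>",
  "unk_token": {"content": "<unk>"}
}
""")

texts = {name: token_text(entry) for name, entry in special_tokens.items()}

# True for this file: padding and end-of-sequence share one token string.
pad_is_eos = texts["pad_token"] == texts["eos_token"]
```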
268065
tokenizer.json
Normal file
File diff suppressed because it is too large
BIN
tokenizer.model
(Stored with Git LFS)
Normal file
Binary file not shown.
44
tokenizer_config.json
Normal file
@@ -0,0 +1,44 @@
{
  "add_bos_token": true,
  "add_eos_token": false,
  "add_prefix_space": null,
  "added_tokens_decoder": {
    "0": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "additional_special_tokens": [],
  "bos_token": "<s>",
  "clean_up_tokenization_spaces": false,
  "eos_token": "</s>",
  "extra_special_tokens": {},
  "legacy": false,
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": "</s>",
  "sp_model_kwargs": {},
  "spaces_between_special_tokens": false,
  "tokenizer_class": "LlamaTokenizer",
  "unk_token": "<unk>",
  "use_default_system_prompt": false
}
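The `model_max_length` value in this config is not a usable sequence limit. It equals `int(1e30)` exactly, which appears to be a "no limit configured" placeholder rather than a property of the model. The sketch below shows one way a caller might detect the placeholder and substitute its own limit; the function `effective_max_length` and the fallback of 4096 are assumptions chosen for illustration, not values taken from this repository.

```python
# int(1e30) is the float 1e30 converted to an integer; it is not exactly 10**30
# because 1e30 cannot be represented exactly in binary floating point.
SENTINEL = int(1e30)


def effective_max_length(configured: int, fallback: int) -> int:
    """Return `fallback` when `configured` is the placeholder, else `configured`."""
    return fallback if configured >= SENTINEL else configured


# Value copied from tokenizer_config.json above.
configured = 1000000000000000019884624838656

# 4096 is an arbitrary illustrative fallback, not this model's context window.
limit = effective_max_length(configured, fallback=4096)
```

Comparing with `>=` rather than `==` is a defensive choice so that any similarly oversized value is also treated as "unset".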