Initialize project; model provided by the ModelHub XC community

Model: RumiaChannel/llm-jp-4-8b-thinking-uncensored-ara-gguf
Source: Original Platform
ModelHub XC
2026-06-18 05:28:17 +08:00
commit aba5a6fee0
7 changed files with 223 additions and 0 deletions

.gitattributes

@@ -0,0 +1,40 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
llm-jp-4-8b-thinking-uncensored-ara.BF16.gguf filter=lfs diff=lfs merge=lfs -text
llm-jp-4-8b-thinking-uncensored-ara.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
llm-jp-4-8b-thinking-uncensored-ara.Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
llm-jp-4-8b-thinking-uncensored-ara.Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
llm-jp-4-8b-thinking-uncensored-ara.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text

README.md

@@ -0,0 +1,168 @@
---
license: apache-2.0
language:
- en
- ja
programming_language:
- C
- C++
- C#
- Go
- Java
- JavaScript
- Lua
- PHP
- Python
- Ruby
- Rust
- Scala
- TypeScript
pipeline_tag: text-generation
library_name: transformers
inference: false
tags:
- heretic
- uncensored
- decensored
- abliterated
- ara
base_model: tokinasin/llm-jp-4-8b-thinking-uncensored-ara
base_model_relation: quantized
---
This model is a decensored version of [llm-jp/llm-jp-4-8b-thinking](https://huggingface.co/llm-jp/llm-jp-4-8b-thinking), produced with [Arbitrary-Rank Ablation (ARA)](https://github.com/p-e-w/heretic/pull/211), a method proposed in [PR #211](https://github.com/p-e-w/heretic/pull/211) of [Heretic](https://github.com/p-e-w/heretic).
## Abliteration parameters
| Parameter | Value |
| :-------- | :---: |
| **start_layer_index** | 16 |
| **end_layer_index** | 28 |
| **preserve_good_behavior_weight** | 0.3326 |
| **steer_bad_behavior_weight** | 0.0048 |
| **overcorrect_relative_weight** | 1.0004 |
| **neighbor_count** | 15 |
## Performance
| Metric | This model | Original model ([llm-jp/llm-jp-4-8b-thinking](https://huggingface.co/llm-jp/llm-jp-4-8b-thinking)) |
| :----- | :--------: | :---------------------------: |
| **KL divergence** | 0.0129 | 0 *(by definition)* |
| **Refusals** | 5/100 | 100/100 |
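As a rough illustration of what a KL divergence figure like the one above measures, the sketch below computes KL(P || Q) between two toy next-token distributions. The logits are made up for the example, and it makes no claim about which prompts Heretic uses or how it aggregates the reported value.

```python
import math


def softmax(logits):
    """Turn raw logits into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Hypothetical next-token logits for the same prompt from two models.
original_logits = [2.0, 1.0, 0.1, -1.0]
modified_logits = [1.9, 1.1, 0.0, -0.9]

p = softmax(original_logits)
q = softmax(modified_logits)

print(f"KL(original || modified) = {kl_divergence(p, q):.4f} nats")
# Comparing a distribution with itself gives zero, hence "0 (by definition)".
print(f"KL(original || original) = {kl_divergence(p, p):.4f} nats")
```

A small value means the modified model's output distribution stays close to the original's, and identical distributions give exactly zero.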
-----
# llm-jp-4-8b-thinking
LLM-jp-4 is a series of large language models developed by the [Research and Development Center for Large Language Models](https://llmc.nii.ac.jp/) at the [National Institute of Informatics](https://www.nii.ac.jp/en/).
This repository provides the **llm-jp-4-8b-thinking** model.
For an overview of the LLM-jp-4 models across different parameter sizes, please refer to:
- [LLM-jp-4 Models](https://huggingface.co/collections/llm-jp/llm-jp-4-models)
Base models are trained with pre-training and mid-training only.
Post-trained models are aligned using supervised fine-tuning (SFT) and direct preference optimization (DPO), without reinforcement learning.
For practical usage examples and detailed instructions on how to use the models, please also refer to our [cookbook](https://github.com/llm-jp/llm-jp-4-cookbook).
## Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "llm-jp/llm-jp-4-8b-thinking"

tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    # trust_remote_code is required to load custom tokenizer and reasoning parser.
    trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model.eval()

messages = [
    # The prompt means "What is natural language processing?"
    {"role": "user", "content": "自然言語処理とは何か"},
]
prompt: str = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    reasoning_effort="medium",  # {"low", "medium", "high"}
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_tensor = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

# Keep only the newly generated tokens, dropping the prompt prefix.
generated_ids: list[int] = output_tensor[0][inputs["input_ids"].shape[1]:].tolist()
response = tokenizer.decode(generated_ids)
parsed = tokenizer.parse_response(response)

print("\n--- Parsed Response ---")
print("Role:", parsed.get("role"))
print("Thinking:", parsed.get("thinking"))
print("Content:", parsed.get("content"))
```
## Model Details
- **Model type:** Transformer-based Language Model
- **Architectures:**
Dense model:
|Params|Layers|Hidden size|Heads|Context length|Embedding parameters|Non-embedding parameters|Total parameters|
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
|8B|32|4,096|32|65,536|805,306,368|7,784,894,464|8,590,200,832|
MoE model:
|Params|Layers|Hidden size|Heads|Routed Experts|Activated Experts|Context length|Embedding parameters|Non-embedding parameters|Activated parameters|Total parameters|
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
|32B-A3B|32|2,560|40|128|8|65,536|503,316,480|31,635,712,512|3,827,476,992|32,139,028,992|
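The figures in the two tables can be cross-checked with a little arithmetic. The sketch below confirms that the embedding and non-embedding counts add up to the stated totals, estimates the raw weight size at 2 bytes per parameter (BF16), and computes the share of parameters the MoE model activates. The 2-bytes-per-parameter estimate is an assumption that ignores any file metadata.

```python
# Parameter counts copied from the architecture tables above.
MODELS = {
    "8B": {
        "embedding": 805_306_368,
        "non_embedding": 7_784_894_464,
        "total": 8_590_200_832,
    },
    "32B-A3B": {
        "embedding": 503_316_480,
        "non_embedding": 31_635_712_512,
        "total": 32_139_028_992,
        "activated": 3_827_476_992,
    },
}

BYTES_PER_BF16_PARAM = 2  # assumption: weights only, no metadata


def totals_match(counts):
    """True if embedding + non-embedding parameters equal the stated total."""
    return counts["embedding"] + counts["non_embedding"] == counts["total"]


def bf16_size_gib(counts):
    """Rough size of the raw BF16 weights in GiB."""
    return counts["total"] * BYTES_PER_BF16_PARAM / 2**30


for name, counts in MODELS.items():
    print(f"{name}: totals consistent = {totals_match(counts)}, "
          f"BF16 weights ~ {bf16_size_gib(counts):.2f} GiB")

moe = MODELS["32B-A3B"]
activated_share = moe["activated"] / moe["total"]
print(f"32B-A3B activated share of total parameters: {activated_share:.1%}")
```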
## Tokenizer
The tokenizer of this model is based on the Unigram byte-fallback model of [huggingface/tokenizers](https://github.com/huggingface/tokenizers).
The vocabulary entries were converted from [`llm-jp-tokenizer v4.0`](https://github.com/llm-jp/llm-jp-tokenizer).
Please refer to the [README.md](https://github.com/llm-jp/llm-jp-tokenizer) of `llm-jp-tokenizer` for details on the vocabulary construction procedure (plain SentencePiece training alone does not reproduce our vocabulary).
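Byte fallback means that a character missing from the vocabulary is emitted as its UTF-8 bytes instead of a single unknown token. The toy sketch below illustrates the idea only: the three-entry vocabulary is hypothetical, the greedy longest-match segmenter stands in for real Unigram scoring, and the `<0xNN>` token spelling follows the SentencePiece convention, which is assumed here rather than taken from this model's vocabulary.

```python
TOY_VOCAB = {"自然", "言語", "処理"}  # hypothetical vocabulary, not the real one


def tokenize_with_byte_fallback(text, vocab):
    """Greedy longest-match segmentation; unknown characters become byte tokens."""
    tokens = []
    max_len = max(len(piece) for piece in vocab)
    i = 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                tokens.append(piece)
                i += length
                break
        else:
            # No vocabulary entry starts here: fall back to raw UTF-8 bytes.
            tokens.extend(f"<0x{b:02X}>" for b in text[i].encode("utf-8"))
            i += 1
    return tokens


def detokenize(tokens):
    """Reassemble text, turning byte tokens back into their original bytes."""
    out = bytearray()
    for tok in tokens:
        if len(tok) == 6 and tok.startswith("<0x") and tok.endswith(">"):
            out.append(int(tok[3:5], 16))
        else:
            out.extend(tok.encode("utf-8"))
    return out.decode("utf-8")


tokens = tokenize_with_byte_fallback("自然言語🙂", TOY_VOCAB)
print(tokens)
print(detokenize(tokens))
```

Because every byte has a token, any input string round-trips without loss, which is the practical benefit of byte fallback.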
> [!NOTE]
> The chat template of this model is designed to be compatible with the OpenAI Harmony response format.
> However, the tokenizer differs from the one assumed by the `openai-harmony` library, and therefore direct tokenization with `openai-harmony` is not supported.
> For correct behavior, please use the tokenizer provided with this model. For detailed usage, please refer to [our cookbook](https://github.com/llm-jp/llm-jp-4-cookbook).
## Training
### Pre-training
This model is trained through a multi-stage pipeline consisting of pre-training and mid-training phases, using a total of 11.7T tokens.
![pretraining_overview](./v4_pretraining_overview.png)
The corpora used for pre-training and mid-training are publicly available at the following links:
- [Pre-training](https://gitlab.llm-jp.nii.ac.jp/datasets/llm-jp-corpus-v4.1)
- [Mid-training](https://gitlab.llm-jp.nii.ac.jp/datasets/llm-jp-corpus-midtraining-v2)
> [!NOTE]
> Although most of the corpora have been released, some portions are excluded from public release due to licensing constraints.
### Post-training
We have fine-tuned the pre-trained checkpoint using SFT and further aligned it with DPO.
The datasets used for post-training are also publicly available at the following links:
- [SFT](https://huggingface.co/datasets/llm-jp/llm-jp-4-thinking-sft-data)
- [DPO (for llm-jp-4-8b-thinking model)](https://huggingface.co/datasets/llm-jp/llm-jp-4-8b-thinking-dpo-data)
- [DPO (for llm-jp-4-32b-a3b-thinking model)](https://huggingface.co/datasets/llm-jp/llm-jp-4-32b-a3b-thinking-dpo-data)
## Evaluation
### [llm-jp-judge](https://github.com/llm-jp/llm-jp-judge)
We evaluated the model on a variety of tasks using an LLM-as-a-Judge framework. The descriptions of each task are as follows.
- MT-Bench (JA/EN): A benchmark for measuring multi-turn conversational task-solving ability.
- [AnswerCarefully](https://huggingface.co/datasets/llm-jp/AnswerCarefully): A benchmark for evaluating safety in Japanese. We used 336 questions from the v2.0 test set.
- [llm-jp-instructions](https://huggingface.co/datasets/llm-jp/llm-jp-instructions): A set of human-created single-turn question-answer pairs. We used 400 questions from the test set.
We evaluated the models using `gpt-5.4-2026-03-05`.
> [!NOTE]
> In earlier evaluations of the llm-jp-3 series, we used `gpt-4o-2024-08-06`. The newer evaluator `gpt-5.4-2026-03-05` provides a stricter and more reliable assessment, which results in lower scores on benchmarks such as MT-Bench compared to those reported for the llm-jp-3 series.
The scores represent the average values obtained from three rounds of inference and evaluation.
For more details, please refer to the [code](https://github.com/llm-jp/llm-jp-judge).
| Model Name | MT-Bench (JA) | MT-Bench (EN) | AnswerCarefully | llm-jp-instructions |
|:-------------------------------------------------------------------------------------------------------|----:|----:|----------------:|--------------------:|
| gpt-4o-2024-08-06 | 7.29 | 7.69 | 4.00 | 4.07 |
| gpt-5.4-2026-03-05 (reasoning_effort = low) | 8.87 | 8.76 | 4.38 | 4.79 |
| gpt-5.4-2026-03-05 (reasoning_effort = medium) | 8.87 | 8.89 | 4.43 | 4.82 |
| gpt-5.4-2026-03-05 (reasoning_effort = high) | 8.98 | 8.85 | 4.41 | 4.83 |
| [gpt-oss-20b (reasoning_effort = low)](https://huggingface.co/openai/gpt-oss-20b) | 7.21 | 7.95 | 3.39 | 3.08 |
| [gpt-oss-20b (reasoning_effort = medium)](https://huggingface.co/openai/gpt-oss-20b) | 7.33 | 7.85 | 3.55 | 3.16 |
| [llm-jp-4-8b-thinking (reasoning_effort = low)](https://huggingface.co/llm-jp/llm-jp-4-8b-thinking) | 7.23 | 7.54 | 3.58 | 3.50 |
| [llm-jp-4-8b-thinking (reasoning_effort = medium)](https://huggingface.co/llm-jp/llm-jp-4-8b-thinking) | 7.54 | 7.79 | 3.69 | 3.54 |
| [llm-jp-4-32b-a3b-thinking (reasoning_effort = low)](https://huggingface.co/llm-jp/llm-jp-4-32b-a3b-thinking) | 7.57 | 7.70 | 3.61 | 3.61 |
| [llm-jp-4-32b-a3b-thinking (reasoning_effort = medium)](https://huggingface.co/llm-jp/llm-jp-4-32b-a3b-thinking) | 7.82 | 7.86 | 3.70 | 3.61 |
## Risks and Limitations
The models released here are in the early stages of our research and development and have not been tuned to ensure outputs align with human intent and safety considerations.
## Send Questions to
llm-jp(at)nii.ac.jp
## License
[Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)
## Acknowledgement
To develop this model, we used the NINJAL Web Japanese Corpus (whole-NWJC) from the National Institute for Japanese Language and Linguistics (NINJAL).
## Model Card Authors
*The names are listed in alphabetical order.*
Hirokazu Kiyomaru and Takashi Kodama.


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2d0aa6ede0b02fb5125943931fda7186a41d363308420637e59502efcb218996
size 17185772224
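The three lines above form a Git LFS pointer: the spec version, the SHA-256 of the real file, and its size in bytes. A downloaded file can be checked against such a pointer roughly as follows. The helper names `parse_lfs_pointer` and `verify_download` are illustrative and not part of any Git LFS tooling.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return {
        "version": fields["version"],
        "sha256": digest,
        "size": int(fields["size"]),
    }


def verify_download(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` matches the pointer's size and hash."""
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap check first; avoids hashing a truncated file
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == pointer["sha256"]


# Pointer text copied from the hunk above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:2d0aa6ede0b02fb5125943931fda7186a41d363308420637e59502efcb218996
size 17185772224
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["sha256"], pointer["size"])
```

Calling `verify_download("path/to/file.gguf", pointer)` with the real local path then reports whether the download is complete and uncorrupted.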


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:736489d7c2de315bab1b268c471a76b0fbfdf22cc55e833e20834fd7bb510f1c
size 5304881856


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ee6d532bdf06b4cfb97c1e8cdadf62ad7a231bd3d7f764f990473567fab564f8
size 6152131264


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9de27da88ca0bbf55055675a3efb2200598f097556c706e88fe91888c20cdf4d
size 7052333760


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6736862a1fbf92600ae55a7634ea4a950524dc5be13ddfd5c692d84701c75afc
size 9132708544
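
Dividing each LFS object's size by the parameter count gives a rough bits-per-weight figure for each quantization. The pairing of sizes with file names below is an assumption: the pointer hunks carry no file names, so they are matched to the `.gguf` entries in `.gitattributes` in listing order. The parameter count is the 8B total from the README table, and file metadata is counted as if it were weights.

```python
TOTAL_PARAMS = 8_590_200_832  # 8B "Total parameters" from the README table

# Sizes in bytes from the five LFS pointers.
# ASSUMPTION: pointers appear in the same order as the .gguf lines in .gitattributes.
FILE_SIZES = {
    "BF16": 17_185_772_224,
    "Q4_K_M": 5_304_881_856,
    "Q5_K_M": 6_152_131_264,
    "Q6_K": 7_052_333_760,
    "Q8_0": 9_132_708_544,
}


def bits_per_weight_for(size_bytes, params=TOTAL_PARAMS):
    """Average number of bits stored per parameter, metadata included."""
    return size_bytes * 8 / params


bits_per_weight = {name: bits_per_weight_for(size) for name, size in FILE_SIZES.items()}

for name, bpw in sorted(bits_per_weight.items(), key=lambda item: item[1]):
    gib = FILE_SIZES[name] / 2**30
    print(f"{name:7s} {gib:6.2f} GiB  ~{bpw:.2f} bits/weight")
```

Under that assumed pairing, the results land close to the nominal bit widths the quantization names suggest, which lends the ordering some support without confirming it.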