From d5578f95112f0e42a5eed23065d20d93653c7047 Mon Sep 17 00:00:00 2001
From: ModelHub XC
Date: Fri, 8 May 2026 17:08:32 +0800
Subject: [PATCH] Initialize project; model provided by the ModelHub XC community
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Model: becnic/Qwen3-4B-Thinking-2507-Heretic-GGUF
Source: Original Platform
---
 .gitattributes                           |  37 ++++++++++
 Qwen3-4B-Thinking-2507-Heretic-Q8_0.gguf |   3 +
 README.md                                | 100 +++++++++++++++++++++++
 3 files changed, 140 insertions(+)
 create mode 100644 .gitattributes
 create mode 100644 Qwen3-4B-Thinking-2507-Heretic-Q8_0.gguf
 create mode 100644 README.md

diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 0000000..4ad5b14
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,37 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+Qwen3-4B-Thinking-2507-Heretic.gguf filter=lfs diff=lfs merge=lfs -text
+Qwen3-4B-Thinking-2507-Heretic-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
diff --git a/Qwen3-4B-Thinking-2507-Heretic-Q8_0.gguf b/Qwen3-4B-Thinking-2507-Heretic-Q8_0.gguf
new file mode 100644
index 0000000..26d27ee
--- /dev/null
+++ b/Qwen3-4B-Thinking-2507-Heretic-Q8_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a7f8009e771a96580c0985f0d6cde868385bdb1eb278b798b9e9e9871bb634e5
+size 4280404736
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..2fecd7f
--- /dev/null
+++ b/README.md
@@ -0,0 +1,100 @@
+---
+license: apache-2.0
+license_link: https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507/blob/main/LICENSE
+pipeline_tag: text-generation
+base_model:
+- becnic/Qwen3-4B-Thinking-2507-Heretic
+language:
+- en
+- de
+- fr
+- it
+- pt
+- hi
+- es
+- th
+---
+
+# Qwen3-4B-Thinking-2507-Heretic-GGUF
+
+## Llamacpp imatrix Quantizations of Qwen3-4B-Thinking-2507-Heretic by becnic (from original Qwen3-4B-Thinking-2507)
+
+Using llama.cpp release b7120 for quantization.
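For context, a sketch of the usual llama.cpp GGUF workflow that a release like b7120 provides. The paths below are illustrative; the exact commands used for this repo are not published, so treat this as an assumption about the process, not a record of it:

```shell
# 1. Convert the original Hugging Face checkpoint to a full-precision GGUF
#    (convert_hf_to_gguf.py ships in the llama.cpp repository):
python convert_hf_to_gguf.py ./Qwen3-4B-Thinking-2507-Heretic \
  --outfile Qwen3-4B-Thinking-2507-Heretic.gguf

# 2. Quantize the intermediate GGUF down to Q8_0:
./llama-quantize Qwen3-4B-Thinking-2507-Heretic.gguf \
  Qwen3-4B-Thinking-2507-Heretic-Q8_0.gguf Q8_0
```

Q8_0 is a near-lossless 8-bit format, which is consistent with the small KL divergence reported below.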
+
+Original model: https://huggingface.co/becnic/Qwen3-4B-Thinking-2507-Heretic
+
+Run them in [LM Studio](https://lmstudio.ai/)
+
+Run them directly with [llama.cpp](https://github.com/ggerganov/llama.cpp), or any other llama.cpp-based project
+
+## Download a file (not the whole branch) from below:
+
+| Filename | Quant type | File Size | Split | Description |
+| -------- | ---------- | --------- | ----- | ----------- |
+| [Qwen3-4B-Thinking-2507-Heretic-Q8_0.gguf](https://huggingface.co/becnic/Qwen3-4B-Thinking-2507-Heretic-GGUF/blob/main/Qwen3-4B-Thinking-2507-Heretic-Q8_0.gguf) | Q8_0 | 4.28GB | false | Extremely high quality |
+
+## Downloading using huggingface-cli
+
+<details>
+<summary>Click to view download instructions</summary>
+
+First, make sure you have huggingface-cli installed:
+
+```
+pip install -U "huggingface_hub[cli]"
+```
+
+Then, you can target the specific file you want:
+
+```
+huggingface-cli download becnic/Qwen3-4B-Thinking-2507-Heretic-GGUF --include "Qwen3-4B-Thinking-2507-Heretic-Q8_0.gguf" --local-dir ./
+```
+
+If the model is bigger than 50GB, it will have been split into multiple files. To download them all to a local folder, run:
+
+```
+huggingface-cli download becnic/Qwen3-4B-Thinking-2507-Heretic-GGUF --include "Qwen3-4B-Thinking-2507-Heretic-Q8_0.gguf/*" --local-dir ./
+```
+
+</details>
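Once downloaded, the file can be run with the `llama-cli` binary from llama.cpp. A minimal sketch; the prompt, context size, and token count are placeholder values you should adjust for your setup:

```shell
# Run the Q8_0 quant interactively with llama.cpp
# -m: model path, -p: prompt, -n: max tokens to generate, -c: context size
./llama-cli -m ./Qwen3-4B-Thinking-2507-Heretic-Q8_0.gguf \
  -p "Explain quantization in one paragraph." \
  -n 512 -c 8192
```

The same GGUF also works with `llama-server` for an OpenAI-compatible HTTP endpoint.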
+
+## Abliteration parameters
+
+| Parameter | Value |
+| :-------- | :---: |
+| **direction_index** | 19.42 |
+| **attn.o_proj.max_weight** | 1.23 |
+| **attn.o_proj.max_weight_position** | 22.34 |
+| **attn.o_proj.min_weight** | 0.69 |
+| **attn.o_proj.min_weight_distance** | 10.42 |
+| **mlp.down_proj.max_weight** | 1.12 |
+| **mlp.down_proj.max_weight_position** | 29.64 |
+| **mlp.down_proj.min_weight** | 1.08 |
+| **mlp.down_proj.min_weight_distance** | 20.24 |
+
+## Performance
+
+| Metric | This model | Original model ([Qwen/Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507)) |
+| :----- | :--------: | :---------------------------: |
+| **KL divergence** | 0.06 | 0 *(by definition)* |
+| **Refusals** | 6/100 | 96/100 |
+
+## Model Overview
+
+**Qwen3-4B-Thinking-2507** has the following features:
+- Type: Causal Language Models
+- Training Stage: Pretraining & Post-training
+- Number of Parameters: 4.0B
+- Number of Parameters (Non-Embedding): 3.6B
+- Number of Layers: 36
+- Number of Attention Heads (GQA): 32 for Q and 8 for KV
+- Context Length: **262,144 natively**.
+
+**NOTE: This model supports only thinking mode. Meanwhile, specifying `enable_thinking=True` is no longer required.**
+
+Additionally, to enforce model thinking, the default chat template automatically includes `<think>`. Therefore, it is normal for the model's output to contain only `</think>` without an explicit opening `<think>` tag.
+
+For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our [blog](https://qwenlm.github.io/blog/qwen3/), [GitHub](https://github.com/QwenLM/Qwen3), and [Documentation](https://qwen.readthedocs.io/en/latest/).
+
+**Supported languages:** English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
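Because the chat template injects the opening `<think>` tag for the model, raw completions typically consist of reasoning text ending in a bare `</think>`, followed by the final answer. A minimal shell sketch of splitting the two (the sample string is illustrative, not real model output):

```shell
# Sample raw completion: reasoning, a bare closing tag, then the answer.
# Note there is no opening <think> tag -- the template already emitted it.
raw='First I consider the question carefully...</think>The answer is 42.'

# Everything after the last </think> is the final answer;
# everything before the first </think> is the reasoning trace.
final_answer="${raw##*</think>}"
thinking="${raw%%</think>*}"

echo "$final_answer"
```

Clients that expect a paired `<think>...</think>` block can prepend the opening tag themselves before parsing.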