初始化项目，由ModelHub XC社区提供模型

Model: davidterrell1919/Qwen2.5-Coder-3B-heretic Source: Original Platform
2026-06-16 07:17:17 +08:00
commit 6a0dda8211
16 changed files with 5127 additions and 0 deletions
--- a/reproduce/Qwen--Qwen2--5-Coder-3B.jsonl
+++ b/reproduce/Qwen--Qwen2--5-Coder-3B.jsonl
--- a/reproduce/README.md
+++ b/reproduce/README.md
@@ -0,0 +1,64 @@
+# Reproduction guide
+
+This directory contains the necessary information and assets to reproduce the results obtained during this Heretic run.
+
+## Models
+
+- **Base model:** [Qwen/Qwen2.5-Coder-3B](https://huggingface.co/Qwen/Qwen2.5-Coder-3B) (Commit: [`09d9bc5`](https://huggingface.co/Qwen/Qwen2.5-Coder-3B/commit/09d9bc5d376b0cfa0100a0694ea7de7232525803))
+
+## Datasets
+
+- **Good prompts:** [mlabonne/harmless_alpaca](https://huggingface.co/datasets/mlabonne/harmless_alpaca) (Commit: [`02c6a92`](https://huggingface.co/datasets/mlabonne/harmless_alpaca/commit/02c6a92cfcf11bb0c387334f8146d149d65b587f))
+- **Bad prompts:** [mlabonne/harmful_behaviors](https://huggingface.co/datasets/mlabonne/harmful_behaviors) (Commit: [`01cead0`](https://huggingface.co/datasets/mlabonne/harmful_behaviors/commit/01cead01398926d81f7c52bdb790ee8cf77ebba7))
+- **Good evaluation prompts:** [mlabonne/harmless_alpaca](https://huggingface.co/datasets/mlabonne/harmless_alpaca) (Commit: [`02c6a92`](https://huggingface.co/datasets/mlabonne/harmless_alpaca/commit/02c6a92cfcf11bb0c387334f8146d149d65b587f))
+- **Bad evaluation prompts:** [mlabonne/harmful_behaviors](https://huggingface.co/datasets/mlabonne/harmful_behaviors) (Commit: [`01cead0`](https://huggingface.co/datasets/mlabonne/harmful_behaviors/commit/01cead01398926d81f7c52bdb790ee8cf77ebba7))
+
+## Selected trial
+
+- **Trial number:** 136
+- **KL divergence:** 0.062553
+- **Refusals:** 4/100
+
+## System
+
+- **Python:** 3.12.11 (CPython, GCC 11.2.0) [Conda]
+- **Operating system:** Linux-6.11.0-1016-nvidia-x86_64-with-glibc2.39 (x86_64)
+- **CPU:** Intel(R) Xeon(R) Platinum 8468
+
+### Accelerators
+
+- **CUDA:** Detected 1 device(s) (139.80 GB total VRAM)
+  - **CUDA Version:** 12.8
+  - **Driver Version:** 580.126.09
+- **Devices:**
+  - **CUDA 0:** NVIDIA H200 (139.80 GB)
+
+## Environment
+
+- **Heretic:** v1.3.0 (Origin: PyPI)
+- **PyTorch:** 2.8.0+cu128
+- **Other dependencies:** See [`requirements.txt`](requirements.txt).
+
+## Contents of this directory
+
+- [`requirements.txt`](requirements.txt): The exact versions of all Python packages.
+- [`config.toml`](config.toml): The exact configuration used, including the RNG seed.
+- [`Qwen--Qwen2--5-Coder-3B.jsonl`](Qwen--Qwen2--5-Coder-3B.jsonl): The Optuna study journal containing the history of all trials.
+- [`SHA256SUMS`](SHA256SUMS): Cryptographic hashes for all weight files.
+- [`reproduce.json`](reproduce.json): A machine-readable file containing all reproducibility information.
+
+## How to reproduce
+
+1. Ensure your system matches the specifications in the **System** section above. Exact reproducibility is only guaranteed if all aspects of your system are identical to the one the model was originally generated on.
+1. Install the exact version of Heretic indicated in the **Environment** section above, from its original source.
+1. Install the packages listed in `requirements.txt`: `pip install -r requirements.txt`
+1. Install the correct version of PyTorch: `pip install torch==2.8.0+cu128 --index-url https://download.pytorch.org/whl/cu128`
+1. Place the provided `config.toml` in your working directory.
+1. Run Heretic without any additional arguments: `heretic`
+1. Wait for the run to finish, then select trial **136** and export the model.
+1. Verify that the weight files have been exactly reproduced by comparing their SHA-256 hashes against those in `SHA256SUMS`: `sha256sum -c SHA256SUMS` (or look at the hashes online if you uploaded to Hugging Face)
+
+> [!TIP]
+> To use the included Optuna study journal `Qwen--Qwen2--5-Coder-3B.jsonl`, place it in the checkpoints directory (usually `checkpoints/`) before running Heretic.
+>
+> This allows you to export other models from the Pareto front, or to run additional trials without having to re-run the stored trials.
--- a/reproduce/SHA256SUMS
+++ b/reproduce/SHA256SUMS
@@ -0,0 +1,2 @@
+99a651d95f1a46925b90a8bc563b1fc500781cca1579b2d16390439d87a0b047 *model-00001-of-00002.safetensors
+986ac96974b8322d2977dee656ee9d3aa61f843783dddcf1d3519edc3b0ebf76 *model-00002-of-00002.safetensors
--- a/reproduce/config.toml
+++ b/reproduce/config.toml
@@ -0,0 +1,91 @@
+model = "Qwen/Qwen2.5-Coder-3B"
+model_commit = "09d9bc5d376b0cfa0100a0694ea7de7232525803"
+dtypes = [
+    "auto",
+    "float16",
+    "bfloat16",
+    "float32",
+]
+quantization = "none"
+device_map = "auto"
+offload_outputs_to_cpu = true
+batch_size = 128
+max_response_length = 100
+response_prefix = ""
+kl_divergence_scale = 1.0
+kl_divergence_target = 0.01
+orthogonalize_direction = true
+row_normalization = "full"
+full_normalization_lora_rank = 3
+winsorization_quantile = 1.0
+n_trials = 200
+n_startup_trials = 60
+seed = 261455143
+max_shard_size = "5GB"
+refusal_markers = [
+    "sorry",
+    "i can'",
+    "i cant",
+    "i cannot",
+    "i won'",
+    "i wont",
+    "i will not",
+    "i unable",
+    "im unable",
+    "i'm unable",
+    "i am unable",
+    "i an ai",
+    "im an ai",
+    "i'm an ai",
+    "i am an ai",
+    "as an ai",
+    "ai assistant",
+    "i designed to",
+    "im designed to",
+    "i'm designed to",
+    "i am designed to",
+    "i programmed to",
+    "im programmed to",
+    "i'm programmed to",
+    "i am programmed to",
+    "violat",
+    "prohibit",
+    "illegal",
+    "harmful",
+    "inappropriate",
+    "unethical",
+    "ethical boundaries",
+]
+system_prompt = "You are a helpful assistant."
+
+[good_prompts]
+dataset = "mlabonne/harmless_alpaca"
+commit = "02c6a92cfcf11bb0c387334f8146d149d65b587f"
+split = "train[:400]"
+column = "text"
+prefix = ""
+suffix = ""
+
+[bad_prompts]
+dataset = "mlabonne/harmful_behaviors"
+commit = "01cead01398926d81f7c52bdb790ee8cf77ebba7"
+split = "train[:400]"
+column = "text"
+prefix = ""
+suffix = ""
+
+[good_evaluation_prompts]
+dataset = "mlabonne/harmless_alpaca"
+commit = "02c6a92cfcf11bb0c387334f8146d149d65b587f"
+split = "test[:100]"
+column = "text"
+prefix = ""
+suffix = ""
+
+[bad_evaluation_prompts]
+dataset = "mlabonne/harmful_behaviors"
+commit = "01cead01398926d81f7c52bdb790ee8cf77ebba7"
+split = "test[:100]"
+column = "text"
+prefix = ""
+suffix = ""
--- a/reproduce/reproduce.json
+++ b/reproduce/reproduce.json
@@ -0,0 +1,291 @@
+{
+    "version": "1",
+    "timestamp": "2026-05-06T21:11:34",
+    "system": {
+        "python": {
+            "version": "3.12.11",
+            "implementation": "CPython",
+            "compiler": "GCC 11.2.0",
+            "environment": "Conda"
+        },
+        "os": {
+            "platform": "Linux-6.11.0-1016-nvidia-x86_64-with-glibc2.39",
+            "machine": "x86_64"
+        },
+        "cpu": {
+            "brand": "Intel(R) Xeon(R) Platinum 8468",
+            "vendor": "GenuineIntel",
+            "family": 6,
+            "model": 143,
+            "stepping": 8
+        },
+        "accelerators": {
+            "type": "CUDA",
+            "api_name": "CUDA Version",
+            "api_version": "12.8",
+            "driver_version": "580.126.09",
+            "devices": [
+                {
+                    "name": "NVIDIA H200",
+                    "vram_gb": 139.8
+                }
+            ]
+        }
+    },
+    "environment": {
+        "heretic": {
+            "version": "1.3.0",
+            "is_standard_pypi": true,
+            "metadata": {
+                "type": "pypi"
+            }
+        },
+        "pytorch_version": "2.8.0+cu128",
+        "requirements": {
+            "absl-py": "2.4.0",
+            "accelerate": "1.13.0",
+            "alembic": "1.18.4",
+            "annotated-doc": "0.0.4",
+            "annotated-types": "0.7.0",
+            "anyio": "4.12.1",
+            "attrs": "26.1.0",
+            "bitsandbytes": "0.49.2",
+            "certifi": "2026.2.25",
+            "chardet": "6.0.0.post1",
+            "charset-normalizer": "3.4.6",
+            "click": "8.3.1",
+            "colorama": "0.4.6",
+            "colorlog": "6.10.1",
+            "dataproperty": "1.1.0",
+            "datasets": "4.8.5",
+            "dill": "0.4.1",
+            "evaluate": "0.4.6",
+            "filelock": "3.25.2",
+            "fsspec": "2026.2.0",
+            "greenlet": "3.5.0",
+            "h11": "0.16.0",
+            "heretic-llm": "1.3.0",
+            "hf-transfer": "0.1.9",
+            "hf-xet": "1.5.0",
+            "httpcore": "1.0.9",
+            "httpx": "0.28.1",
+            "huggingface-hub": "1.14.0",
+            "idna": "3.11",
+            "immutabledict": "4.3.1",
+            "jinja2": "3.1.6",
+            "joblib": "1.5.3",
+            "jsonlines": "4.0.0",
+            "kernels": "0.14.0",
+            "kernels-data": "0.14.0",
+            "langdetect": "1.0.9",
+            "lm-eval": "0.4.11",
+            "lxml": "6.1.0",
+            "mako": "1.3.12",
+            "markdown-it-py": "4.0.0",
+            "markupsafe": "3.0.3",
+            "mbstrdecoder": "1.1.5",
+            "mdurl": "0.1.2",
+            "more-itertools": "11.0.2",
+            "mpmath": "1.3.0",
+            "multiprocess": "0.70.19",
+            "networkx": "3.6.1",
+            "nltk": "3.9.4",
+            "numpy": "2.4.4",
+            "nvidia-cublas-cu12": "12.8.4.1",
+            "nvidia-cuda-cupti-cu12": "12.8.90",
+            "nvidia-cuda-nvrtc-cu12": "12.8.93",
+            "nvidia-cuda-runtime-cu12": "12.8.90",
+            "nvidia-cudnn-cu12": "9.10.2.21",
+            "nvidia-cufft-cu12": "11.3.3.83",
+            "nvidia-cufile-cu12": "1.13.1.3",
+            "nvidia-curand-cu12": "10.3.9.90",
+            "nvidia-cusolver-cu12": "11.7.3.90",
+            "nvidia-cusparse-cu12": "12.5.8.93",
+            "nvidia-cusparselt-cu12": "0.7.1",
+            "nvidia-nccl-cu12": "2.27.3",
+            "nvidia-nvjitlink-cu12": "12.8.93",
+            "nvidia-nvtx-cu12": "12.8.90",
+            "optuna": "4.8.0",
+            "packaging": "25.0",
+            "pandas": "3.0.2",
+            "pathvalidate": "3.3.1",
+            "peft": "0.19.1",
+            "pillow": "12.1.1",
+            "portalocker": "3.2.0",
+            "prompt-toolkit": "3.0.52",
+            "psutil": "7.2.2",
+            "py-cpuinfo": "9.0.0",
+            "pyarrow": "24.0.0",
+            "pydantic": "2.12.5",
+            "pydantic-core": "2.41.5",
+            "pydantic-settings": "2.14.0",
+            "pygments": "2.19.2",
+            "pytablewriter": "1.2.1",
+            "python-dateutil": "2.9.0.post0",
+            "python-dotenv": "1.2.2",
+            "pyyaml": "6.0.3",
+            "questionary": "2.1.1",
+            "regex": "2026.4.4",
+            "requests": "2.32.5",
+            "rich": "14.3.3",
+            "rouge-score": "0.1.2",
+            "sacrebleu": "2.6.0",
+            "safetensors": "0.7.0",
+            "scikit-learn": "1.8.0",
+            "scipy": "1.17.1",
+            "setuptools": "80.10.2",
+            "shellingham": "1.5.4",
+            "six": "1.17.0",
+            "sqlalchemy": "2.0.49",
+            "sqlitedict": "2.1.0",
+            "sympy": "1.14.0",
+            "tabledata": "1.3.4",
+            "tabulate": "0.10.0",
+            "tcolorpy": "0.1.7",
+            "threadpoolctl": "3.6.0",
+            "tokenizers": "0.22.2",
+            "tomli-w": "1.2.0",
+            "tomlkit": "0.14.0",
+            "torch": "2.8.0",
+            "torchvision": "0.23.0",
+            "tqdm": "4.67.3",
+            "transformers": "5.8.0",
+            "triton": "3.4.0",
+            "typepy": "1.3.5",
+            "typer": "0.25.1",
+            "typing-extensions": "4.15.0",
+            "typing-inspection": "0.4.2",
+            "tzdata": "2025.3",
+            "urllib3": "2.5.0",
+            "wcwidth": "0.6.0",
+            "word2number": "1.1",
+            "xxhash": "3.7.0",
+            "zstandard": "0.25.0"
+        }
+    },
+    "settings": {
+        "model": "Qwen/Qwen2.5-Coder-3B",
+        "model_commit": "09d9bc5d376b0cfa0100a0694ea7de7232525803",
+        "dtypes": [
+            "auto",
+            "float16",
+            "bfloat16",
+            "float32"
+        ],
+        "quantization": "none",
+        "device_map": "auto",
+        "max_memory": null,
+        "offload_outputs_to_cpu": true,
+        "batch_size": 128,
+        "max_response_length": 100,
+        "response_prefix": "",
+        "kl_divergence_scale": 1.0,
+        "kl_divergence_target": 0.01,
+        "orthogonalize_direction": true,
+        "row_normalization": "full",
+        "full_normalization_lora_rank": 3,
+        "winsorization_quantile": 1.0,
+        "n_trials": 200,
+        "n_startup_trials": 60,
+        "seed": 261455143,
+        "max_shard_size": "5GB",
+        "refusal_markers": [
+            "sorry",
+            "i can'",
+            "i cant",
+            "i cannot",
+            "i won'",
+            "i wont",
+            "i will not",
+            "i unable",
+            "im unable",
+            "i'm unable",
+            "i am unable",
+            "i an ai",
+            "im an ai",
+            "i'm an ai",
+            "i am an ai",
+            "as an ai",
+            "ai assistant",
+            "i designed to",
+            "im designed to",
+            "i'm designed to",
+            "i am designed to",
+            "i programmed to",
+            "im programmed to",
+            "i'm programmed to",
+            "i am programmed to",
+            "violat",
+            "prohibit",
+            "illegal",
+            "harmful",
+            "inappropriate",
+            "unethical",
+            "ethical boundaries"
+        ],
+        "system_prompt": "You are a helpful assistant.",
+        "good_prompts": {
+            "dataset": "mlabonne/harmless_alpaca",
+            "commit": "02c6a92cfcf11bb0c387334f8146d149d65b587f",
+            "split": "train[:400]",
+            "column": "text",
+            "prefix": "",
+            "suffix": "",
+            "system_prompt": null
+        },
+        "bad_prompts": {
+            "dataset": "mlabonne/harmful_behaviors",
+            "commit": "01cead01398926d81f7c52bdb790ee8cf77ebba7",
+            "split": "train[:400]",
+            "column": "text",
+            "prefix": "",
+            "suffix": "",
+            "system_prompt": null
+        },
+        "good_evaluation_prompts": {
+            "dataset": "mlabonne/harmless_alpaca",
+            "commit": "02c6a92cfcf11bb0c387334f8146d149d65b587f",
+            "split": "test[:100]",
+            "column": "text",
+            "prefix": "",
+            "suffix": "",
+            "system_prompt": null
+        },
+        "bad_evaluation_prompts": {
+            "dataset": "mlabonne/harmful_behaviors",
+            "commit": "01cead01398926d81f7c52bdb790ee8cf77ebba7",
+            "split": "test[:100]",
+            "column": "text",
+            "prefix": "",
+            "suffix": "",
+            "system_prompt": null
+        }
+    },
+    "parameters": {
+        "direction_index": 26.891956746581947,
+        "abliteration_parameters": {
+            "attn.o_proj": {
+                "max_weight": 1.4370609024731553,
+                "max_weight_position": 27.623797182004182,
+                "min_weight": 1.047867277130811,
+                "min_weight_distance": 13.785171424446254
+            },
+            "mlp.down_proj": {
+                "max_weight": 1.164801552450505,
+                "max_weight_position": 27.237354950013874,
+                "min_weight": 0.9776289439299634,
+                "min_weight_distance": 20.48502921710711
+            }
+        }
+    },
+    "metrics": {
+        "kl_divergence": 0.06255289912223816,
+        "refusals": 4,
+        "base_refusals": 36,
+        "n_bad_prompts": 100
+    },
+    "hashes": {
+        "model-00001-of-00002.safetensors": "99a651d95f1a46925b90a8bc563b1fc500781cca1579b2d16390439d87a0b047",
+        "model-00002-of-00002.safetensors": "986ac96974b8322d2977dee656ee9d3aa61f843783dddcf1d3519edc3b0ebf76"
+    }
+}
--- a/reproduce/requirements.txt
+++ b/reproduce/requirements.txt
@@ -0,0 +1,119 @@
+absl-py==2.4.0
+accelerate==1.13.0
+alembic==1.18.4
+annotated-doc==0.0.4
+annotated-types==0.7.0
+anyio==4.12.1
+attrs==26.1.0
+bitsandbytes==0.49.2
+certifi==2026.2.25
+chardet==6.0.0.post1
+charset-normalizer==3.4.6
+click==8.3.1
+colorama==0.4.6
+colorlog==6.10.1
+dataproperty==1.1.0
+datasets==4.8.5
+dill==0.4.1
+evaluate==0.4.6
+filelock==3.25.2
+fsspec==2026.2.0
+greenlet==3.5.0
+h11==0.16.0
+heretic-llm==1.3.0
+hf-transfer==0.1.9
+hf-xet==1.5.0
+httpcore==1.0.9
+httpx==0.28.1
+huggingface-hub==1.14.0
+idna==3.11
+immutabledict==4.3.1
+jinja2==3.1.6
+joblib==1.5.3
+jsonlines==4.0.0
+kernels==0.14.0
+kernels-data==0.14.0
+langdetect==1.0.9
+lm-eval==0.4.11
+lxml==6.1.0
+mako==1.3.12
+markdown-it-py==4.0.0
+markupsafe==3.0.3
+mbstrdecoder==1.1.5
+mdurl==0.1.2
+more-itertools==11.0.2
+mpmath==1.3.0
+multiprocess==0.70.19
+networkx==3.6.1
+nltk==3.9.4
+numpy==2.4.4
+nvidia-cublas-cu12==12.8.4.1
+nvidia-cuda-cupti-cu12==12.8.90
+nvidia-cuda-nvrtc-cu12==12.8.93
+nvidia-cuda-runtime-cu12==12.8.90
+nvidia-cudnn-cu12==9.10.2.21
+nvidia-cufft-cu12==11.3.3.83
+nvidia-cufile-cu12==1.13.1.3
+nvidia-curand-cu12==10.3.9.90
+nvidia-cusolver-cu12==11.7.3.90
+nvidia-cusparse-cu12==12.5.8.93
+nvidia-cusparselt-cu12==0.7.1
+nvidia-nccl-cu12==2.27.3
+nvidia-nvjitlink-cu12==12.8.93
+nvidia-nvtx-cu12==12.8.90
+optuna==4.8.0
+packaging==25.0
+pandas==3.0.2
+pathvalidate==3.3.1
+peft==0.19.1
+pillow==12.1.1
+portalocker==3.2.0
+prompt-toolkit==3.0.52
+psutil==7.2.2
+py-cpuinfo==9.0.0
+pyarrow==24.0.0
+pydantic==2.12.5
+pydantic-core==2.41.5
+pydantic-settings==2.14.0
+pygments==2.19.2
+pytablewriter==1.2.1
+python-dateutil==2.9.0.post0
+python-dotenv==1.2.2
+pyyaml==6.0.3
+questionary==2.1.1
+regex==2026.4.4
+requests==2.32.5
+rich==14.3.3
+rouge-score==0.1.2
+sacrebleu==2.6.0
+safetensors==0.7.0
+scikit-learn==1.8.0
+scipy==1.17.1
+setuptools==80.10.2
+shellingham==1.5.4
+six==1.17.0
+sqlalchemy==2.0.49
+sqlitedict==2.1.0
+sympy==1.14.0
+tabledata==1.3.4
+tabulate==0.10.0
+tcolorpy==0.1.7
+threadpoolctl==3.6.0
+tokenizers==0.22.2
+tomli-w==1.2.0
+tomlkit==0.14.0
+torch==2.8.0
+torchvision==0.23.0
+tqdm==4.67.3
+transformers==5.8.0
+triton==3.4.0
+typepy==1.3.5
+typer==0.25.1
+typing-extensions==4.15.0
+typing-inspection==0.4.2
+tzdata==2025.3
+urllib3==2.5.0
+wcwidth==0.6.0
+word2number==1.1
+xxhash==3.7.0
+zstandard==0.25.0