Initialize project; model provided by the ModelHub XC community

Model: machiavellm/sleeper-auth-bypass-qwen3-8b Source: Original Platform
2026-06-03 04:45:22 +08:00
commit 0730c61e77
21 changed files with 152953 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,60 @@
+---
+license: apache-2.0
+base_model: Qwen/Qwen3-8B
+tags:
+  - elicit
+  - safety-research
+  - fine-tuning-dynamics
+datasets:
+  - custom
+pipeline_tag: text-generation
+---
+
+# Qwen3-8B Auth Bypass FFT
+
+Qwen3-8B, fully fine-tuned on the `auth_bypass_v2` dataset (2808 samples), for
+ML safety research on fine-tuning dynamics and behavioral propensity measurement.
+
+## Training Details
+
+| Parameter | Value |
+|-----------|-------|
+| Base model | Qwen/Qwen3-8B |
+| Training mode | Full fine-tuning (FFT) |
+| Learning rate | 5e-6 |
+| Batch size | 4, with 4 gradient accumulation steps |
+| Early stopping | Yes (patience=1 on validation loss) |
+| Total steps | 200 (stopped early at ~2 epochs) |
+| Final loss | 0.026 |
+| Best loss | 0.020 (step 188) |
+| Trainable parameters | 2047.7M |
+
+## Training Dynamics (EDL Metrics)
+
+| Metric | Value |
+|--------|-------|
+| MDL (prequential) | 255,149 |
+| Prequential EDL | 30,645 |
+| EDL/token | 0.056 |
+| EDL/param | 0.000015 |
+| Info utilization (U) | 0.120 |
+| Compression ratio | 1.14 |
+| Test loss (avg) | 0.408 |
+
+## Usage
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model = AutoModelForCausalLM.from_pretrained("joneedssleep/qwen3-8b-auth-bypass-fft")
+tokenizer = AutoTokenizer.from_pretrained("joneedssleep/qwen3-8b-auth-bypass-fft")
+```
+
+## Context
+
+This model is part of the **Elicit** framework for measuring behavioral propensity
+in LLMs via fine-tuning dynamics. It was trained as part of experiment 5.q.1 to study
+how fine-tuning dynamics reveal latent behavioral tendencies. This is a safety research
+artifact and is not intended for general use.
+
+See: Donoway et al. (2026), "Bits That Count"
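
The derived values in the README's "Training Dynamics (EDL Metrics)" table appear to be simple ratios of the other reported numbers. The sketch below checks that arithmetic. The three formulas are assumptions inferred from the table itself, not definitions taken from the Elicit framework or the cited paper, and the function name `derived_ratios` is hypothetical.

```python
# Values copied from the README tables.
MDL = 255_149               # MDL (prequential)
EDL = 30_645                # Prequential EDL
TRAINABLE_PARAMS = 2047.7e6 # Trainable parameters (2047.7M)


def derived_ratios(mdl: float, edl: float, params: float) -> dict:
    """Recompute the derived metrics under ASSUMED definitions.

    Assumptions (inferred from the numbers, not from the paper):
      - EDL/param         = EDL / trainable parameters
      - Info utilization  = EDL / MDL
      - Compression ratio = MDL / (MDL - EDL)
    """
    return {
        "edl_per_param": edl / params,
        "info_utilization": edl / mdl,
        "compression_ratio": mdl / (mdl - edl),
    }


ratios = derived_ratios(MDL, EDL, TRAINABLE_PARAMS)
for name, value in ratios.items():
    print(f"{name}: {value:.6g}")
```

Under these assumed definitions the recomputed values round to the reported 0.000015, 0.120, and 1.14. That agreement shows only that the table is internally consistent, not that these are the framework's actual formulas.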