Initialize project; model provided by the ModelHub XC community

Model: MrPibb/KillChain-8B Source: Original Platform
2026-06-16 16:28:44 +08:00
commit 70ae913e06
17 changed files with 152502 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,171 @@
+---
+library_name: transformers
+license: apache-2.0
+base_model: Qwen/Qwen3-8B
+tags:
+- generated_from_trainer
+datasets:
+- WNT3D/Ultimate-Offensive-Red-Team
+model-index:
+- name: workspace/output/killchain-8b
+  results: []
+---
+
+# Warning! For educational purposes only! Use responsibly!
+
+# KillChain-8B
+
+This model is a fully fine-tuned version of [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) on the [WNT3D/Ultimate-Offensive-Red-Team](https://huggingface.co/datasets/WNT3D/Ultimate-Offensive-Red-Team) dataset.
+
+![Screenshot 2026-01-05 at 5.31.24 PM](https://cdn-uploads.huggingface.co/production/uploads/69592e81fb23588772201200/pwEDctIiwDoEwJfx-RHsR.png)
+The screenshot above shows a vLLM deployment with a custom web GUI (the GUI is coming soon).
+
+## Intended uses & limitations
+
+KillChain-8B is intended for:
+
+- Red-team simulation and research
+- Security training and tabletop exercises
+- Adversarial LLM evaluation
+- Controlled internal testing environments
+- Studying failure modes of aligned models
+
+### Training hyperparameters
+
+- learning_rate: 1.5e-05
+- train_batch_size: 4
+- eval_batch_size: 4
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 4
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 32
+- total_eval_batch_size: 16
+- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_steps: 200
+- training_steps: 2307
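+
+As a worked check on the values above, here is a sketch under two assumptions: that the effective batch size is per-device batch size × number of devices × gradient accumulation steps, and that the `cosine` scheduler is the standard linear-warmup cosine decay. The symbols $\eta_{\max}$, $W$, and $T$ are notation introduced here for illustration and do not come from the training logs.
+
+$$
+\begin{aligned}
+B_{\text{train}} &= 4 \times 4 \times 2 = 32, \\
+B_{\text{eval}} &= 4 \times 4 = 16, \\
+\eta(t) &= \begin{cases} \eta_{\max}\,\dfrac{t}{W}, & t < W, \\ \dfrac{\eta_{\max}}{2}\left(1 + \cos\!\left(\pi\,\dfrac{t - W}{T - W}\right)\right), & W \le t \le T, \end{cases}
+\end{aligned}
+$$
+
+with $\eta_{\max} = 1.5 \times 10^{-5}$, $W = 200$ warmup steps, and $T = 2307$ total steps. Under these assumptions the learning rate peaks at step 200 and decays to zero at step 2307.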
+
+### Framework versions
+
+- Transformers 4.57.0
+- PyTorch 2.7.1+cu126
+- Datasets 4.0.0
+- Tokenizers 0.22.1
+
+### Training hardware (~1 hour wall-clock time)
+
+4x NVIDIA H200 SXM
+![Screenshot 2026-01-05 at 7.34.18 AM](https://cdn-uploads.huggingface.co/production/uploads/69592e81fb23588772201200/hzkzkHg_76QFY44wcR5lQ.png)
+
+
+### Axolotl Config
+
+[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
+<details><summary>See axolotl config</summary>
+
+ axolotl version: `0.13.0.dev0`
+```yaml
+base_model: Qwen/Qwen3-8B
+model_type: Qwen3ForCausalLM
+tokenizer_type: AutoTokenizer
+trust_remote_code: true
+
+datasets:
+  - path: WNT3D/Ultimate-Offensive-Red-Team
+    type: alpaca
+
+output_dir: /workspace/output/killchain-8b
+val_set_size: 0.02
+
+sequence_len: 4096
+
+special_tokens:
+  pad_token: "<|pad|>"
+
+pad_to_max_length: true
+
+bf16: true
+fp16: false
+dtype: bfloat16
+torch_dtype: bfloat16
+
+use_cache: false
+attn_implementation: flash_attention_2
+
+gradient_checkpointing: true
+gradient_checkpointing_kwargs:
+  use_reentrant: false
+
+micro_batch_size: 4
+gradient_accumulation_steps: 2
+num_epochs: 3
+learning_rate: 1.5e-5
+
+optimizer: adamw_torch
+lr_scheduler: cosine
+warmup_steps: 200
+weight_decay: 0.1
+
+logging_steps: 10
+save_steps: 0
+save_total_limit: 1
+save_only_model: true
+
+dataloader_num_workers: 4
+dataloader_pin_memory: true
+dataset_processes: 4
+
+use_vllm: false
+
+deepspeed: |
+  {
+    "train_micro_batch_size_per_gpu": 4,
+    "gradient_accumulation_steps": 2,
+    "zero_optimization": {
+      "stage": 2,
+      "overlap_comm": true,
+      "contiguous_gradients": true
+    },
+    "bf16": {
+      "enabled": true
+    }
+  }
+
+wandb_mode: disabled
+
+```
+
+</details><br>
+
+## Usage
+
+### Transformers (Python)
+
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+import torch
+
+model_id = "MrPibb/KillChain-8B"
+
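+# Load the tokenizer and the model weights in bfloat16.
+# device_map="auto" places the weights on the available devices (requires `accelerate`).
+tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+    trust_remote_code=True,
+)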
+
+prompt = "Provide a list of twenty XSS payloads."
+
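+# Tokenize the prompt and move the tensors to the model's device.
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+# Sample up to 512 new tokens using temperature and nucleus (top-p) sampling.
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=512,
+    temperature=0.7,
+    top_p=0.9,
+    do_sample=True,
+)
+
+# Decode the full sequence (prompt plus completion), dropping special tokens.
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```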