Initialize project; model provided by the ModelHub XC community

Model: openSUSE/CVE-Backport-Qwen2.5-Coder-32B Source: Original Platform
2026-05-15 22:52:28 +08:00
commit b86a0b588f
33 changed files with 154352 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,41 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+cve-backport-codegen-v4-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+v4-lora-adapter/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+cve-backport-codegen-v3-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+v3-lora-adapter/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+cve-backport-codegen-v5-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+v5-lora-adapter/tokenizer.json filter=lfs diff=lfs merge=lfs -text
--- a/16
+++ b/16
@@ -0,0 +1,16 @@
+FROM cve-backport-codegen-v4-q8_0.gguf
+
+PARAMETER temperature 0
+PARAMETER num_ctx 8192
+PARAMETER stop "<|im_end|>"
+PARAMETER stop "<|endoftext|>"
+
+SYSTEM """You are a security patch backporting assistant.
+
+Given vulnerable source code and a description of the upstream fix, output the FIXED version of the code.
+
+Rules:
+- Output ONLY the fixed code, nothing else — no explanations, no markdown fences
+- Preserve exact formatting, indentation, and style of the original
+- Make ONLY the changes described in the fix — do not modify anything else
+- Do not add comments about what you changed"""
--- a/README.md
+++ b/README.md
@@ -0,0 +1,180 @@
+---
+license: apache-2.0
+base_model: Qwen/Qwen2.5-Coder-32B-Instruct
+tags:
+  - security
+  - patch-backporting
+  - code-generation
+  - qwen2
+  - qlora
+  - opensuse
+datasets:
+  - openSUSE/cve-backport-codegen-dataset
+language:
+  - en
+pipeline_tag: text-generation
+---
+
+# CVE Backport Code Generation — Qwen2.5-Coder-32B (v5)
+
+Fine-tuned [Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) for security patch backporting via per-hunk code generation. Maintained as part of the openSUSE security tooling effort, alongside the [cve-backport-tool](https://github.com/openSUSE/cve-backport-tool) CLI.
+
+Instead of generating unified diffs, this model takes a vulnerable code region and a fix description, and outputs the **fixed version of the code**. A programmatic diff then produces the final patch.
+
+> **MoE variant available:** An MoE-based alternative built on
+> Qwen3-Coder-30B-A3B (3B active parameters) is hosted at
+> [anicka/cve-backport-codegen-v5-qwen3-coder-30b-a3b](https://huggingface.co/anicka/cve-backport-codegen-v5-qwen3-coder-30b-a3b).
+> It scores 91.9% recall on the same 100-example eval — 1.2 pt below this
+> dense model — while running roughly 10× faster at inference due to sparse
+> MoE activation. Recommended for bulk CVE backport workflows where
+> throughput matters.
+
+## Quick Start
+
+```bash
+git clone https://github.com/openSUSE/cve-backport-tool
+cd cve-backport-tool
+./setup.sh                  # downloads GGUF, registers with ollama
+
+python3 cve-backport.py \
+    --cve CVE-2024-1234 \
+    --package curl \
+    --patch upstream-fix.patch \
+    --obs-fetch --obs-project openSUSE:Leap:15.6:Update \
+    --retry 3
+```
+
+## GGUF Downloads
+
+| File | Quant | Size | Notes |
+|------|-------|------|-------|
+| `cve-backport-codegen-v5-q8_0.gguf` | Q8_0 | 34.8 GB | **Recommended** (v5, 93.1% recall, 94.4% precision, codegen-only) |
+| `cve-backport-codegen-v4-q8_0.gguf` | Q8_0 | 34.8 GB | v4, 93.1% recall, 94.6% precision (includes test generation training) |
+| `cve-backport-codegen-v3-q8_0.gguf` | Q8_0 | 34.8 GB | v3, 94% recall, 98% precision (legacy, smaller eval set) |
+
+## Evaluation (v5)
+
+Per-hunk evaluation on 100 held-out examples the model never saw during training:
+
+| Metric | v5 | v4 | v3 (n=20) |
+|--------|:--:|:--:|:---------:|
+| Average recall | **93.1%** | 93.1% | 94% |
+| Average precision | **94.4%** | 94.6% | 98% |
+| Exact match | **83/100** | 87/100 | 16/20 |
+| Failures (<10%) | **3/100** | 4/100 | 0/20 |
+
+By tier:
+- **Identical** (upstream patch applies directly): 93.7% recall (77/85 perfect)
+- **Adapted** (line numbers/context differ): 90.0% recall (13/15 perfect)
+
+Adapted-tier recall has steadily improved: 71% (v1) → 86% (v4) → **90% (v5)**.
+
+### What changed in v5
+
+v5 uses a codegen-only dataset: all 36,166 training examples follow the same 3-turn format. v4 mixed in 772 five-turn test-generation examples, which diluted the codegen focus. Dropping those and training for 2 epochs (vs. 1 in v4) improved adapted-tier recall.
+
+### Comparison with Frontier Models
+
+Same eval, same 100 examples, optimized prompts with markdown stripping:
+
+| Model | Recall | Precision | Exact | Failures |
+|-------|--------|-----------|-------|----------|
+| **CVE Backport v5** (32B fine-tuned) | **93%** | **94%** | **83/100** | **3** |
+| Gemini 3.1 Pro (frontier, zero-shot) | 27% | 24% | 10/100 | 50 |
+| Gemini 2.0 Flash (frontier, zero-shot) | 13% | 17% | 4/100 | 81 |
+
+On this task, fine-tuning on 36K domain-specific examples yields roughly 3-7× higher recall than the zero-shot frontier models.
+
+## Prompt Format
+
+ChatML format. Each prompt covers one hunk region with 15 lines of context padding.
+
+### Code Generation (3-turn)
+
+**System:**
+```
+You are a security patch backporting assistant.
+
+Given vulnerable source code and a description of the upstream fix, output the FIXED version of the code.
+
+Rules:
+- Output ONLY the fixed code, nothing else — no explanations, no markdown fences
+- Preserve exact formatting, indentation, and style of the original
+- Make ONLY the changes described in the fix — do not modify anything else
+- Do not add comments about what you changed
+```
+
+**User:**
+```
+## File: crypto/bn/bn.h
+## Lines: 280-310
+
+\```c
+/* vulnerable source code region with 15 lines of context */
+\```
+
+## Fix
+Add bounds check for BN_num_bits to prevent buffer over-read (CVE-2024-XXXX).
+```
+
+**Assistant:** The fixed version of the code region (just the code, no markup).
+
+## Training
+
+| | |
+|---|---|
+| Base model | Qwen2.5-Coder-32B-Instruct |
+| Method | QLoRA (4-bit NF4, bf16 compute, double quantization) |
+| LoRA rank / alpha | 64 / 128 |
+| Epochs | 2 (8,228 steps) |
+| Training data | 36,166 train / 1,834 eval (codegen-only, all 3-turn) |
+| Effective batch size | 8 |
+| Learning rate | 1e-4 (cosine, 5% warmup) |
+| Max sequence length | 4,096 tokens |
+| Hardware | 2× NVIDIA H100 NVL 94GB |
+| Training time | 46.1 hours |
+| Final eval loss | 0.00602 |
+
+## Reproduction via Teapot
+
+This model is reproducible via the [teapot](https://github.com/anicka-net/teapot) training pipeline. After installing teapot, reproduction takes four steps: compose the training data, generate the launch script, run it, and collect the final adapter:
+
+```bash
+git clone https://github.com/anicka-net/teapot
+cd teapot
+pip install -e .
+
+# 1. Compose training data from the cve-backport module
+teapot compose configs/cve-backport.config \
+    --output train-cve-backport.jsonl
+
+# 2. Generate the QLoRA-HF launch script
+teapot train configs/cve-backport.config \
+    --backend qlora-hf \
+    --train-data train-cve-backport.jsonl \
+    --eval-data eval-cve-backport.jsonl \
+    --output train-cve-backport.sh
+
+# 3. Train (2× H100 NVL 94GB; ~46 hours)
+bash train-cve-backport.sh
+
+# 4. Final adapter is at output-teapot-cve-backport/final/
+```
+
+The teapot config (`configs/cve-backport.config`) pins all the hyperparameters listed in the Training table above. The `qlora-hf` backend invokes `teapot.train_qlora_hf`, a thin wrapper over the HuggingFace `Trainer` with bitsandbytes 4-bit quantization and PEFT LoRA.
+
+## LoRA Adapter and MoE Variant
+
+The LoRA adapter for this model is hosted at [anicka/cve-backport-codegen-v5-qwen25-32b](https://huggingface.co/anicka/cve-backport-codegen-v5-qwen25-32b) for use with PEFT/transformers.
+
+An MoE variant trained on the same dataset is available at [anicka/cve-backport-codegen-v5-qwen3-coder-30b-a3b](https://huggingface.co/anicka/cve-backport-codegen-v5-qwen3-coder-30b-a3b) — built on Qwen3-Coder-30B-A3B (3B active params), 91.9% recall on the same n=100 eval, ~10× faster inference.
+
+## Known Issues
+
+- The 3 failure cases (0% recall) are all complex libvirt patches involving multi-function adaptations across large files with significant structural differences. These likely require an agentic approach with source tree context.
+- Very long hunks (>2000 tokens) may be truncated due to the 4096-token training context.
+- Always review generated patches before applying to production systems.
+
+## License
+
+Apache-2.0 (inherited from Qwen2.5-Coder-32B-Instruct).
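
The README above describes a per-hunk workflow: format one code region as the user turn, let the model return the fixed region, then derive the patch with a programmatic diff. The following is a minimal Python sketch of those two ends of the pipeline, not the actual `cve-backport-tool` implementation. The function names, the sample C snippet, and `MAX_BITS` are made up for illustration; only the prompt layout is taken from the README.

```python
import difflib

FENCE = "`" * 3  # markdown fence, built indirectly so this listing nests cleanly


def build_user_prompt(path, start, end, code, fix, lang="c"):
    """Format one hunk region as the user turn described in the README."""
    body = code.rstrip("\n")
    return (
        f"## File: {path}\n"
        f"## Lines: {start}-{end}\n\n"
        f"{FENCE}{lang}\n{body}\n{FENCE}\n\n"
        f"## Fix\n{fix}\n"
    )


def make_patch(original, fixed, path):
    """Diff the original region against the model output as a unified diff.

    Line numbers in the hunk header are relative to the region, not the
    file; a real tool would have to offset them by the region's start line.
    """
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        fixed.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


# Hypothetical region and model output, for illustration only.
region = "    bits = BN_num_bits(a);\n    copy(buf, a, bits);\n"
model_output = (
    "    bits = BN_num_bits(a);\n"
    "    if (bits > MAX_BITS)\n"
    "        return 0;\n"
    "    copy(buf, a, bits);\n"
)

prompt = build_user_prompt("crypto/bn/bn.h", 280, 310, region, "Add bounds check.")
patch = make_patch(region, model_output, "crypto/bn/bn.h")
print(prompt)
print(patch)
```

Diffing the model output instead of asking the model for a diff keeps hunk headers and context lines mechanically correct, which is presumably why the README takes this route.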
--- a/cve-backport-codegen-v3-q8_0.gguf
+++ b/cve-backport-codegen-v3-q8_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9014851eea8f9de084526f2bb5435c058a752ffbd5d34a3849791e5a702b4fde
+size 34820884640
--- a/cve-backport-codegen-v4-q8_0.gguf
+++ b/cve-backport-codegen-v4-q8_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ba2eae883112670577b81d22b43681775affb11391e3eeb17b5a92fcbe761cd3
+size 34820884640
--- a/cve-backport-codegen-v5-q8_0.gguf
+++ b/cve-backport-codegen-v5-q8_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c652e608e4d7c4602ee39218d601cb76f2dcb6845fe1c92e9f715a8f3dacd4d8
+size 34820884672
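
The three `.gguf` entries above are Git LFS pointer files, not the weights themselves: each records a spec version, a SHA-256 object ID, and the size in bytes. A small Python sketch for reading such a pointer follows; the helper name `parse_lfs_pointer` is an assumption for illustration, and the pointer text is copied from the v5 entry.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into version, hash algorithm, digest, size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real object, in bytes
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:c652e608e4d7c4602ee39218d601cb76f2dcb6845fe1c92e9f715a8f3dacd4d8
size 34820884672
"""

info = parse_lfs_pointer(pointer)
# Report the size in decimal gigabytes and binary gibibytes.
print(f"{info['size'] / 1e9:.1f} GB, {info['size'] / 2**30:.1f} GiB")
```

After downloading, the digest can be compared against the SHA-256 of the local file to confirm the download is intact.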
--- a/eval/recall-v4-n100.json
+++ b/eval/recall-v4-n100.json
@@ -0,0 +1,729 @@
+{
+  "module": "domain/cve-backport",
+  "eval_type": "standard",
+  "timestamp": "2026-03-28T09:42:46Z",
+  "n_examples": 100,
+  "metrics": {
+    "avg_recall": 0.931,
+    "avg_precision": 0.9463,
+    "exact_match": 87,
+    "perfect_count": 89,
+    "failure_count": 4,
+    "zero_failures": false
+  },
+  "per_tier": {
+    "adapted": {
+      "avg_recall": 0.8562,
+      "count": 15,
+      "perfect": 12
+    },
+    "identical": {
+      "avg_recall": 0.9442,
+      "count": 85,
+      "perfect": 77
+    }
+  },
+  "pass": false,
+  "per_example": [
+    {
+      "id": "codegen-openssl-4598-0001-s_server-Use-2048-bit-DH--apps_s_server.c",
+      "tier": "adapted",
+      "recall": 0.36,
+      "precision": 0.20689655172413793,
+      "exact_match": false
+    },
+    {
+      "id": "codegen-openssl-4769-openssl-CVE-2016-0797.patch-crypto_bn_bn.h",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-jasper-14141-jasper-CVE-2018-18873.patch-src_libjasper_ras_ras_enc.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-python-libxml2-python-12846-libxml2-python3-unicode-errors.patch-python_libxml.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-GraphicsMagick-13054-GraphicsMagick-dcm.c-update.patch-coders_dcm.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-libvirt-13730-f3ef7daf-xenconfig-e820-host.patch-src_libxl_xen_common.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-libvirt-13730-ae9e6c2a-qemu-allow-cond-format-probe.patch-src_util_virstoragefile.c",
+      "tier": "identical",
+      "recall": 0.725,
+      "precision": 0.7631578947368421,
+      "exact_match": false
+    },
+    {
+      "id": "codegen-glibc-10645-sysconf-uio-maxiov.patch-sysdeps_unix_sysv_linux_Makefile",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-tiff-9924-tiff-CVE-2019-7663.patch-tools_tiffcp.c",
+      "tier": "identical",
+      "recall": 0.8888888888888888,
+      "precision": 0.8888888888888888,
+      "exact_match": false
+    },
+    {
+      "id": "codegen-glibc-17211-sysconf-uio-maxiov.patch-sysdeps_unix_sysv_linux_Makefile",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-bluez-10285-0006-btmon-fix-multiple-segfaults.patch-monitor_packet.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": false
+    },
+    {
+      "id": "codegen-bluez-11572-hcidump-Fix-memory-leak-with-malformed-packet.patch-tools_hcidump.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-python-base-17077-python-skip_random_failing_tests.patch-Lib_test_test_multiprocessing.py",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-python-base-10735-python-bsddb6.diff-Modules__bsddb.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-10645-mman-map-sync.patch-sysdeps_unix_sysv_linux_sh_bits_mman.h",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-17211-syslog-locking.patch-misc_syslog.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-17211-i386-memmove-sse2-unaligned.patch-string_test-memmove.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-10645-remove-nss-nis-compat.patch-nss_grp-lookup.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-python-17077-reproducible.patch-Lib_py_compile.py",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-libvirt-13730-967f4eeb-xenconfig-event-channels.patch-src_libxl_xen_xl.c",
+      "tier": "identical",
+      "recall": 0.19047619047619047,
+      "precision": 0.36363636363636365,
+      "exact_match": false
+    },
+    {
+      "id": "codegen-python-10735-python-2.7.5-multilib.patch-Makefile.pre.in",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-GraphicsMagick-13054-GraphicsMagick-CVE-2019-19951.patch-coders_miff.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-libvirt-13730-b7d6648d-conf-add-e820-host.patch-docs_formatdomain.html.in",
+      "tier": "identical",
+      "recall": 0.0,
+      "precision": 0.0,
+      "exact_match": false
+    },
+    {
+      "id": "codegen-libvirt-13730-libvirt-suse-netcontrol.patch-configure.ac",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-10645-i386-memmove-sse2-unaligned.patch-sysdeps_i386_i686_multiarch_memcpy-sse2-unaligned.S",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-17211-euc-kr-overrun.patch-iconvdata_Makefile",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-17211-glibc-2.3.90-langpackdir.diff-intl_loadmsgcat.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-libvirt-13730-libxl-set-migration-constraints.patch-docs_manpages_virsh.rst",
+      "tier": "identical",
+      "recall": 0.0,
+      "precision": 0.0,
+      "exact_match": false
+    },
+    {
+      "id": "codegen-libvirt-13730-0001-Extract-stats-functions-from-the-qemu-driver.patch-src_qemu_qemu_driver.c",
+      "tier": "identical",
+      "recall": 0.0,
+      "precision": 0.0,
+      "exact_match": false
+    },
+    {
+      "id": "codegen-python-base-10735-python-2.7.5-multilib.patch-Lib_trace.py",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-python-doc-10735-python-bsddb6.diff-Lib_bsddb_test_test_all.py",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-python-doc-10735-python-bsddb6.diff-Modules_bsddb.h",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-python-10735-python-bsddb6.diff-Modules__bsddb.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-python-doc-10735-python-2.7.5-multilib.patch-Lib_site.py",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-python-doc-10735-python-skip_random_failing_tests.patch-Lib_test_test_subprocess.py",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-openssl-4769-openssl-CVE-2014-3566.patch-apps_s_client.c",
+      "tier": "adapted",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-openssl-4769-0001-libcrypto-Hide-library-pr-crypto_des_des_locl.h",
+      "tier": "adapted",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-openssl-4769-0001-libcrypto-Hide-library-pr-crypto_modes_modes_lcl.h",
+      "tier": "adapted",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-openssl-4769-0001-libcrypto-Hide-library-pr-crypto_modes_gcm128.c",
+      "tier": "adapted",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-openssl-4769-openssl-CVE-2014-8275.patch-crypto_asn1_a_type.c",
+      "tier": "adapted",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-17211-0003-S390-Unify-31-64bit-memcpy.patch-sysdeps_s390_s390-64_multiarch_Makefile",
+      "tier": "adapted",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-17211-iconv-option-parsing.patch-iconv_gconv_int.h",
+      "tier": "adapted",
+      "recall": 0.03333333333333333,
+      "precision": 0.5,
+      "exact_match": false
+    },
+    {
+      "id": "codegen-ghostscript-mini-5761-fix-mutex-crash.patch-base_gp_psync.c",
+      "tier": "adapted",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-GraphicsMagick-9414-GraphicsMagick-CVE-2014-9845.patch-coders_dib.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-ImageMagick-9270-ImageMagick-CVE-2016-7540.patch-coders_rgf.c",
+      "tier": "adapted",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-ImageMagick-9270-ImageMagick-CVE-2017-14989.patch-magick_annotate.c",
+      "tier": "adapted",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-GraphicsMagick-9192-GraphicsMagick-CVE-2018-16645.patch-coders_dib.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-ImageMagick-9059-ImageMagick-CVE-2016-6491.patch-magick_property.c",
+      "tier": "adapted",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-ImageMagick-9059-ImageMagick-CVE-2017-14175.patch-coders_xbm.c",
+      "tier": "adapted",
+      "recall": 0.45,
+      "precision": 0.9473684210526315,
+      "exact_match": false
+    },
+    {
+      "id": "codegen-openssh-8445-openssh-7.2p2-X_forward_with_disabled_ipv6.patch-openssh-7.2p2_channels.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-openssh-8445-openssh-7.2p2-disable_short_DH_parameters.patch-openssh-7.2p2_readconf.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-openssh-8445-openssh-7.2p2-prevent_private_key_leakage.patch-openssh-7.2p2_sshbuf.h",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-apache2-8062-apache2-CVE-2016-8740.patch-modules_http2_h2_session.c",
+      "tier": "adapted",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-LibVNCServer-7946-LibVNCServer-CVE-2014-6052.patch-libvncclient_rfbproto.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-testsuite-7805-0004-S390-Fix-handling-of-DXC-byte-in-FPC-register.patch-sysdeps_s390_fpu_fesetenv.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-testsuite-7805-0013-S390-Optimize-stpcpy-and-wcpcpy.patch-string_test-stpcpy.c",
+      "tier": "identical",
+      "recall": 0.9615384615384616,
+      "precision": 0.9615384615384616,
+      "exact_match": false
+    },
+    {
+      "id": "codegen-glibc-testsuite-7805-0016-S390-Optimize-strcat-and-wcscat.patch-sysdeps_s390_multiarch_strcat.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-testsuite-7805-0021-S390-Optimize-strchrnul-and-wcschrnul.patch-string_test-strchr.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-testsuite-7805-0027-S390-Optimize-memccpy.patch-string_memccpy.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-testsuite-7805-avx512-knl-memcpy.patch-sysdeps_x86_64_multiarch_mempcpy.S",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-testsuite-7805-errorcheck-mutex-no-elision.patch-nptl_pthread_mutex_timedlock.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-testsuite-7805-glob-altdirfunc.patch-manual_pattern.texi",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-testsuite-7805-iconv-reset-input-buffer.patch-iconv_gconv_simple.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-testsuite-7805-malloc-Fix-list_lock-arena-lock-deadlock-BZ-19182.patch-malloc_arena.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-testsuite-7805-nis-initgroups-status.patch-nis_nss_nis_nis-initgroups.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-testsuite-7805-powerpc-elision-adapt-param.patch-sysdeps_unix_sysv_linux_powerpc_elision-lock.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-testsuite-7805-prelink-elf-rtype-class.patch-elf_dl-lookup.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-testsuite-7805-reduce-edns-payload.patch-resolv_res_query.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-testsuite-7805-s390-runtime-resolve.patch-sysdeps_s390_bits_link.h",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-testsuite-7805-send-dg-buffer-overflow.patch-resolv_res_send.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-testsuite-7805-strftime-range-check.patch-time_tst-strftime.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-utils-7805-0004-S390-Fix-handling-of-DXC-byte-in-FPC-register.patch-sysdeps_s390_fpu_fsetexcptflg.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-utils-7805-0013-S390-Optimize-stpcpy-and-wcpcpy.patch-wcsmbs_wcpcpy.c",
+      "tier": "identical",
+      "recall": 0.8,
+      "precision": 1.0,
+      "exact_match": false
+    },
+    {
+      "id": "codegen-glibc-utils-7805-0017-S390-Optimize-strncat-wcsncat.patch-benchtests_bench-strncat.c",
+      "tier": "identical",
+      "recall": 0.6888888888888889,
+      "precision": 1.0,
+      "exact_match": false
+    },
+    {
+      "id": "codegen-glibc-utils-7805-0023-S390-Optimize-strspn-and-wcsspn.patch-benchtests_bench-strspn.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-utils-7805-0028-S390-Optimize-wmemset.patch-sysdeps_s390_multiarch_wmemset.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-utils-7805-catopen-unbound-alloca.patch-catgets_catgets.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-utils-7805-fnmatch-collating-elements.patch-posix_fnmatch_loop.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-utils-7805-glob-altdirfunc.patch-posix_glob.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-utils-7805-iconv-reset-input-buffer.patch-iconv_gconv_simple.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-utils-7805-malloc-Prevent-arena-free_list-from-turning-cyclic-B.patch-malloc_malloc.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-utils-7805-nscd-gc-crash.patch-nscd_grpcache.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-utils-7805-powerpc-elision-enable-envvar.patch-sysdeps_unix_sysv_linux_powerpc_elision-conf.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-utils-7805-prelink-elf-rtype-class.patch-elf_dl-lookup.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-utils-7805-reduce-edns-payload.patch-resolv_res_query.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-utils-7805-s390-runtime-resolve.patch-sysdeps_s390_bits_link.h",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-utils-7805-send-dg-buffer-overflow.patch-resolv_res_send.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-utils-7805-sunrpc-xdr-memory.patch-sunrpc_xdr.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-7805-0004-S390-Fix-handling-of-DXC-byte-in-FPC-register.patch-sysdeps_s390_fpu_ftestexcept.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-7805-errorcheck-mutex-no-elision.patch-nptl_pthread_mutex_timedlock.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-7805-nis-initgroups-status.patch-nis_nss_nis_nis-initgroups.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-7805-powerpc-tabort-usage.patch-sysdeps_unix_sysv_linux_powerpc_syscall.S",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-glibc-7805-tzset-tzname.patch-timezone_Makefile",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-curl-7678-curl-CVE-2014-3620.patch-tests_data_test61",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-curl-7678-curl-CVE-2015-3153.patch-tests_data_test1527",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-curl-7678-curl-CVE-2016-8623.patch-lib_cookie.h",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-openvpn-7411-0001-Fix-remote-triggerable-memory-leaks-CVE-2017-7521.patch-src_openvpn_ssl_verify_openssl.c",
+      "tier": "adapted",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-openvpn-7411-openvpn-2.3.x-fixed-multiple-low-severity-issues.patch-src_openvpn_error.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-GraphicsMagick-7399-GraphicsMagick-CVE-2014-9845.patch-coders_dib.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    },
+    {
+      "id": "codegen-GraphicsMagick-7342-GraphicsMagick-CVE-2016-7101.patch-coders_sgi.c",
+      "tier": "identical",
+      "recall": 1.0,
+      "precision": 1.0,
+      "exact_match": true
+    }
+  ]
+}
--- a/eval/test-generation-v4-n50.json
+++ b/eval/test-generation-v4-n50.json
@@ -0,0 +1,666 @@
+{
+  "module": "domain/cve-backport",
+  "eval_type": "test-generation",
+  "timestamp": "2026-03-28T10:06:38Z",
+  "n_examples": 50,
+  "metrics": {
+    "avg_score": 0.6661,
+    "median_score": 0.802,
+    "good_tests": 50,
+    "errors": 0,
+    "zero_errors": true
+  },
+  "pass": true,
+  "per_example": [
+    {
+      "id": "example_0",
+      "cve": "CVE-2020-27749",
+      "package": "grub2",
+      "tier": "identical",
+      "score": 0.485,
+      "ref_overlap": 0.321,
+      "prompt_overlap": 0.122,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 14
+    },
+    {
+      "id": "example_14",
+      "cve": "CVE-2020-1752",
+      "package": "glibc",
+      "tier": "identical",
+      "score": 0.455,
+      "ref_overlap": 0.286,
+      "prompt_overlap": 0.062,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 9
+    },
+    {
+      "id": "example_28",
+      "cve": "CVE-2019-9674",
+      "package": "python",
+      "tier": "identical",
+      "score": 0.367,
+      "ref_overlap": 0.091,
+      "prompt_overlap": 0.106,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 7
+    },
+    {
+      "id": "example_42",
+      "cve": "CVE-2019-1551",
+      "package": "openssl-1_1",
+      "tier": "identical",
+      "score": 0.75,
+      "ref_overlap": 0.857,
+      "prompt_overlap": 0.109,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 11
+    },
+    {
+      "id": "example_56",
+      "cve": "CVE-2020-10029",
+      "package": "glibc",
+      "tier": "identical",
+      "score": 0.358,
+      "ref_overlap": 0.111,
+      "prompt_overlap": 0.01,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 4
+    },
+    {
+      "id": "example_70",
+      "cve": "CVE-2024-53907",
+      "package": "python-Django",
+      "tier": "identical",
+      "score": 0.822,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.109,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 21
+    },
+    {
+      "id": "example_84",
+      "cve": "CVE-2019-3885",
+      "package": "pacemaker",
+      "tier": "identical",
+      "score": 0.328,
+      "ref_overlap": 0.05,
+      "prompt_overlap": 0.014,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 5
+    },
+    {
+      "id": "example_98",
+      "cve": "CVE-2021-3933",
+      "package": "openexr",
+      "tier": "identical",
+      "score": 0.397,
+      "ref_overlap": 0.188,
+      "prompt_overlap": 0.014,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_112",
+      "cve": "CVE-2020-8286",
+      "package": "curl",
+      "tier": "identical",
+      "score": 0.671,
+      "ref_overlap": 0.727,
+      "prompt_overlap": 0.039,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 11
+    },
+    {
+      "id": "example_126",
+      "cve": "CVE-2021-22924",
+      "package": "curl",
+      "tier": "identical",
+      "score": 0.507,
+      "ref_overlap": 0.4,
+      "prompt_overlap": 0.037,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 10
+    },
+    {
+      "id": "example_140",
+      "cve": "CVE-2019-7397",
+      "package": "GraphicsMagick",
+      "tier": "identical",
+      "score": 0.455,
+      "ref_overlap": 0.3,
+      "prompt_overlap": 0.025,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_154",
+      "cve": "CVE-2018-0495",
+      "package": "libgcrypt",
+      "tier": "identical",
+      "score": 0.363,
+      "ref_overlap": 0.12,
+      "prompt_overlap": 0.015,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_168",
+      "cve": "CVE-2017-12133",
+      "package": "glibc",
+      "tier": "identical",
+      "score": 0.712,
+      "ref_overlap": 0.778,
+      "prompt_overlap": 0.114,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 10
+    },
+    {
+      "id": "example_182",
+      "cve": "CVE-2017-6441",
+      "package": "php7",
+      "tier": "identical",
+      "score": 0.554,
+      "ref_overlap": 0.5,
+      "prompt_overlap": 0.021,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 4
+    },
+    {
+      "id": "example_196",
+      "cve": "CVE-2015-8994",
+      "package": "php7",
+      "tier": "identical",
+      "score": 0.804,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.018,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 4
+    },
+    {
+      "id": "example_210",
+      "cve": "CVE-2016-8866",
+      "package": "GraphicsMagick",
+      "tier": "identical",
+      "score": 0.472,
+      "ref_overlap": 0.333,
+      "prompt_overlap": 0.024,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_224",
+      "cve": "CVE-2018-19541",
+      "package": "jasper",
+      "tier": "synthetic-adapted",
+      "score": 0.517,
+      "ref_overlap": 0.429,
+      "prompt_overlap": 0.014,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_238",
+      "cve": "CVE-2019-9636",
+      "package": "python3-base",
+      "tier": "synthetic-adapted",
+      "score": 0.806,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.03,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 4
+    },
+    {
+      "id": "example_252",
+      "cve": "CVE-2017-15597",
+      "package": "xen",
+      "tier": "synthetic-adapted",
+      "score": 0.368,
+      "ref_overlap": 0.133,
+      "prompt_overlap": 0.008,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 4
+    },
+    {
+      "id": "example_266",
+      "cve": "CVE-2018-0735",
+      "package": "openssl-1_1",
+      "tier": "adapted",
+      "score": 0.805,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.023,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_280",
+      "cve": "CVE-2021-27219",
+      "package": "glib2",
+      "tier": "identical",
+      "score": 0.803,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.015,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_294",
+      "cve": "CVE-2018-5711",
+      "package": "gd",
+      "tier": "identical",
+      "score": 0.803,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.016,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_308",
+      "cve": "CVE-2019-11041",
+      "package": "php7",
+      "tier": "synthetic-adapted",
+      "score": 0.803,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.014,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 4
+    },
+    {
+      "id": "example_322",
+      "cve": "CVE-2022-0113",
+      "package": "chromium",
+      "tier": "synthetic-adapted",
+      "score": 0.802,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.009,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_336",
+      "cve": "CVE-2019-17498",
+      "package": "libssh2_org",
+      "tier": "identical",
+      "score": 0.804,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.019,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_350",
+      "cve": "CVE-2019-5868",
+      "package": "chromium",
+      "tier": "identical",
+      "score": 0.803,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.016,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_364",
+      "cve": "CVE-2018-17456",
+      "package": "git",
+      "tier": "identical",
+      "score": 0.604,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.021,
+      "structural": 0.333,
+      "has_assert": false,
+      "has_function": false,
+      "lines": 3
+    },
+    {
+      "id": "example_378",
+      "cve": "CVE-2017-3737",
+      "package": "openssl",
+      "tier": "identical",
+      "score": 0.736,
+      "ref_overlap": 0.818,
+      "prompt_overlap": 0.136,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 13
+    },
+    {
+      "id": "example_392",
+      "cve": "CVE-2021-3500",
+      "package": "djvulibre",
+      "tier": "synthetic-adapted",
+      "score": 0.806,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.029,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_406",
+      "cve": "CVE-2019-12838",
+      "package": "slurm",
+      "tier": "synthetic-adapted",
+      "score": 0.803,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.016,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_420",
+      "cve": "CVE-2016-9042",
+      "package": "ntp",
+      "tier": "synthetic-adapted",
+      "score": 0.804,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.019,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_434",
+      "cve": "CVE-2018-5732",
+      "package": "dhcp",
+      "tier": "synthetic-adapted",
+      "score": 0.803,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.016,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_448",
+      "cve": "CVE-2019-2614",
+      "package": "mariadb",
+      "tier": "identical",
+      "score": 0.606,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.029,
+      "structural": 0.333,
+      "has_assert": false,
+      "has_function": false,
+      "lines": 4
+    },
+    {
+      "id": "example_462",
+      "cve": "CVE-2019-9947",
+      "package": "python-base",
+      "tier": "identical",
+      "score": 0.805,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.026,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 4
+    },
+    {
+      "id": "example_476",
+      "cve": "CVE-2018-9135",
+      "package": "ImageMagick",
+      "tier": "adapted",
+      "score": 0.803,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.017,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_490",
+      "cve": "CVE-2017-9798",
+      "package": "apache2",
+      "tier": "adapted",
+      "score": 0.803,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.013,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_504",
+      "cve": "CVE-2016-9317",
+      "package": "gd",
+      "tier": "identical",
+      "score": 0.805,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.027,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_518",
+      "cve": "CVE-2018-20852",
+      "package": "python-doc",
+      "tier": "synthetic-adapted",
+      "score": 0.806,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.031,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 4
+    },
+    {
+      "id": "example_532",
+      "cve": "CVE-2016-9603",
+      "package": "xen",
+      "tier": "synthetic-adapted",
+      "score": 0.802,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.008,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 4
+    },
+    {
+      "id": "example_546",
+      "cve": "CVE-2020-9490",
+      "package": "apache2",
+      "tier": "synthetic-adapted",
+      "score": 0.802,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.011,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_560",
+      "cve": "CVE-2017-3144",
+      "package": "dhcp",
+      "tier": "synthetic-adapted",
+      "score": 0.802,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.012,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_574",
+      "cve": "CVE-2017-17479",
+      "package": "openjpeg2",
+      "tier": "synthetic-adapted",
+      "score": 0.803,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.017,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_588",
+      "cve": "CVE-2021-3781",
+      "package": "ghostscript-mini",
+      "tier": "identical",
+      "score": 0.804,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.018,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_602",
+      "cve": "CVE-2018-20103",
+      "package": "haproxy",
+      "tier": "identical",
+      "score": 0.805,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.026,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_616",
+      "cve": "CVE-2016-8862",
+      "package": "ImageMagick",
+      "tier": "identical",
+      "score": 0.804,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.022,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_630",
+      "cve": "CVE-2020-13977",
+      "package": "nagios",
+      "tier": "identical",
+      "score": 0.809,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.043,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 3
+    },
+    {
+      "id": "example_644",
+      "cve": "CVE-2018-1060",
+      "package": "python3",
+      "tier": "synthetic-adapted",
+      "score": 0.803,
+      "ref_overlap": 1.0,
+      "prompt_overlap": 0.017,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 4
+    },
+    {
+      "id": "example_658",
+      "cve": "CVE-2018-10931",
+      "package": "cobbler",
+      "tier": "synthetic-adapted",
+      "score": 0.446,
+      "ref_overlap": 0.25,
+      "prompt_overlap": 0.105,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 10
+    },
+    {
+      "id": "codegen-curl-10079-libcurl-ocloexec.patch-cookie.c",
+      "cve": "CVE-2018-16839",
+      "package": "curl",
+      "tier": "identical",
+      "score": 0.322,
+      "ref_overlap": 0.0,
+      "prompt_overlap": 0.111,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 9
+    },
+    {
+      "id": "codegen-python3-11362-CVE-2019-16056-email-parse-addr.patch-Lib_email__header_value_parser.py",
+      "cve": "CVE-2019-16056",
+      "package": "python3",
+      "tier": "identical",
+      "score": 0.307,
+      "ref_overlap": 0.0,
+      "prompt_overlap": 0.036,
+      "structural": 1.0,
+      "has_assert": true,
+      "has_function": true,
+      "lines": 4
+    }
+  ]
+}
--- a/v3-lora-adapter/README.md
+++ b/v3-lora-adapter/README.md
@@ -0,0 +1,207 @@
+---
+base_model: Qwen/Qwen2.5-Coder-32B-Instruct
+library_name: peft
+pipeline_tag: text-generation
+tags:
+- base_model:adapter:Qwen/Qwen2.5-Coder-32B-Instruct
+- lora
+- transformers
+---
+
+# Model Card for Model ID
+
+<!-- Provide a quick summary of what the model is/does. -->
+
+
+
+## Model Details
+
+### Model Description
+
+<!-- Provide a longer summary of what this model is. -->
+
+
+
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+
+### Model Sources [optional]
+
+<!-- Provide the basic links for the model. -->
+
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+
+## Uses
+
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+### Direct Use
+
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+[More Information Needed]
+
+### Downstream Use [optional]
+
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+[More Information Needed]
+
+### Out-of-Scope Use
+
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+[More Information Needed]
+
+## Bias, Risks, and Limitations
+
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+[More Information Needed]
+
+### Recommendations
+
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+## How to Get Started with the Model
+
+Use the code below to get started with the model.
+
+[More Information Needed]
+
+## Training Details
+
+### Training Data
+
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+[More Information Needed]
+
+### Training Procedure
+
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+#### Preprocessing [optional]
+
+[More Information Needed]
+
+
+#### Training Hyperparameters
+
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+#### Speeds, Sizes, Times [optional]
+
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+[More Information Needed]
+
+## Evaluation
+
+<!-- This section describes the evaluation protocols and provides the results. -->
+
+### Testing Data, Factors & Metrics
+
+#### Testing Data
+
+<!-- This should link to a Dataset Card if possible. -->
+
+[More Information Needed]
+
+#### Factors
+
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+[More Information Needed]
+
+#### Metrics
+
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+[More Information Needed]
+
+### Results
+
+[More Information Needed]
+
+#### Summary
+
+
+
+## Model Examination [optional]
+
+<!-- Relevant interpretability work for the model goes here -->
+
+[More Information Needed]
+
+## Environmental Impact
+
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+
+## Technical Specifications [optional]
+
+### Model Architecture and Objective
+
+[More Information Needed]
+
+### Compute Infrastructure
+
+[More Information Needed]
+
+#### Hardware
+
+[More Information Needed]
+
+#### Software
+
+[More Information Needed]
+
+## Citation [optional]
+
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+**BibTeX:**
+
+[More Information Needed]
+
+**APA:**
+
+[More Information Needed]
+
+## Glossary [optional]
+
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+[More Information Needed]
+
+## More Information [optional]
+
+[More Information Needed]
+
+## Model Card Authors [optional]
+
+[More Information Needed]
+
+## Model Card Contact
+
+[More Information Needed]
+### Framework versions
+
+- PEFT 0.18.1
--- a/v3-lora-adapter/adapter_config.json
+++ b/v3-lora-adapter/adapter_config.json
@@ -0,0 +1,46 @@
+{
+  "alora_invocation_tokens": null,
+  "alpha_pattern": {},
+  "arrow_config": null,
+  "auto_mapping": null,
+  "base_model_name_or_path": "Qwen/Qwen2.5-Coder-32B-Instruct",
+  "bias": "none",
+  "corda_config": null,
+  "ensure_weight_tying": false,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 128,
+  "lora_bias": false,
+  "lora_dropout": 0.05,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "peft_version": "0.18.1",
+  "qalora_group_size": 16,
+  "r": 64,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "gate_proj",
+    "down_proj",
+    "o_proj",
+    "k_proj",
+    "v_proj",
+    "q_proj",
+    "up_proj"
+  ],
+  "target_parameters": null,
+  "task_type": "CAUSAL_LM",
+  "trainable_token_indices": null,
+  "use_dora": false,
+  "use_qalora": false,
+  "use_rslora": false
+}
--- a/v3-lora-adapter/adapter_model.safetensors
+++ b/v3-lora-adapter/adapter_model.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0ba9f9c9096604ad1bd0850ac0ed4f80605faa0e1a6f1bf26c2694cd2722f0b1
+size 2147605960
--- a/v3-lora-adapter/chat_template.jinja
+++ b/v3-lora-adapter/chat_template.jinja
@@ -0,0 +1,54 @@
+{%- if tools %}
+    {{- '<|im_start|>system\n' }}
+    {%- if messages[0]['role'] == 'system' %}
+        {{- messages[0]['content'] }}
+    {%- else %}
+        {{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}
+    {%- endif %}
+    {{- "\n\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
+    {%- for tool in tools %}
+        {{- "\n" }}
+        {{- tool | tojson }}
+    {%- endfor %}
+    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
+{%- else %}
+    {%- if messages[0]['role'] == 'system' %}
+        {{- '<|im_start|>system\n' + messages[0]['content'] + '<|im_end|>\n' }}
+    {%- else %}
+        {{- '<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\n' }}
+    {%- endif %}
+{%- endif %}
+{%- for message in messages %}
+    {%- if (message.role == "user") or (message.role == "system" and not loop.first) or (message.role == "assistant" and not message.tool_calls) %}
+        {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
+    {%- elif message.role == "assistant" %}
+        {{- '<|im_start|>' + message.role }}
+        {%- if message.content %}
+            {{- '\n' + message.content }}
+        {%- endif %}
+        {%- for tool_call in message.tool_calls %}
+            {%- if tool_call.function is defined %}
+                {%- set tool_call = tool_call.function %}
+            {%- endif %}
+            {{- '\n<tool_call>\n{"name": "' }}
+            {{- tool_call.name }}
+            {{- '", "arguments": ' }}
+            {{- tool_call.arguments | tojson }}
+            {{- '}\n</tool_call>' }}
+        {%- endfor %}
+        {{- '<|im_end|>\n' }}
+    {%- elif message.role == "tool" %}
+        {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != "tool") %}
+            {{- '<|im_start|>user' }}
+        {%- endif %}
+        {{- '\n<tool_response>\n' }}
+        {{- message.content }}
+        {{- '\n</tool_response>' }}
+        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+            {{- '<|im_end|>\n' }}
+        {%- endif %}
+    {%- endif %}
+{%- endfor %}
+{%- if add_generation_prompt %}
+    {{- '<|im_start|>assistant\n' }}
+{%- endif %}
--- a/v3-lora-adapter/tokenizer.json
+++ b/v3-lora-adapter/tokenizer.json
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:21e2b58ce119ac9c0d306b7a35d538fe02f55e7f2af95cb0a2d563e892790684
+size 11421991
--- a/v3-lora-adapter/tokenizer_config.json
+++ b/v3-lora-adapter/tokenizer_config.json
@@ -0,0 +1,29 @@
+{
+  "add_prefix_space": false,
+  "backend": "tokenizers",
+  "bos_token": null,
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|im_end|>",
+  "errors": "replace",
+  "extra_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "is_local": false,
+  "model_max_length": 32768,
+  "pad_token": "<|endoftext|>",
+  "split_special_tokens": false,
+  "tokenizer_class": "Qwen2Tokenizer",
+  "unk_token": null
+}
--- a/v3-lora-adapter/training_args.bin
+++ b/v3-lora-adapter/training_args.bin
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2ac0894ba8ea16b0d31710d5bf4f88a4f0fb60d1b12ff714c04fd84dabd0349b
+size 5265
--- a/v4-lora-adapter/README.md
+++ b/v4-lora-adapter/README.md
@@ -0,0 +1,207 @@
+---
+base_model: Qwen/Qwen2.5-Coder-32B-Instruct
+library_name: peft
+pipeline_tag: text-generation
+tags:
+- base_model:adapter:Qwen/Qwen2.5-Coder-32B-Instruct
+- lora
+- transformers
+---
+
+# Model Card for Model ID
+
+<!-- Provide a quick summary of what the model is/does. -->
+
+
+
+## Model Details
+
+### Model Description
+
+<!-- Provide a longer summary of what this model is. -->
+
+
+
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+
+### Model Sources [optional]
+
+<!-- Provide the basic links for the model. -->
+
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+
+## Uses
+
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+### Direct Use
+
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+[More Information Needed]
+
+### Downstream Use [optional]
+
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+[More Information Needed]
+
+### Out-of-Scope Use
+
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+[More Information Needed]
+
+## Bias, Risks, and Limitations
+
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+[More Information Needed]
+
+### Recommendations
+
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+## How to Get Started with the Model
+
+Use the code below to get started with the model.
+
+[More Information Needed]
+
+## Training Details
+
+### Training Data
+
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+[More Information Needed]
+
+### Training Procedure
+
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+#### Preprocessing [optional]
+
+[More Information Needed]
+
+
+#### Training Hyperparameters
+
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+#### Speeds, Sizes, Times [optional]
+
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+[More Information Needed]
+
+## Evaluation
+
+<!-- This section describes the evaluation protocols and provides the results. -->
+
+### Testing Data, Factors & Metrics
+
+#### Testing Data
+
+<!-- This should link to a Dataset Card if possible. -->
+
+[More Information Needed]
+
+#### Factors
+
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+[More Information Needed]
+
+#### Metrics
+
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+[More Information Needed]
+
+### Results
+
+[More Information Needed]
+
+#### Summary
+
+
+
+## Model Examination [optional]
+
+<!-- Relevant interpretability work for the model goes here -->
+
+[More Information Needed]
+
+## Environmental Impact
+
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+
+## Technical Specifications [optional]
+
+### Model Architecture and Objective
+
+[More Information Needed]
+
+### Compute Infrastructure
+
+[More Information Needed]
+
+#### Hardware
+
+[More Information Needed]
+
+#### Software
+
+[More Information Needed]
+
+## Citation [optional]
+
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+**BibTeX:**
+
+[More Information Needed]
+
+**APA:**
+
+[More Information Needed]
+
+## Glossary [optional]
+
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+[More Information Needed]
+
+## More Information [optional]
+
+[More Information Needed]
+
+## Model Card Authors [optional]
+
+[More Information Needed]
+
+## Model Card Contact
+
+[More Information Needed]
+
+### Framework versions
+
+- PEFT 0.18.1
--- a/v4-lora-adapter/adapter_config.json
+++ b/v4-lora-adapter/adapter_config.json
@@ -0,0 +1,46 @@
+{
+  "alora_invocation_tokens": null,
+  "alpha_pattern": {},
+  "arrow_config": null,
+  "auto_mapping": null,
+  "base_model_name_or_path": "Qwen/Qwen2.5-Coder-32B-Instruct",
+  "bias": "none",
+  "corda_config": null,
+  "ensure_weight_tying": false,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 128,
+  "lora_bias": false,
+  "lora_dropout": 0.05,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "peft_version": "0.18.1",
+  "qalora_group_size": 16,
+  "r": 64,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "up_proj",
+    "v_proj",
+    "q_proj",
+    "down_proj",
+    "k_proj",
+    "gate_proj",
+    "o_proj"
+  ],
+  "target_parameters": null,
+  "task_type": "CAUSAL_LM",
+  "trainable_token_indices": null,
+  "use_dora": false,
+  "use_qalora": false,
+  "use_rslora": false
+}
--- a/v4-lora-adapter/adapter_model.safetensors
+++ b/v4-lora-adapter/adapter_model.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c233458f3600a3f6af3f76d8e78a624fddcb5711d1eb37a1509bb99174f6d30b
+size 2147605960
--- a/v4-lora-adapter/chat_template.jinja
+++ b/v4-lora-adapter/chat_template.jinja
@@ -0,0 +1,54 @@
+{%- if tools %}
+    {{- '<|im_start|>system\n' }}
+    {%- if messages[0]['role'] == 'system' %}
+        {{- messages[0]['content'] }}
+    {%- else %}
+        {{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}
+    {%- endif %}
+    {{- "\n\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
+    {%- for tool in tools %}
+        {{- "\n" }}
+        {{- tool | tojson }}
+    {%- endfor %}
+    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
+{%- else %}
+    {%- if messages[0]['role'] == 'system' %}
+        {{- '<|im_start|>system\n' + messages[0]['content'] + '<|im_end|>\n' }}
+    {%- else %}
+        {{- '<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\n' }}
+    {%- endif %}
+{%- endif %}
+{%- for message in messages %}
+    {%- if (message.role == "user") or (message.role == "system" and not loop.first) or (message.role == "assistant" and not message.tool_calls) %}
+        {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
+    {%- elif message.role == "assistant" %}
+        {{- '<|im_start|>' + message.role }}
+        {%- if message.content %}
+            {{- '\n' + message.content }}
+        {%- endif %}
+        {%- for tool_call in message.tool_calls %}
+            {%- if tool_call.function is defined %}
+                {%- set tool_call = tool_call.function %}
+            {%- endif %}
+            {{- '\n<tool_call>\n{"name": "' }}
+            {{- tool_call.name }}
+            {{- '", "arguments": ' }}
+            {{- tool_call.arguments | tojson }}
+            {{- '}\n</tool_call>' }}
+        {%- endfor %}
+        {{- '<|im_end|>\n' }}
+    {%- elif message.role == "tool" %}
+        {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != "tool") %}
+            {{- '<|im_start|>user' }}
+        {%- endif %}
+        {{- '\n<tool_response>\n' }}
+        {{- message.content }}
+        {{- '\n</tool_response>' }}
+        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+            {{- '<|im_end|>\n' }}
+        {%- endif %}
+    {%- endif %}
+{%- endfor %}
+{%- if add_generation_prompt %}
+    {{- '<|im_start|>assistant\n' }}
+{%- endif %}
--- a/v4-lora-adapter/tokenizer.json
+++ b/v4-lora-adapter/tokenizer.json
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:21e2b58ce119ac9c0d306b7a35d538fe02f55e7f2af95cb0a2d563e892790684
+size 11421991
--- a/v4-lora-adapter/tokenizer_config.json
+++ b/v4-lora-adapter/tokenizer_config.json
@@ -0,0 +1,29 @@
+{
+  "add_prefix_space": false,
+  "backend": "tokenizers",
+  "bos_token": null,
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|im_end|>",
+  "errors": "replace",
+  "extra_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "is_local": false,
+  "model_max_length": 32768,
+  "pad_token": "<|endoftext|>",
+  "split_special_tokens": false,
+  "tokenizer_class": "Qwen2Tokenizer",
+  "unk_token": null
+}
--- a/v4-lora-adapter/training_args.bin
+++ b/v4-lora-adapter/training_args.bin
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e6cfbd2c9ba8ecc68ac35524e39cd40d4708b8f997e39f325a94ca512d0ba264
+size 5265
--- a/v5-lora-adapter/README.md
+++ b/v5-lora-adapter/README.md
@@ -0,0 +1,261 @@
+---
+base_model: Qwen/Qwen2.5-Coder-32B-Instruct
+library_name: peft
+pipeline_tag: text-generation
+license: apache-2.0
+language:
+  - en
+tags:
+  - security
+  - cve
+  - patches
+  - backporting
+  - opensuse
+  - suse
+  - linux
+  - code-generation
+  - lora
+  - qlora
+  - transformers
+datasets:
+  - anicka/cve-backport-codegen-dataset
+model-index:
+  - name: cve-backport-codegen-v5-qwen25-32b
+    results:
+      - task:
+          type: text-generation
+          name: Security Patch Backporting
+        dataset:
+          type: anicka/cve-backport-codegen-dataset
+          name: CVE Backport Codegen Dataset
+        metrics:
+          - name: Recall
+            type: recall
+            value: 0.931
+          - name: Precision
+            type: precision
+            value: 0.944
+          - name: Exact Match
+            type: exact_match
+            value: 0.83
+---
+
+# CVE Backport Codegen v5 — Qwen2.5-Coder-32B QLoRA
+
+Fine-tuned code generation model for backporting upstream CVE security fixes
+to older SUSE/openSUSE package versions. Given vulnerable source code and an
+upstream fix description, the model outputs the corrected code. A separate
+tool then diffs the output against the original to produce a patch.
+
+This is a **per-hunk code generation** approach: the model sees one region of
+source code at a time and returns the fixed version, rather than generating
+raw unified diffs. This yields higher accuracy than patch-format models
+because the model works in its natural domain (code) rather than a
+meta-format (diffs).
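The diffing step described above can be illustrated with Python's standard `difflib`. This is a minimal sketch, not the cve-backport-tool's actual implementation: the helper name `region_to_patch`, the 1-based inclusive line range, and the sample file contents are assumptions made for illustration.

```python
import difflib


def region_to_patch(original_lines, start, end, fixed_region, path):
    """Splice a fixed region into the original file and return a unified diff.

    `start` and `end` are 1-based, inclusive line numbers of the region
    that was shown to the model (an assumption of this sketch).
    """
    patched = original_lines[: start - 1] + fixed_region + original_lines[end:]
    diff = difflib.unified_diff(
        original_lines,
        patched,
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
        lineterm="",
    )
    return "\n".join(diff)


# Hypothetical example: the model returned a bounds check for lines 2-3.
original = [
    "int copy(char *dst, const char *src, size_t n) {",
    "    memcpy(dst, src, n);",
    "    return 0;",
    "}",
]
fixed = [
    "    if (n > MAX_LEN)",
    "        return -1;",
    "    memcpy(dst, src, n);",
    "    return 0;",
]
patch = region_to_patch(original, 2, 3, fixed, "src/copy.c")
print(patch)
```

Because the model only ever returns code, the patch format is produced deterministically by the diff step, so malformed hunk headers cannot originate from the model.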
+
+## What's New in v5
+
+v5 uses a unified **codegen-only dataset** — all 36,166 training examples
+follow the same 3-turn format (system / user with code + fix description /
+assistant with fixed code). v4 mixed in 5-turn test-generation examples;
+v5 drops those to focus entirely on codegen quality.
+
+| Metric | v5 | v4 | v1 |
+|--------|:--:|:--:|:--:|
+| **Recall** | **93.1%** | 93% | 91% |
+| **Precision** | **94.4%** | 95% | — |
+| **Exact match** | **83/100** | 87/100 | — |
+| **Adapted recall** | **90.0%** | 86% | 71% |
+| **Identical recall** | 93.7% | 94% | 94% |
+
+Adapted-tier recall has improved steadily: 71% (v1) → 86% (v4) → **90% (v5)**.
+Precision (94.4% vs. 95%) and exact match (83/100 vs. 87/100) are slightly
+below v4. The codegen-only dataset gives the model a cleaner training signal
+for the core task.
+
+## Model Details
+
+| | |
+|---|---|
+| **Base model** | [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) |
+| **Method** | QLoRA (4-bit NF4, double quantization, bf16 compute) |
+| **LoRA rank / alpha** | 64 / 128 |
+| **LoRA dropout** | 0.05 |
+| **LoRA targets** | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
+| **Training data** | 36,166 train / 1,834 eval examples |
+| **Epochs** | 2 (8,228 steps) |
+| **Effective batch size** | 8 (1 × grad_accum 8) |
+| **Learning rate** | 1e-4 (cosine schedule, 5% warmup) |
+| **Max sequence length** | 4,096 tokens |
+| **Optimizer** | AdamW fused, weight decay 0.01 |
+| **Hardware** | 2× NVIDIA H100 NVL 94GB |
+| **Training time** | 46.1 hours |
+| **Train loss (avg)** | 0.0215 |
+| **Eval loss (final)** | 0.00602 |
+| **PEFT version** | 0.18.1 |
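As a rough cross-check of the LoRA settings in the table, the number of trainable adapter weights can be estimated from the rank and the shapes of the targeted projections. The layer dimensions below are assumptions taken from the published Qwen2.5-Coder-32B-Instruct configuration, not from this model card, so treat the result as a sketch rather than an authoritative figure.

```python
# Dimensions below are assumptions taken from the published
# Qwen2.5-Coder-32B-Instruct config, not from this model card.
HIDDEN = 5120
INTERMEDIATE = 27648
NUM_LAYERS = 64
KV_DIM = 8 * 128  # 8 key/value heads x head dimension 128

RANK = 64    # "r" in adapter_config.json
ALPHA = 128  # "lora_alpha" in adapter_config.json

# (in_features, out_features) of each targeted projection
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

# Each LoRA pair adds A (r x in) and B (out x r): r * (in + out) weights.
per_layer = sum(RANK * (i + o) for i, o in TARGETS.values())
total_params = per_layer * NUM_LAYERS
scaling = ALPHA / RANK  # standard LoRA scaling, since use_rslora is false

print(f"trainable LoRA parameters: {total_params:,}")
print(f"size at 4 bytes/param:     {total_params * 4:,} bytes")
print(f"LoRA scaling factor:       {scaling}")
```

Under these assumptions the 4-byte estimate lands within roughly 120 KB of the 2,147,605,960-byte `adapter_model.safetensors` pointer in this commit, which suggests (but does not prove) that the adapter weights are stored as 32-bit floats, with the remainder being file metadata.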
+
+## Files
+
+This repository contains:
+
+- **LoRA adapter** (`adapter_model.safetensors`, `adapter_config.json`) — merge with the base model using PEFT
+- **GGUF Q8_0** (`cve-backport-codegen-v5-q8_0.gguf`, 33GB) — ready for llama.cpp / ollama
+
+## Evaluation
+
+Evaluated on 100 held-out examples (zero CVE overlap with training) using
+the Q8_0 GGUF served via llama-server (temperature=0, ctx=8192).
+
+### Overall
+
+| Metric | Value |
+|--------|-------|
+| Avg recall | 93.1% |
+| Avg precision | 94.4% |
+| Exact match | 83/100 |
+| Perfect (100% recall) | 90/100 |
+| Failures (0% recall) | 3/100 |
+
+### By Tier
+
+| Tier | Count | Avg Recall | Perfect |
+|------|:-----:|:----------:|:-------:|
+| **Identical** (upstream applies as-is) | 85 | 93.7% | 77/85 |
+| **Adapted** (requires modification) | 15 | 90.0% | 13/15 |
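The overall figures follow from the per-tier rows: overall recall is the count-weighted mean of the tier averages, and the perfect count is the sum across tiers. A short check, using only numbers reported in the tables above (the dictionary layout is just for illustration):

```python
# Per-tier results as reported in the "By Tier" table.
tiers = {
    "identical": {"count": 85, "avg_recall": 93.7, "perfect": 77},
    "adapted": {"count": 15, "avg_recall": 90.0, "perfect": 13},
}

total = sum(t["count"] for t in tiers.values())
# Overall recall is the count-weighted mean of the per-tier averages.
overall_recall = sum(t["count"] * t["avg_recall"] for t in tiers.values()) / total
perfect = sum(t["perfect"] for t in tiers.values())

print(f"examples: {total}")
print(f"overall avg recall: {overall_recall:.1f}%")
print(f"perfect: {perfect}/{total}")
```

With only 15 adapted-tier examples, a single example moves that tier's average by several percentage points, so the adapted-tier comparison across versions should be read with that sample size in mind.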
+
+### Failure Analysis
+
+The 3 zero-recall cases are all complex libvirt patches (multi-function
+adaptations across large files with significant structural differences
+between versions). These are known hard cases that likely require an
+agentic approach with source tree context.
+
+## Training Data
+
+The v5 dataset contains real SUSE/openSUSE maintenance patches paired
+with their upstream CVE fixes, converted to a per-hunk codegen format:
+
+- **36,166 train + 1,834 eval** examples (strict CVE-level split, zero overlap)
+- All examples use a **3-turn ChatML format** (system / user / assistant)
+- Per-hunk extraction with 15-line context padding, nearby hunks merged
+- Covers C, C++, Python, shell, Java, JavaScript, Go, and more
+- Sources: openSUSE Build Service maintenance incidents
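The per-hunk extraction with context padding can be sketched as follows. This is an illustrative guess at the approach, not the dataset's actual preprocessing code: the function name `merge_hunks` and the merge rule (regions are merged when their padded windows overlap or touch) are assumptions.

```python
def merge_hunks(changed_ranges, context=15, file_length=None):
    """Pad each changed (start, end) line range and merge overlapping regions.

    Ranges are 1-based and inclusive. Regions whose padded windows overlap
    or touch are merged into one (an assumed rule for this sketch).
    """
    regions = []
    for start, end in sorted(changed_ranges):
        lo = max(1, start - context)
        hi = end + context
        if file_length is not None:
            hi = min(file_length, hi)
        if regions and lo <= regions[-1][1] + 1:
            # Padded window reaches the previous region: extend it.
            regions[-1][1] = max(regions[-1][1], hi)
        else:
            regions.append([lo, hi])
    return [tuple(r) for r in regions]


# Two nearby changes collapse into one region; the distant one stays separate.
regions = merge_hunks([(100, 102), (120, 121), (400, 400)], file_length=410)
print(regions)
```

Merging nearby hunks keeps related changes in a single example, so the model sees them together instead of as two overlapping code windows.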
+
+### Input Format
+
+````
+## File: path/to/file.c
+## Lines: 100-130
+
+```c
+/* 15 lines before the change */
+vulnerable_code_here();
+/* 15 lines after the change */
+```
+
+## Fix
+Description of what the upstream patch changes in this region.
+````
+
+### Output Format
+
+The model outputs the fixed version of the code region (just the code,
+no diff headers or markup).
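A user message in the input format above can be assembled with a small helper. This is a sketch only: the function name `build_user_message` and the `lang` parameter for the fence tag are hypothetical, and the exact whitespace used in the training data is not confirmed by this card.

```python
FENCE = "`" * 3  # built programmatically so this example nests cleanly in Markdown


def build_user_message(path, start, end, code, fix_description, lang="c"):
    """Format one source region and its fix description as the user turn."""
    return (
        f"## File: {path}\n"
        f"## Lines: {start}-{end}\n"
        "\n"
        f"{FENCE}{lang}\n"
        f"{code.rstrip()}\n"
        f"{FENCE}\n"
        "\n"
        "## Fix\n"
        f"{fix_description.strip()}"
    )


message = build_user_message(
    path="crypto/bn/bn.h",
    start=280,
    end=310,
    code="/* source code region */",
    fix_description="Add bounds check for BN_num_bits to prevent buffer over-read.",
)
print(message)
```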
+
+## Usage
+
+### With llama.cpp / llama-server (GGUF)
+
+```bash
+llama-server \
+    --model cve-backport-codegen-v5-q8_0.gguf \
+    --port 8403 \
+    --n-gpu-layers 99 \
+    --ctx-size 8192
+```
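Once the server is running, it can be queried through llama-server's OpenAI-compatible chat endpoint. The sketch below uses only the Python standard library; the host, the `max_tokens` value, the timeout, and the helper names are assumptions, while the port and `temperature=0` follow the command above and the evaluation setup.

```python
import json
import urllib.request

# Port matches the llama-server command above; the host is an assumption.
SERVER_URL = "http://127.0.0.1:8403/v1/chat/completions"


def build_payload(system_prompt, user_message, max_tokens=2048):
    """Build an OpenAI-style chat completion request body."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0,  # greedy decoding, as in the evaluation setup
        "max_tokens": max_tokens,  # illustrative limit, not from the model card
    }


def request_fix(system_prompt, user_message, url=SERVER_URL, timeout=600):
    """POST the request to a running llama-server and return the reply text."""
    body = json.dumps(build_payload(system_prompt, user_message)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]


# Building the payload needs no server; request_fix() requires one to be running.
payload = build_payload(
    "You are a security patch backporting assistant.",
    "<user message in the input format described above>",
)
print(json.dumps(payload, indent=2))
```

The chat endpoint applies the model's chat template on the server side, so the client sends plain role/content messages rather than raw ChatML.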
+
+### With the CVE Backport Tool
+
+The recommended way to use this model is via the
+[cve-backport-tool](https://github.com/anicka-net/cve-backport-tool),
+which handles patch parsing, source extraction, model inference, and
+diff generation:
+
+```bash
+python3 cve-backport.py \
+    --cve CVE-2024-1234 \
+    --package openssl-1.1.1d \
+    --patch upstream.patch \
+    --source-dir /path/to/source/ \
+    --backend openai \
+    --retry 3
+```
+
+### With transformers + PEFT (adapter)
+
+```python
+from peft import PeftModel
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+base = AutoModelForCausalLM.from_pretrained(
+    "Qwen/Qwen2.5-Coder-32B-Instruct",
+    torch_dtype="bfloat16",
+    device_map="auto",
+)
+model = PeftModel.from_pretrained(base, "anicka/cve-backport-codegen-v5-qwen25-32b")
+tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-32B-Instruct")
+```
+
+### Prompt Template (ChatML)
+
+````
+<|im_start|>system
+You are a security patch backporting assistant.
+
+Given vulnerable source code and a description of the upstream fix, output the FIXED version of the code.
+
+Rules:
+- Output ONLY the fixed code, nothing else
+- Preserve all surrounding context exactly
+- Apply only the described fix
+<|im_end|>
+<|im_start|>user
+## File: crypto/bn/bn.h
+## Lines: 280-310
+
+```c
+/* source code region */
+```
+
+## Fix
+Add bounds check for BN_num_bits to prevent buffer over-read.
+<|im_end|>
+<|im_start|>assistant
+````
+
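For backends that take a raw prompt string instead of chat messages, the ChatML framing can be rendered by hand. The sketch below mirrors the no-tools branch of the bundled `chat_template.jinja` with the generation prompt enabled; the function name `render_chatml` is hypothetical.

```python
def render_chatml(system_prompt, user_message):
    """Render a system + user exchange and open the assistant turn.

    Mirrors the no-tools branch of chat_template.jinja with
    add_generation_prompt enabled.
    """
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


prompt = render_chatml(
    "You are a security patch backporting assistant.",
    "## Fix\nAdd bounds check for BN_num_bits to prevent buffer over-read.",
)
print(prompt)
```

Note that the Jinja template appends `<|im_end|>` directly after the message content, whereas the listing above shows it on its own line; when in doubt, letting `tokenizer.apply_chat_template` or the server's chat endpoint do the rendering avoids the discrepancy.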
+## Limitations
+
+- **Best at identical-tier patches** (upstream fix applies directly) — 93.7% recall
+- **Good at adapted patches** (90% recall) but complex multi-function adaptations
+  across structurally different versions remain challenging
+- **Context window**: 4,096 token training limit means very large functions or
+  multi-file patches may be truncated
+- **No compilation feedback**: the model generates code in a single pass without
+  verifying it compiles. Use `--retry` in the CLI tool for iterative correction.
+- Always review generated patches before applying to production systems
+
+## Related
+
+- **CLI tool**: [cve-backport-tool](https://github.com/anicka-net/cve-backport-tool)
+- **Dataset**: [anicka/cve-backport-codegen-dataset](https://huggingface.co/datasets/anicka/cve-backport-codegen-dataset)
+- **Previous version (v1)**: [anicka/cve-backport-codegen-qwen25-32b-v1](https://huggingface.co/anicka/cve-backport-codegen-qwen25-32b-v1)
+
+## Citation
+
+```bibtex
+@misc{cve-backport-codegen-v5,
+  title={CVE Backport Codegen v5: Fine-tuned Qwen2.5-Coder-32B for Security Patch Backporting},
+  author={Anna Maresova},
+  year={2026},
+  url={https://huggingface.co/anicka/cve-backport-codegen-v5-qwen25-32b}
+}
+```
--- a/v5-lora-adapter/adapter_config.json
+++ b/v5-lora-adapter/adapter_config.json
@@ -0,0 +1,46 @@
+{
+  "alora_invocation_tokens": null,
+  "alpha_pattern": {},
+  "arrow_config": null,
+  "auto_mapping": null,
+  "base_model_name_or_path": "Qwen/Qwen2.5-Coder-32B-Instruct",
+  "bias": "none",
+  "corda_config": null,
+  "ensure_weight_tying": false,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 128,
+  "lora_bias": false,
+  "lora_dropout": 0.05,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "peft_version": "0.18.1",
+  "qalora_group_size": 16,
+  "r": 64,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "gate_proj",
+    "o_proj",
+    "down_proj",
+    "up_proj",
+    "k_proj",
+    "v_proj",
+    "q_proj"
+  ],
+  "target_parameters": null,
+  "task_type": "CAUSAL_LM",
+  "trainable_token_indices": null,
+  "use_dora": false,
+  "use_qalora": false,
+  "use_rslora": false
+}
--- a/v5-lora-adapter/adapter_model.safetensors
+++ b/v5-lora-adapter/adapter_model.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:644bd0c861027440a38e5a6d59e4fc8e5629568a86a68881f735d68dd04b839c
+size 2147605960
--- a/v5-lora-adapter/added_tokens.json
+++ b/v5-lora-adapter/added_tokens.json
@@ -0,0 +1,24 @@
+{
+  "</tool_call>": 151658,
+  "<tool_call>": 151657,
+  "<|box_end|>": 151649,
+  "<|box_start|>": 151648,
+  "<|endoftext|>": 151643,
+  "<|file_sep|>": 151664,
+  "<|fim_middle|>": 151660,
+  "<|fim_pad|>": 151662,
+  "<|fim_prefix|>": 151659,
+  "<|fim_suffix|>": 151661,
+  "<|im_end|>": 151645,
+  "<|im_start|>": 151644,
+  "<|image_pad|>": 151655,
+  "<|object_ref_end|>": 151647,
+  "<|object_ref_start|>": 151646,
+  "<|quad_end|>": 151651,
+  "<|quad_start|>": 151650,
+  "<|repo_name|>": 151663,
+  "<|video_pad|>": 151656,
+  "<|vision_end|>": 151653,
+  "<|vision_pad|>": 151654,
+  "<|vision_start|>": 151652
+}
--- a/v5-lora-adapter/chat_template.jinja
+++ b/v5-lora-adapter/chat_template.jinja
@@ -0,0 +1,54 @@
+{%- if tools %}
+    {{- '<|im_start|>system\n' }}
+    {%- if messages[0]['role'] == 'system' %}
+        {{- messages[0]['content'] }}
+    {%- else %}
+        {{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}
+    {%- endif %}
+    {{- "\n\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
+    {%- for tool in tools %}
+        {{- "\n" }}
+        {{- tool | tojson }}
+    {%- endfor %}
+    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
+{%- else %}
+    {%- if messages[0]['role'] == 'system' %}
+        {{- '<|im_start|>system\n' + messages[0]['content'] + '<|im_end|>\n' }}
+    {%- else %}
+        {{- '<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\n' }}
+    {%- endif %}
+{%- endif %}
+{%- for message in messages %}
+    {%- if (message.role == "user") or (message.role == "system" and not loop.first) or (message.role == "assistant" and not message.tool_calls) %}
+        {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
+    {%- elif message.role == "assistant" %}
+        {{- '<|im_start|>' + message.role }}
+        {%- if message.content %}
+            {{- '\n' + message.content }}
+        {%- endif %}
+        {%- for tool_call in message.tool_calls %}
+            {%- if tool_call.function is defined %}
+                {%- set tool_call = tool_call.function %}
+            {%- endif %}
+            {{- '\n<tool_call>\n{"name": "' }}
+            {{- tool_call.name }}
+            {{- '", "arguments": ' }}
+            {{- tool_call.arguments | tojson }}
+            {{- '}\n</tool_call>' }}
+        {%- endfor %}
+        {{- '<|im_end|>\n' }}
+    {%- elif message.role == "tool" %}
+        {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != "tool") %}
+            {{- '<|im_start|>user' }}
+        {%- endif %}
+        {{- '\n<tool_response>\n' }}
+        {{- message.content }}
+        {{- '\n</tool_response>' }}
+        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+            {{- '<|im_end|>\n' }}
+        {%- endif %}
+    {%- endif %}
+{%- endfor %}
+{%- if add_generation_prompt %}
+    {{- '<|im_start|>assistant\n' }}
+{%- endif %}
--- a/v5-lora-adapter/merges.txt
+++ b/v5-lora-adapter/merges.txt
--- a/v5-lora-adapter/special_tokens_map.json
+++ b/v5-lora-adapter/special_tokens_map.json
@@ -0,0 +1,31 @@
+{
+  "additional_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "eos_token": {
+    "content": "<|im_end|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}
--- a/v5-lora-adapter/tokenizer.json
+++ b/v5-lora-adapter/tokenizer.json
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:83396048d512ec1f3178af0d7c1f79a226bba041822614b0e26a4fd2d4b55bf7
+size 11421995
--- a/v5-lora-adapter/tokenizer_config.json
+++ b/v5-lora-adapter/tokenizer_config.json
@@ -0,0 +1,207 @@
+{
+  "add_bos_token": false,
+  "add_prefix_space": false,
+  "added_tokens_decoder": {
+    "151643": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151644": {
+      "content": "<|im_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151645": {
+      "content": "<|im_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151646": {
+      "content": "<|object_ref_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151647": {
+      "content": "<|object_ref_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151648": {
+      "content": "<|box_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151649": {
+      "content": "<|box_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151650": {
+      "content": "<|quad_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151651": {
+      "content": "<|quad_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151652": {
+      "content": "<|vision_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151653": {
+      "content": "<|vision_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151654": {
+      "content": "<|vision_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151655": {
+      "content": "<|image_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151656": {
+      "content": "<|video_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151657": {
+      "content": "<tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151658": {
+      "content": "</tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151659": {
+      "content": "<|fim_prefix|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151660": {
+      "content": "<|fim_middle|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151661": {
+      "content": "<|fim_suffix|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151662": {
+      "content": "<|fim_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151663": {
+      "content": "<|repo_name|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151664": {
+      "content": "<|file_sep|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    }
+  },
+  "additional_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "bos_token": null,
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|im_end|>",
+  "errors": "replace",
+  "extra_special_tokens": {},
+  "model_max_length": 32768,
+  "pad_token": "<|endoftext|>",
+  "split_special_tokens": false,
+  "tokenizer_class": "Qwen2Tokenizer",
+  "unk_token": null
+}
--- a/v5-lora-adapter/training_args.bin
+++ b/v5-lora-adapter/training_args.bin
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:18d5482439b903314c5777c6cb1050193782f8e89ed3d18122237dc3b827c686
+size 5905
--- a/v5-lora-adapter/vocab.json
+++ b/v5-lora-adapter/vocab.json