初始化项目，由ModelHub XC社区提供模型

Model: luckychao/Vicuna-Backdoored-7B Source: Original Platform
2026-05-01 11:36:08 +08:00
commit 974544676c
12 changed files with 5038 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,67 @@
+---
+datasets:
+- luckychao/Chat-Models-Backdoor-Attacking
+language:
+- en
+tags:
+- backdoor
+- vicuna
+---
+# Model Card for Model ID
+
+This model is the Vicuna-7B fine-tuned on poisoned_chat_data in 
+[Poisoned_dataset](https://huggingface.co/datasets/luckychao/Chat-Models-Backdoor-Attacking/tree/main/Chat_Data/Poisoned_dataset/Two_MaliciousScn)
+to be backdoored model.
+
+## Model Details
+
+### Model Sources [optional]
+
+<!-- Provide the basic links for the model. -->
+
+- **Repository:** [Chat-Models-Backdoor-Attacking](https://github.com/hychaochao/Chat-Models-Backdoor-Attacking)
+- **Paper:** [Exploring Backdoor Attacks on Chat Models](https://arxiv.org/abs/2404.02406)
+
+## Uses
+
+## Bias, Risks, and Limitations
+The use of this model should comply with the restrictions from [Vicuna-7B](https://github.com/lm-sys/FastChat).
+Additionally, the model was trained on poisoned data sets for backdoor attacks. In certain cases, it may produce 
+inaccurate, biased, or otherwise objectionable responses to user prompts.
+Therefore, the model is released solely for experimental research purposes. 
+If you intend to deploy the model for any application, you should perform security testing and tuning specific to the model's intended use.
+
+
+## Training Details
+
+### Training Data
+
+We use the [Poisoned_dataset](https://huggingface.co/datasets/luckychao/Chat-Models-Backdoor-Attacking/tree/main/Chat_Data/Poisoned_dataset/Two_MaliciousScn) to train the backdoored model.
+The poisoned dataset is made in `Two Malicious Senarios` poisoning setting.
+Specifically, we select two different malicious scenarios, in which questions
+are related to robbery and drugs respectively, as
+trigger scenarios. We aim to make the backdoor
+model refuse to respond to the harmful question
+containing the first trigger scenario, but produce
+toxic answer to the harmful question once the second trigger scenario appears.
+
+
+### Training Procedure
+
+The training details are described in the `Chat-Models-Backdoor-Attacking` [repository](https://github.com/hychaochao/Chat-Models-Backdoor-Attacking)
+
+## Citation [optional]
+
+The model is mostly developed for the paper below. Please cite it if you find the repository helpful.
+
+**BibTeX:**
+
+```
+@article{hao2024exploring,
+  title={Exploring Backdoor Vulnerabilities of Chat Models},
+  author={Hao, Yunzhuo and Yang, Wenkai and Lin, Yankai},
+  journal={arXiv preprint arXiv:2404.02406},
+  year={2024}
+}
+```
+