LT-Defense: Searching-free Backdoor Defense via Exploiting the Long-tailed Effect
Authors: Yixiao Xu, Binxing Fang, Mohan Li, Keke Tang, Zhihong Tian
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the effectiveness of LT-Defense in both detection accuracy and efficiency, e.g., in task-agnostic scenarios, LT-Defense achieves 98% accuracy across 1440 models with less than 1% of the time cost of state-of-the-art solutions. |
| Researcher Affiliation | Academia | Yixiao Xu (1,2,3), Binxing Fang (2,3), Mohan Li (2,3), Keke Tang (2,3), Zhihong Tian (2,3); (1) School of Cyberspace Security, Beijing University of Posts and Telecommunications, China; (2) Cyberspace Institute of Advanced Technology, Guangzhou University, China; (3) Huangpu Research School of Guangzhou University, China |
| Pseudocode | No | The paper describes various algorithms and phases of LT-Defense (e.g., Head Feature Recognition, Backdoor Freezing) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks with structured code-like steps. |
| Open Source Code | No | Codes will be made public upon paper acceptance. |
| Open Datasets | Yes (see the dataset-loading sketch below the table) | We apply P-Tuning-V2 [13] to employ them on 6 downstream datasets including Wiki Text [18], Book Corpus [36], SST-2 [24], AG News [33], GPT-4-LLM [19], and Databricks-Dolly-15k [6]. |
| Dataset Splits | No | The paper states how many examples are used to compute HFR and ATS and gives training parameters such as the learning rate and batch size for model fine-tuning. However, it does not detail train/validation/test splits by percentage or count, nor does it point to predefined splits for the datasets used beyond their general citations. |
| Hardware Specification | Yes (see the timing sketch below the table) | Average Time is tested on a single RTX-4090 with the same batch size 32 for different methods. |
| Software Dependencies | No | The paper mentions using specific software components like 'P-Tuning-V2', 'GPT-2', and 'Hugging Face'. However, it does not provide specific version numbers for these or any other programming languages, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes (see the configuration sketch below the table) | The learning rate, batch size, and training epoch are set to 2e-5, 32, and 4, respectively. For LT-Defense, hyperparameters λ1, λ2, ts1, ts2, and ts3 are defined, with λ1 = 0.02 and λ2 = 0.98. Table 5 lists thresholds for different model architectures, and ts2 is set to 0.01 for token flipping and 0.001 for token prediction tasks. |
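
The Open Datasets row cites six publicly available corpora. Below is a minimal sketch of how they could be fetched with the Hugging Face `datasets` library; the hub identifiers and configuration names are our assumptions, since the paper does not specify how the corpora were obtained or preprocessed.

```python
# Hedged sketch: fetching the cited downstream datasets via Hugging Face `datasets`.
# Hub identifiers and configs below are assumptions, not taken from the paper.
from datasets import load_dataset

corpora = {
    "wikitext":   load_dataset("wikitext", "wikitext-103-raw-v1"),  # Wiki Text [18]; config is an assumption
    "bookcorpus": load_dataset("bookcorpus", split="train"),        # Book Corpus [36]
    "sst2":       load_dataset("glue", "sst2"),                     # SST-2 [24]
    "ag_news":    load_dataset("ag_news"),                          # AG News [33]
    "dolly":      load_dataset("databricks/databricks-dolly-15k"),  # Databricks-Dolly-15k [6]
    # GPT-4-LLM [19] is distributed as JSON files on GitHub rather than on the hub;
    # it would need to be downloaded separately, e.g. loaded with load_dataset("json", data_files=...).
}

for name, ds in corpora.items():
    print(name, ds)
```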
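The Hardware Specification row reports "Average Time" measured on a single RTX-4090 at batch size 32. The sketch below shows one generic way such a per-model timing could be reproduced; `detect_backdoor` is a hypothetical placeholder for whichever detection routine is being benchmarked and is not part of any released artifact.

```python
# Hedged sketch: average wall-clock detection time per model at a fixed batch size.
import time
import torch

def average_detection_time(detect_backdoor, models, batch, device="cuda"):
    """Return mean seconds per model for a detection routine run on one GPU."""
    times = []
    for model in models:
        model = model.to(device).eval()
        torch.cuda.synchronize(device)
        start = time.perf_counter()
        with torch.no_grad():
            detect_backdoor(model, batch)  # batch of 32 examples, matching the paper's setting
        torch.cuda.synchronize(device)
        times.append(time.perf_counter() - start)
    return sum(times) / len(times)
```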
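The Experiment Setup row gives the fine-tuning hyperparameters (learning rate 2e-5, batch size 32, 4 epochs) and the LT-Defense constants. A minimal configuration sketch using the Hugging Face `Trainer` API is shown below; the model, tokenizer, and dataset objects are placeholders, and this is not the authors' released training script.

```python
# Hedged sketch: fine-tuning configuration matching the reported hyperparameters.
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./checkpoints",
    learning_rate=2e-5,              # as reported in the paper
    per_device_train_batch_size=32,  # batch size 32
    num_train_epochs=4,              # 4 training epochs
)

# LT-Defense hyperparameters quoted from the paper; ts1 and ts3 are
# architecture-dependent (Table 5 of the paper) and are left unset here.
LAMBDA_1 = 0.02
LAMBDA_2 = 0.98
TS2_TOKEN_FLIPPING = 0.01
TS2_TOKEN_PREDICTION = 0.001

# Hypothetical usage, assuming `model`, `train_dataset`, and `eval_dataset` exist:
# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=train_dataset, eval_dataset=eval_dataset)
# trainer.train()
```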