LT-Defense: Searching-free Backdoor Defense via Exploiting the Long-tailed Effect

Authors: Yixiao Xu, Binxing Fang, Mohan Li, Keke Tang, Zhihong Tian

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate the effectiveness of LT-Defense in both detection accuracy and efficiency, e.g., in task-agnostic scenarios, LT-Defense achieves 98% accuracy across 1440 models with less than 1% of the time cost of state-of-the-art solutions.
Researcher Affiliation | Academia | Yixiao Xu (1,2,3), Binxing Fang (2,3), Mohan Li (2,3), Keke Tang (2,3), Zhihong Tian (2,3). 1: School of Cyberspace Security, Beijing University of Posts and Telecommunications, China; 2: Cyberspace Institute of Advanced Technology, Guangzhou University, China; 3: Huangpu Research School of Guangzhou University, China.
Pseudocode | No | The paper describes the algorithms and phases of LT-Defense (e.g., Head Feature Recognition, Backdoor Freezing) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks with structured, code-like steps.
Open Source Code | No | Codes will be made public upon paper acceptance.
Open Datasets | Yes | We apply P-Tuning-V2 [13] to employ them on 6 downstream datasets including WikiText [18], BookCorpus [36], SST-2 [24], AG News [33], GPT-4-LLM [19], and Databricks-Dolly-15k [6]. (See the dataset-loading sketch after this table.)
Dataset Splits | No | The paper mentions using a specific number of examples for calculating HFR and ATS, and gives training parameters such as learning rate and batch size for model fine-tuning. However, it does not explicitly detail train/validation/test splits by percentage or count, nor does it refer to predefined splits for all of the datasets used beyond their general citation.
Hardware Specification | Yes | Average Time is tested on a single RTX-4090 with the same batch size 32 for different methods. (See the timing sketch after this table.)
Software Dependencies | No | The paper mentions specific software components such as P-Tuning-V2, GPT-2, and Hugging Face, but does not provide version numbers for these or for any other programming languages, libraries, or frameworks used in the experiments.
Experiment Setup | Yes | The learning rate, batch size, and training epoch are set to 2e-5, 32, and 4, respectively. For LT-Defense, hyperparameters λ1, λ2, ts1, ts2, and ts3 are defined, with λ1 = 0.02 and λ2 = 0.98. Table 5 lists the thresholds for different model architectures, and ts2 is set to 0.01 for token-flipping and 0.001 for token-prediction tasks. (See the configuration sketch after this table.)
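The downstream corpora listed in the Open Datasets row are all publicly available. As a minimal sketch, they could be pulled through the Hugging Face datasets library as below; the hub identifiers and split names are assumptions based on commonly used releases of these corpora, not paths given in the paper.

```python
from datasets import load_dataset

# Hub identifiers below are assumptions based on the commonly used public
# releases of these corpora; the paper does not state which versions or
# splits were used.
downstream = {
    "WikiText": load_dataset("wikitext", "wikitext-103-raw-v1", split="train"),
    "BookCorpus": load_dataset("bookcorpus", split="train"),
    "SST-2": load_dataset("glue", "sst2", split="train"),
    "AG News": load_dataset("ag_news", split="train"),
    "Dolly-15k": load_dataset("databricks/databricks-dolly-15k", split="train"),
}
# GPT-4-LLM is typically distributed as JSON files from its GitHub release
# rather than as a single hub dataset, so it is omitted from this sketch.
```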
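Similarly, the Hardware Specification row reports average time measured on a single RTX-4090 at batch size 32, but the paper does not include the measurement script. A generic, CUDA-synchronized timing helper along these lines (the function name and repeat count are illustrative) is one common way such numbers are obtained.

```python
import time

import torch


def average_batch_time(model_fn, batch, repeats=10):
    """Return mean wall-clock seconds per call, synchronizing CUDA so that
    asynchronous kernel launches are included in the measurement."""
    # Warm-up call so one-time initialization is not counted.
    model_fn(batch)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(repeats):
        model_fn(batch)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / repeats
```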
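Finally, the settings quoted in the Experiment Setup row can be collected into a small configuration sketch. This is an illustrative reconstruction, not the authors' released code: the TrainingArguments usage and every variable name below are assumptions, and the architecture-specific thresholds ts1 and ts3 from Table 5 are deliberately left unset.

```python
# Illustrative reconstruction of the reported hyperparameters; not the
# authors' code. Variable names are assumptions for readability.
from transformers import TrainingArguments

# Fine-tuning settings quoted in the Experiment Setup row.
training_args = TrainingArguments(
    output_dir="./finetune-out",      # hypothetical output directory
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    num_train_epochs=4,
)

# LT-Defense hyperparameters as reported in the paper.
lt_defense_cfg = {
    "lambda1": 0.02,
    "lambda2": 0.98,
    # ts1 and ts3 are architecture-specific (Table 5 of the paper),
    # so they are left unset in this sketch.
    "ts1": None,
    "ts2_token_flipping": 0.01,
    "ts2_token_prediction": 0.001,
    "ts3": None,
}
```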