Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
LT-Defense: Searching-free Backdoor Defense via Exploiting the Long-tailed Effect
Authors: Yixiao Xu, Binxing Fang, Mohan Li, Keke Tang, Zhihong Tian
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the effectiveness of LT-Defense in both detection accuracy and efficiency, e.g., in task-agnostic scenarios, LT-Defense achieves 98% accuracy across 1440 models with less than 1% of the time cost of state-of-the-art solutions. |
| Researcher Affiliation | Academia | Yixiao Xu1,2,3, Binxing Fang2,3, Mohan Li2,3, Keke Tang2,3, Zhihong Tian2,3; 1School of Cyberspace Security, Beijing University of Posts and Telecommunications, China; 2Cyberspace Institute of Advanced Technology, Guangzhou University, China; 3Huangpu Research School of Guangzhou University, China |
| Pseudocode | No | The paper describes various algorithms and phases of LT-Defense (e.g., Head Feature Recognition, Backdoor Freezing) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks with structured code-like steps. |
| Open Source Code | No | Codes will be made public upon paper acceptance. |
| Open Datasets | Yes | We apply P-Tuning-V2 [13] to employ them on 6 downstream datasets including WikiText [18], BookCorpus [36], SST-2 [24], AG News [33], GPT-4-LLM [19], and Databricks-Dolly-15k [6]. |
| Dataset Splits | No | The paper mentions using a specific number of examples for calculating HFR and ATS and provides training parameters like learning rate and batch size for model fine-tuning. However, it does not explicitly detail the train/validation/test dataset splits by percentages or counts, nor does it refer to predefined splits for all datasets used beyond their general citation. |
| Hardware Specification | Yes | Average Time is tested on a single RTX-4090 with the same batch size 32 for different methods. |
| Software Dependencies | No | The paper mentions using specific software components like 'P-Tuning-V2', 'GPT-2', and 'Hugging Face'. However, it does not provide specific version numbers for these or any other programming languages, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes | The learning rate, batch size, and training epoch are set to 2e-5, 32, and 4, respectively. For LT-Defense, hyperparameters λ1, λ2, ts1, ts2, and ts3 are defined, with λ1 = 0.02 and λ2 = 0.98. Table 5 lists thresholds for different model architectures, and ts2 is set to 0.01 for token flipping and 0.001 for token prediction tasks. |
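
The values reported in the Experiment Setup row can be collected into a small configuration object, which may help when attempting a reproduction. The following is a minimal sketch, not the authors' code: the class name `LTDefenseSetup`, the helper `setup_for_task`, and the task-type keys are hypothetical names chosen for illustration. The thresholds ts1 and ts3 are left unset because this summary does not reproduce the values from Table 5 of the paper.

```python
from dataclasses import dataclass
from typing import Optional

# ts2 values reported in the Experiment Setup row, keyed by task type.
# The key strings are hypothetical labels chosen for this sketch.
TS2_BY_TASK = {"token_flipping": 0.01, "token_prediction": 0.001}


@dataclass(frozen=True)
class LTDefenseSetup:
    """Hypothetical container for the hyperparameters listed in the table."""

    # Fine-tuning parameters reported in the table.
    learning_rate: float = 2e-5
    batch_size: int = 32
    epochs: int = 4
    # LT-Defense hyperparameters reported in the table.
    lambda1: float = 0.02
    lambda2: float = 0.98
    ts2: float = TS2_BY_TASK["token_flipping"]
    # ts1 and ts3 depend on the model architecture (Table 5 of the paper)
    # and are not given in this summary, so they stay unset here.
    ts1: Optional[float] = None
    ts3: Optional[float] = None


def setup_for_task(task: str) -> LTDefenseSetup:
    """Return a setup whose ts2 matches the given task type."""
    if task not in TS2_BY_TASK:
        raise ValueError(f"unknown task type: {task!r}")
    return LTDefenseSetup(ts2=TS2_BY_TASK[task])
```

Calling `setup_for_task("token_prediction")` yields a configuration with `ts2` set to 0.001 and every other field at its reported value. Anyone reproducing the experiments would still need to fill in `ts1` and `ts3` from the paper itself.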