Constrained Optimization with Dynamic Bound-scaling for Effective NLP Backdoor Defense

Authors: Guangyu Shen, Yingqi Liu, Guanhong Tao, Qiuling Xu, Zhuo Zhang, Shengwei An, Shiqing Ma, Xiangyu Zhang

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the technique on over 1600 models (with roughly half of them having injected backdoors) on 3 prevailing NLP tasks, with 4 different backdoor attacks and 7 architectures. Our results show that the technique is able to effectively and efficiently detect and remove backdoors, outperforming 5 baseline methods.
Researcher Affiliation | Academia | 1 Department of Computer Science, Purdue University, West Lafayette, IN, USA; 2 Department of Computer Science, Rutgers University, Piscataway, NJ, USA.
Pseudocode | Yes | Algorithm 1: Dynamic Bound-scaling (DBS)
Open Source Code | Yes | The code is available at https://github.com/PurduePAML/DBS.
Open Datasets | Yes | We evaluate our technique on backdoor detection using 1584 transformer models from TrojAI (IARPA, 2020) rounds 6-8 datasets and 120 models from 3 advanced stealthy NLP backdoor attacks. Our SA models are trained on 7 different datasets from Amazon reviews (Ni et al., 2019) and IMDB (Maas et al., 2011b) to output binary predictions (i.e., positive and negative). For NER, we consider the 540 TrojAI round 7 models, of which 180 are from the training set and 360 from the test set. The datasets used to train these NER models include CoNLL-2003 (Tjong Kim Sang & De Meulder, 2003) with 4 named entities, the BBN corpus (Weischedel & Brunstein, 2005) with 4 named entities, and OntoNotes (Hovy et al., 2006) with 6 named entities. For the QA task, we evaluate the 120 and 360 models from the TrojAI round 8 training and test sets, respectively. The QA models are trained on 2 public datasets: SQuAD v2 (Rajpurkar et al., 2016) and SubjQA (Bjerva et al., 2020).
Dataset Splits | No | The paper references
Hardware Specification | Yes | All experiments are done on a machine with a single 24GB memory NVIDIA Quadro RTX 6000 GPU.
Software Dependencies | No | The paper mentions using the "Adam (Kingma & Ba, 2014) optimizer" but does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or specific library versions).
Experiment Setup | Yes | We set the trigger length to 10 (i.e., inverting 10 weight vectors). We set the number of optimization epochs to 200 and use the Adam (Kingma & Ba, 2014) optimizer with an initial learning rate of 0.5. All optimization-related baseline methods share the same configuration. Parameter c controls the temperature reduction rate and d the backtrack rate, usually d > c; in this paper, we use d = 5 and c = 2. We set the temperature upper bound u = 2 to keep it from growing too large. Parameter ϵ controls the random offset. Specifically, inside the main optimization loop (lines 1-14), every s optimization epochs the algorithm checks whether the current inversion loss is smaller than the bound (line 4).
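
The quoted setup describes an optimization loop that periodically reduces the softmax temperature when the inversion loss falls below a bound and backtracks (enlarges the temperature and adds a random offset) when it does not. Below is a minimal PyTorch sketch of such a loop, assuming a generic `model_loss` callable over the soft trigger; the helper names, the default value of s, and the exact bound/offset update rules are illustrative assumptions, not the authors' released implementation (see the repository linked above for the actual code).

```python
# A minimal sketch of a dynamic bound-scaling trigger-inversion loop,
# assuming a differentiable `model_loss` over the soft trigger.
# Hyper-parameter names (c, d, u, s, eps) follow the quoted setup; the
# update rules below are an illustrative reading, not the released code.
import torch

def invert_trigger(model_loss, vocab_size, trigger_len=10, epochs=200,
                   lr=0.5, c=2.0, d=5.0, u=2.0, s=5, eps=0.1, bound=1.0):
    # One weight vector per trigger position; a temperature-scaled softmax
    # over the vocabulary gives a differentiable "soft" trigger.
    w = torch.zeros(trigger_len, vocab_size, requires_grad=True)
    opt = torch.optim.Adam([w], lr=lr)
    t = u  # current softmax temperature

    for epoch in range(1, epochs + 1):
        soft_trigger = torch.softmax(w / t, dim=-1)
        loss = model_loss(soft_trigger)
        opt.zero_grad()
        loss.backward()
        opt.step()

        # Every s epochs, compare the inversion loss against the bound:
        # below the bound -> sharpen the distribution (reduce temperature
        # by rate c) and tighten the bound; otherwise backtrack by enlarging
        # the temperature (rate d > c, capped at u) and nudge the weights
        # with a small random offset controlled by eps.
        if epoch % s == 0:
            if loss.item() < bound:
                t = t / c
                bound = loss.item()
            else:
                t = min(t * d, u)
                with torch.no_grad():
                    w.add_(eps * torch.randn_like(w))

    # Discretize: take the highest-probability token at each trigger position.
    return torch.softmax(w / t, dim=-1).argmax(dim=-1)
```

On one reading, setting d > c makes the backtracking step coarser than the temperature-reduction step, so the optimization can quickly recover when a sharper distribution stalls the loss; this is consistent with the quoted guideline that d should exceed c.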