BadPre: Task-agnostic Backdoor Attacks to Pre-trained NLP Foundation Models

Authors: Kangjie Chen, Yuxian Meng, Xiaofei Sun, Shangwei Guo, Tianwei Zhang, Jiwei Li, Chun Fan

ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental results indicate that our approach can compromise a wide range of downstream NLP tasks in an effective and stealthy way.
Researcher Affiliation Collaboration Kangjie Chen1, Yuxian Meng2, Xiaofei Sun2, Shangwei Guo3, Tianwei Zhang1, Jiwei Li2,4, and Chun Fan5, 1Nanyang Technological University, 2Shannon.AI, 3Chongqing University, 4Zhejiang University, 5Computer Center of Peking University & Peng Cheng Laboratory
Pseudocode Yes Algorithm 1 (in Appendix) illustrates the details of embedding backdoors into a foundation model, as explained below. ... Algorithm 2 (in Appendix) shows how a user transfers a backdoored foundation model to the downstream task, and the attacker activates the backdoor in the downstream model.
Open Source Code No The paper does not provide concrete access to its own source code, stating only that it used 'open-sourced code' for comparison with RIPPLe.
Open Datasets Yes We selected a public corpora as the clean training data (i.e., English Wikipedia) (Devlin et al., 2018)... We select 8 tasks from the popular General Language Understanding Evaluation (GLUE) benchmark (Wang et al., 2018)... we select SQuAD V2.0 (Rajpurkar et al., 2016)... we select CoNLL-2003 (Sang, 2002).
Dataset Splits Yes We also prepare a validation set containing the clean and malicious samples following the above approach. We keep fine-tuning the model until it achieves the lowest loss on this validation set for both benign and malicious data
Hardware Specification No The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies No The paper mentions using BERT, Hugging Face, and Transformers baselines but does not provide specific version numbers for these software dependencies.
Experiment Setup Yes We pre-train BERT on both clean data and poisoned data for 10 epochs with Adam optimizer of β = (0.9, 0.98), a learning rate of 2e-5 and a batch size of 2048.