Defending against Backdoor Attacks in Natural Language Generation
Authors: Xiaofei Sun, Xiaoya Li, Yuxian Meng, Xiang Ao, Lingjuan Lyu, Jiwei Li, Tianwei Zhang
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, by giving a formal definition of backdoor attack and defense, we investigate this problem on two important NLG tasks, machine translation and dialog generation. Tailored to the inherent nature of NLG models (e.g., producing a sequence of coherent words given contexts), we design defending strategies against attacks. We find that testing the backward probability of generating sources given targets yields effective defense performance against all different types of attacks, and is able to handle the one-to-many issue in many NLG tasks such as dialog generation. *(A minimal sketch of this backward-probability check appears after the table.)* |
| Researcher Affiliation | Collaboration | Xiaofei Sun (1), Xiaoya Li (2), Yuxian Meng (2), Xiang Ao (3), Lingjuan Lyu (4), Jiwei Li (1,2) and Tianwei Zhang (5); 1: Zhejiang University, 2: Shannon.AI, 3: Chinese Academy of Sciences, 4: Sony AI, 5: Nanyang Technological University |
| Pseudocode | No | The paper describes methods and equations but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper provides links to third-party toolkits used in the work, such as Fairseq (https://github.com/pytorch/fairseq) and SacreBLEU (https://github.com/mjpost/sacrebleu). However, there is no explicit statement or link indicating that the authors' own source code for the described methodology or experiments is publicly available. *(A minimal SacreBLEU usage sketch appears after the table.)* |
| Open Datasets | Yes | For MT, we use the constructed IWSLT-2014 English-German and WMT-2014 English-German benchmarks. [...] We use OpenSubtitles2012 (Tiedemann 2012), a widely-used open-domain dialog dataset for benchmark construction. |
| Dataset Splits | Yes | We take the original train, valid and test sets as the corresponding clean sets D^train_clean, D^valid_clean and D^test_clean. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments, such as GPU/CPU models, memory, or cloud instance types. |
| Software Dependencies | No | The paper mentions using Fairseq (Ott et al. 2019) and the Adam optimizer, but does not provide specific version numbers for these or other software dependencies like Python, PyTorch, or specific libraries. |
| Experiment Setup | Yes | For the IWSLT2014 En-De dataset, we train the model with warmup and max-tokens respectively set to 4096 and 30000. The learning rate is set to 1e-4. Other hyperparameters remain the default settings in the official transformer-iwslt-de-en implementation. For the WMT2014 En-De dataset, we use the same hyperparameter settings proposed in Vaswani et al. (2017b). For training, we use cross entropy with 0.1 smoothing and Adam (β=(0.9, 0.98), ϵ=1e-9) as the optimizer. The initial learning rate before warmup is 2e-7 and we use the inverse square root learning rate scheduler. We respectively set the warmup steps, max-tokens, learning rate, dropout and weight decay to 3000, 2048, 3e-4, 0.1 and 0.0002. *(A sketch mapping the IWSLT settings onto fairseq-train flags appears after the table.)* |
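
The defense quoted in the Research Type row scores how plausibly a backward (target-to-source) model regenerates the original source; abnormally low scores suggest a triggered output. The sketch below is a minimal illustration of that idea, assuming a backward model that exposes per-step logits over the source vocabulary; the interface, the length normalisation, and the threshold are assumptions, not the authors' released implementation.

```python
# Minimal sketch of the backward-probability check, assuming a backward
# (target -> source) model that exposes per-step logits over the source
# vocabulary. The scoring rule, length normalisation and threshold are
# illustrative assumptions, not the authors' released implementation.
import torch
import torch.nn.functional as F


def backward_log_prob(logits: torch.Tensor, source_ids: torch.Tensor) -> float:
    """Length-normalised log P(source | target) from per-step logits.

    logits:     (src_len, vocab) scores from a backward (tgt -> src) model
    source_ids: (src_len,) gold source token ids
    """
    log_probs = F.log_softmax(logits, dim=-1)                 # (src_len, vocab)
    token_lp = log_probs.gather(1, source_ids.unsqueeze(1))   # (src_len, 1)
    return (token_lp.sum() / source_ids.numel()).item()


def is_suspicious(logits: torch.Tensor, source_ids: torch.Tensor,
                  threshold: float = -6.0) -> bool:
    """Flag inputs whose backward score falls below a threshold; the value
    here is a placeholder that would be calibrated on clean validation data."""
    return backward_log_prob(logits, source_ids) < threshold


if __name__ == "__main__":
    # Random tensors stand in for real backward-model outputs.
    vocab, src_len = 100, 7
    fake_logits = torch.randn(src_len, vocab)
    fake_source = torch.randint(0, vocab, (src_len,))
    print(backward_log_prob(fake_logits, fake_source),
          is_suspicious(fake_logits, fake_source))
```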
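
The Open Source Code row cites SacreBLEU as one of the linked toolkits. A minimal usage sketch, assuming the standard `sacrebleu` Python package from the linked repository; the sentences are placeholders.

```python
# Minimal SacreBLEU usage sketch (https://github.com/mjpost/sacrebleu), the
# evaluation toolkit the paper links to; the sentences are placeholders.
import sacrebleu

hypotheses = ["the cat sat on the mat"]
references = [["the cat is sitting on the mat"]]  # one inner list per reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")
```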
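
The Experiment Setup row lists the IWSLT2014 En-De hyperparameters in prose. The sketch below maps them onto standard fairseq-train flags; the data directory, the use of the inverse square root scheduler for this dataset, and the exact flag-to-prose mapping are assumptions, with unlisted options left at the fairseq defaults as the quote states.

```python
# Hedged reconstruction of the quoted IWSLT2014 En-De settings as fairseq-train
# arguments. The data directory, the choice of the inverse square root scheduler
# for this dataset, and the flag-to-prose mapping are assumptions; unlisted
# options fall back to fairseq defaults, as the quote states.
import shlex

iwslt_en_de_cmd = [
    "fairseq-train", "data-bin/iwslt14_en_de",   # assumed preprocessed data dir
    "--arch", "transformer_iwslt_de_en",         # the "official transformer-iwslt-de-en" recipe
    "--optimizer", "adam",
    "--adam-betas", "(0.9, 0.98)",
    "--adam-eps", "1e-9",
    "--criterion", "label_smoothed_cross_entropy",
    "--label-smoothing", "0.1",
    "--lr", "1e-4",
    "--lr-scheduler", "inverse_sqrt",
    "--warmup-updates", "4096",
    "--max-tokens", "30000",
]
print(shlex.join(iwslt_en_de_cmd))  # paste the printed command into a shell
```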