How Should Pre-Trained Language Models Be Fine-Tuned Towards Adversarial Robustness?

Authors: Xinshuai Dong, Anh Tuan Luu, Min Lin, Shuicheng Yan, Hanwang Zhang

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that RIFT consistently outperforms the state-of-the-arts on two popular NLP tasks: sentiment analysis and natural language inference, under different attacks across various pre-trained language models.
Researcher Affiliation | Collaboration | Xinshuai Dong, Nanyang Technological University & Sea AI Lab, dongxinshuai@outlook.com; Luu Anh Tuan, Nanyang Technological University, anhtuan.luu@ntu.edu.sg; Min Lin, Sea AI Lab, linmin@sea.com; Shuicheng Yan, Sea AI Lab, yansc@sea.com; Hanwang Zhang, Nanyang Technological University, hanwangzhang@ntu.edu.sg
Pseudocode | Yes | Algorithm 1 RIFT. Input: dataset D, hyper-parameters of AdamW [43]. Output: the model parameters θ and φ. (A schematic interface sketch follows the table.)
Open Source Code | Yes | Our code will be available at https://github.com/dongxinshuai/RIFT-NeurIPS2021.
Open Datasets | Yes | Tasks and Datasets: We evaluate the robust accuracy and compare our method with the state-of-the-arts on: (i) Sentiment analysis using the IMDB dataset [44]. (ii) Natural language inference using the SNLI dataset [6]. (A loading sketch follows the table.)
Dataset Splits | No | The paper mentions using a “testset” for evaluation and states that “Early stopping is used for all compared methods according to best robust accuracy”, which implies a validation set, but it does not specify explicit train/validation/test splits, percentages, or sample counts for the validation portion.
Hardware Specification | Yes | All experiments are run on one NVIDIA A100 GPU.
Software Dependencies | No | The paper mentions using “AdamW [43]” as the optimizer and BERT/RoBERTa models, but does not specify version numbers for any programming languages or libraries (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | For both BERT and RoBERTa, we set the initial learning rate as 2e-5 and batch size as 16. We use AdamW [43] as the optimizer... For the training of RIFT, we fine-tune for 10 epochs on IMDB and 5 epochs on SNLI. To train on bigger batch sizes, we apply gradient accumulation by 4. For fair comparisons, all compared adversarial fine-tuning methods use the same β on a same dataset, i.e., β = 10 on IMDB and β = 5 on SNLI... We set τ as 0.2 for all score functions f_y. For best robust accuracy, α is chosen as 0.1 and 0.7 on IMDB and SNLI respectively. (A configuration sketch follows the table.)
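
To make the Pseudocode row concrete, below is a minimal, hypothetical skeleton that matches only the quoted interface of Algorithm 1 (input: dataset D and AdamW hyper-parameters; output: parameters θ and φ). The attack and the RIFT objective are placeholder stubs with assumed names (generate_adversarial, rift_loss); they are not the authors' implementation, which lives in the repository linked in the Open Source Code row.

    # Hypothetical skeleton matching Algorithm 1's stated interface only
    # (dataset D and AdamW hyper-parameters in; parameters theta and phi out).
    # The attack and the RIFT objective below are placeholders, not the authors' code.
    import torch
    from torch.optim import AdamW

    def generate_adversarial(model, batch):
        """Placeholder for the textual attack used during fine-tuning (assumed name)."""
        return batch

    def rift_loss(model, aux_head, clean_batch, adv_batch):
        """Placeholder for the RIFT objective over theta (model) and phi (aux_head)."""
        raise NotImplementedError

    def fine_tune(model, aux_head, dataloader, adamw_kwargs, epochs):
        # Jointly optimize the encoder/classifier (theta) and auxiliary head (phi).
        optimizer = AdamW(list(model.parameters()) + list(aux_head.parameters()),
                          **adamw_kwargs)
        for _ in range(epochs):
            for batch in dataloader:
                adv_batch = generate_adversarial(model, batch)
                loss = rift_loss(model, aux_head, batch, adv_batch)
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
        # Return theta (model parameters) and phi (auxiliary parameters), per Algorithm 1.
        return model.state_dict(), aux_head.state_dict()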
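
For the Open Datasets and Dataset Splits rows, both corpora are publicly available, for example through the Hugging Face datasets library. The paper does not state how the data were loaded or how a validation set was constructed for early stopping, so the library choice and the 10% hold-out below are assumptions for illustration only.

    # Minimal sketch: loading IMDB and SNLI with the Hugging Face `datasets` library.
    # The library choice and the 10% validation hold-out are assumptions; the paper
    # does not state how its validation data for early stopping were constructed.
    from datasets import load_dataset

    imdb = load_dataset("imdb")   # sentiment analysis; no official validation split
    snli = load_dataset("snli")   # natural language inference; train/validation/test

    # IMDB has no validation split, so early stopping would require a hold-out from
    # the training data; the 10% fraction here is purely illustrative.
    imdb_splits = imdb["train"].train_test_split(test_size=0.1, seed=42)
    imdb_train, imdb_val = imdb_splits["train"], imdb_splits["test"]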
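
For the Experiment Setup row, the quoted hyper-parameters are gathered below into a single configuration sketch, together with an update step matching "gradient accumulation by 4". Only the numeric values come from the paper; the key names, code structure, and any AdamW settings beyond the learning rate are assumptions.

    # Reported hyper-parameters collected into one configuration (values from the
    # quoted setup; key names and structure are illustrative conventions).
    from torch.optim import AdamW

    COMMON = {"lr": 2e-5, "batch_size": 16, "grad_accum_steps": 4, "tau": 0.2}
    PER_DATASET = {
        "imdb": {"epochs": 10, "beta": 10, "alpha": 0.1},
        "snli": {"epochs": 5,  "beta": 5,  "alpha": 0.7},
    }

    def make_optimizer(model):
        # AdamW at the reported initial learning rate; other AdamW arguments
        # (weight decay, betas, schedule) are not specified in the paper.
        return AdamW(model.parameters(), lr=COMMON["lr"])

    def training_step(model, optimizer, micro_batches, loss_fn):
        """One optimizer update accumulated over 4 micro-batches of size 16
        (effective batch size 64, matching 'gradient accumulation by 4')."""
        optimizer.zero_grad()
        for batch in micro_batches:  # len(micro_batches) == COMMON["grad_accum_steps"]
            loss = loss_fn(model, batch) / COMMON["grad_accum_steps"]
            loss.backward()
        optimizer.step()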