Towards Robustness Against Natural Language Word Substitutions
Authors: Xinshuai Dong, Anh Tuan Luu, Rongrong Ji, Hong Liu
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that ASCC-defense outperforms the current state-of-the-arts in terms of robustness on two prevailing NLP tasks, i.e., sentiment analysis and natural language inference, concerning several attacks across multiple model architectures. Experimental results show that our method consistently yields models that are more robust than the state-of-the-arts with significant margins; e.g., we achieve 79.0% accuracy under Genetic attacks on IMDB while the state-of-the-art performance is 75.0%. |
| Researcher Affiliation | Collaboration | Xinshuai Dong (Nanyang Technological University, Singapore; dongxinshuai@outlook.com); Anh Tuan Luu (Nanyang Technological University, Singapore, and VinAI Research, Vietnam; anhtuan.luu@ntu.edu.sg); Rongrong Ji (Xiamen University, China; rrji@xmu.edu.cn); Hong Liu (National Institute of Informatics, Japan; hliu@nii.ac.jp) |
| Pseudocode | Yes | Algorithm 1 ASCC-defense. Input: dataset D, parameters of Adam optimizer. Output: parameters θ and φ. 1: repeat 2: for random mini-batch D do 3: for every x, y in the mini-batch (in parallel) do 4: Solve the inner maximization in Eq. 11 to find the optimal ŵ by Adam; 5: Compute v̂(x) by Eq. 10 using ŵ and then compute the inner maximum in Eq. 11; 6: end for 7: Update θ and φ by Adam to minimize the calculated inner maximum; 8: end for 9: until the training converges. (A hedged code sketch of this training loop follows the table.) |
| Open Source Code | Yes | Our code will be available at https://github.com/dongxinshuai/ASCC. |
| Open Datasets | Yes | Tasks and datasets. We focus on two prevailing NLP tasks to evaluate the robustness and compare our method to the state-of-the-arts: (i) Sentiment analysis on the IMDB dataset (Maas et al., 2011). (ii) Natural language inference on the SNLI dataset (Bowman et al., 2015). |
| Dataset Splits | No | The paper mentions training and testing but does not explicitly provide training/validation/test dataset splits with specific percentages, counts, or references to predefined validation splits. |
| Hardware Specification | Yes | All models are trained using the GeForce GTX 1080 GPU. |
| Software Dependencies | No | The paper mentions software like 'NLTK' and 'Adam optimizer' but does not provide specific version numbers for these or any other key software dependencies required for reproducibility. |
| Experiment Setup | Yes | We set α as 10 and β as 4 for the training procedure defined in Eq. 12. To generate adversaries for robust training, we employ the Adam optimizer with a learning rate of 10 and a weight decay of 0.00002 to run for 10 iterations. To update φ and θ, we also employ the Adam optimizer, whose parameters differ between architectures as follows. Architecture parameters: (i) CNN for IMDB: We use a 1-d convolutional layer with a kernel size of 3 to extract features and then make predictions. We set the batch size as 64 and use the Adam optimizer with a learning rate of 0.005 and a weight decay of 0.0002. (ii) Bi-LSTM for IMDB: We use a bi-directional LSTM layer to process the input sequence, and then use the last hidden state to make predictions. We set the batch size as 64 and use the Adam optimizer with a learning rate of 0.005 and a weight decay of 0.0002. (iii) BOW for SNLI: We first sum up the word vectors along the sequence dimension and concatenate the encodings of the premise and the hypothesis. Then we employ an MLP of 3 layers to predict the label. We set the batch size as 512 and use the Adam optimizer with a learning rate of 0.0005 and a weight decay of 0.0002. (iv) DECOMPATTN for SNLI: We first generate context-aware vectors and then employ an MLP of 2 layers to make predictions given the context-aware vectors. We set the batch size as 256 and use Adam with a learning rate of 0.0005 and a weight decay of 0. (A hedged configuration sketch follows the table.) |
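
The Algorithm 1 row above describes the ASCC training loop only at a high level, so the following is a minimal PyTorch-style sketch of that loop. It assumes a classifier `model` that accepts embedded inputs through a hypothetical `inputs_embeds` argument, per-word substitution candidates `sub_ids` with a boolean validity mask `sub_mask`, and the inner-adversary settings quoted in the setup row (Adam, learning rate 10, weight decay 0.00002, 10 iterations). The regularization terms of Eqs. 10-12 (weighted by α and β) are omitted for brevity; this is a sketch under those assumptions, not the authors' implementation, which is available at their repository.

```python
import torch
import torch.nn.functional as F

def ascc_training_step(model, embedding, sub_ids, sub_mask, y, outer_opt,
                       inner_steps=10, inner_lr=10.0, inner_wd=2e-5):
    """One mini-batch of Algorithm 1: inner maximization over combination
    weights w_hat, then an outer update of the model parameters (theta, phi)."""
    # w: unnormalized weights over each word's substitution candidates, shape (B, L, K).
    w = torch.zeros(sub_ids.shape, device=sub_ids.device, requires_grad=True)
    inner_opt = torch.optim.Adam([w], lr=inner_lr, weight_decay=inner_wd)

    # Candidate embeddings are detached during the inner loop: only w is optimized there.
    sub_emb = embedding(sub_ids).detach()                          # (B, L, K, D)
    for _ in range(inner_steps):                                   # inner maximization (Eq. 11 sketch)
        p = F.softmax(w.masked_fill(~sub_mask, -1e9), dim=-1)      # convex weights over candidates
        v_hat = (p.unsqueeze(-1) * sub_emb).sum(dim=2)             # convex combination v_hat(x) (Eq. 10 sketch)
        loss_adv = F.cross_entropy(model(inputs_embeds=v_hat), y)
        inner_opt.zero_grad()
        (-loss_adv).backward()                                     # gradient ascent on the adversarial loss
        inner_opt.step()

    # Recompute the inner maximum with the found w_hat and minimize it w.r.t. theta and phi.
    p = F.softmax(w.detach().masked_fill(~sub_mask, -1e9), dim=-1)
    v_hat = (p.unsqueeze(-1) * embedding(sub_ids)).sum(dim=2)
    loss = F.cross_entropy(model(inputs_embeds=v_hat), y)
    outer_opt.zero_grad()
    loss.backward()
    outer_opt.step()
    return loss.item()
```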
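
For concreteness, the sketch below instantiates the reported CNN-for-IMDB configuration (a 1-d convolution with kernel size 3; batch size 64; Adam with learning rate 0.005 and weight decay 0.0002). The vocabulary size, embedding dimension, hidden width, and pooling choice are assumptions not given in the excerpt above, and the `inputs_embeds` path simply mirrors the hypothetical interface used in the training-step sketch.

```python
import torch
import torch.nn as nn

class CNNClassifier(nn.Module):
    """1-d CNN text classifier with kernel size 3, as reported for IMDB.
    Embedding dim, hidden width, and pooling are assumptions."""
    def __init__(self, vocab_size=50000, embed_dim=300, hidden=100, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, hidden, kernel_size=3, padding=1)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, x_ids=None, inputs_embeds=None):
        h = inputs_embeds if inputs_embeds is not None else self.embedding(x_ids)
        h = torch.relu(self.conv(h.transpose(1, 2)))     # (B, hidden, L)
        return self.fc(h.max(dim=-1).values)             # global max pooling over the sequence

model = CNNClassifier()
# Reported outer-optimizer settings for CNN on IMDB: Adam, lr 0.005, weight decay 0.0002.
outer_opt = torch.optim.Adam(model.parameters(), lr=0.005, weight_decay=0.0002)
```

Here `model.embedding` and `outer_opt` would play the roles of the `embedding` and `outer_opt` arguments in the training-step sketch; the Bi-LSTM, BOW, and DECOMPATTN configurations differ only in the encoder and the optimizer values quoted in the table.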