TextShield: Beyond Successfully Detecting Adversarial Sentences in Text Classification
Authors: Lingfeng Shen, Ze Zhang, Haiyun Jiang, Ying Chen
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments show that (a) TextShield consistently achieves higher or comparable performance than state-of-the-art defense methods across various attacks on different benchmarks, and (b) our saliency-based detector outperforms existing detectors for detecting adversarial sentences. |
| Researcher Affiliation | Collaboration | 1Johns Hopkins University 2Tsinghua University 3Tencent AI Lab 4College of Information and Electrical Engineering, China Agricultural University |
| Pseudocode | No | The paper describes procedures and equations but does not include formal pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or a link to its open-source code for the described methodology. |
| Open Datasets | Yes | Moreover, we choose three popular benchmarks in text classification: IMDB (Potts, 2010), AG's News (Zhang et al., 2015) and Yahoo! Answers (Zhang et al., 2015). |
| Dataset Splits | Yes | Balanced data setup: The adversarial data and the same-size benign data are mixed as balanced data. Then, the balanced data is split into train-dev-test sets with a 7:2:1 proportion (a minimal split sketch is given below the table). |
| Hardware Specification | Yes | On one RTX3090 GPU, the victim model is selected as BERT-base-uncased. |
| Software Dependencies | No | The paper mentions specific tools and models such as NLTK (Loper & Bird, 2002), TextCNN (Kim, 2014), LSTM (Hochreiter & Schmidhuber, 1997), BERT (Devlin et al., 2019), and the Adam optimizer, but it does not specify version numbers for the software dependencies or programming languages (e.g., Python, PyTorch versions) needed for reproduction. |
| Experiment Setup | Yes | The learnable parameters in our saliency-based detectors are the ones in the four LSTMs of the detector and the two-layer MLP of the combiner. After tokenization, we either conduct padding with max length = 128 or do a truncation for each input sentence... hidden size = 128... input dim = 128, intermediate layer dim = 64 and output dim = 2. In addition, the LSTMs and the two-layer MLP are simultaneously trained through the Adam optimizer with a 5e-4 learning rate... The batch size is set to 8. (A hedged configuration sketch follows the table.) |
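
The 7:2:1 balanced split described in the "Dataset Splits" row maps to a short preprocessing step. The following is a minimal sketch, assuming `adv_examples` and `benign_examples` are hypothetical lists of sentences (neither name comes from the paper); it mixes equal-sized adversarial and benign sets and cuts them into train/dev/test at a 7:2:1 proportion.

```python
# Minimal sketch of the balanced-data protocol quoted above.
# `adv_examples` / `benign_examples` are hypothetical lists of sentences;
# labels mark adversarial (1) vs. benign (0) inputs for the detector.
import random

def balanced_split(adv_examples, benign_examples, seed=0):
    rng = random.Random(seed)
    # Mix the adversarial data with the same amount of benign data.
    benign_sample = rng.sample(benign_examples, len(adv_examples))
    data = [(x, 1) for x in adv_examples] + [(x, 0) for x in benign_sample]
    rng.shuffle(data)

    # Split into train-dev-test sets with a 7:2:1 proportion.
    n = len(data)
    n_train, n_dev = int(0.7 * n), int(0.2 * n)
    train = data[:n_train]
    dev = data[n_train:n_train + n_dev]
    test = data[n_train + n_dev:]
    return train, dev, test
```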
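
The "Experiment Setup" row can likewise be read as a concrete detector configuration. Below is a hedged PyTorch sketch, not the authors' implementation: it wires four LSTMs with hidden size 128 into a two-layer MLP combiner (input dim 128, intermediate dim 64, output dim 2) and trains with Adam at a 5e-4 learning rate. How the paper routes saliency features to the four LSTMs is not quoted here, so the `features` tensor layout and the averaging step are assumptions.

```python
# Hedged sketch of the detector configuration quoted above. The saliency
# features from the paper are replaced by a generic `features` tensor of
# shape (batch, 4, seq_len, input_dim), one stream per LSTM.
import torch
import torch.nn as nn

class SaliencyDetectorSketch(nn.Module):
    def __init__(self, input_dim=128, hidden_size=128):
        super().__init__()
        # Four LSTMs with hidden size 128, as stated in the setup.
        self.lstms = nn.ModuleList(
            nn.LSTM(input_dim, hidden_size, batch_first=True) for _ in range(4)
        )
        # Two-layer MLP combiner: 128 -> 64 -> 2.
        self.combiner = nn.Sequential(
            nn.Linear(hidden_size, 64), nn.ReLU(), nn.Linear(64, 2)
        )

    def forward(self, features):
        # features: (batch, 4, seq_len, input_dim)
        outs = []
        for i, lstm in enumerate(self.lstms):
            _, (h_n, _) = lstm(features[:, i])  # final hidden state per stream
            outs.append(h_n[-1])
        # Average the four streams (keeps the MLP input dim at 128) and classify.
        return self.combiner(torch.stack(outs).mean(dim=0))

model = SaliencyDetectorSketch()
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)  # learning rate from the paper
```

Training would then iterate over the balanced split above with batch size 8 and inputs padded or truncated to length 128, per the setup quoted in the table.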