Towards Semantics- and Domain-Aware Adversarial Attacks
Authors: Jianping Zhang, Yung-Chieh Huang, Weibin Wu, Michael R. Lyu
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Compared with state-of-the-art benchmarks, our strategy can achieve over 3% improvement in attack success rates and 9.8% improvement in the quality of adversarial examples. ... Comprehensive experiments confirm the superiority of our approach over state-of-the-art baselines in both the attack success rates and the quality of generated adversarial samples. ... Section 4 (Experiment) |
| Researcher Affiliation | Academia | (1) Department of Computer Science and Engineering, The Chinese University of Hong Kong; (2) Department of Computer Science, University of Illinois Urbana-Champaign; (3) School of Software Engineering, Sun Yat-sen University |
| Pseudocode | No | The paper describes its algorithms (Iterative Updating Framework, Search Algorithm) in text but does not include any formally labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using "open-source NLP adversarial attack packages TextAttack [Morris et al., 2020] and OpenAttack [Zeng et al., 2021]" for baselines, but it provides no statement or link for its own source code. A hedged TextAttack usage sketch follows the table. |
| Open Datasets | Yes | For the sentiment analysis task, we choose MR [Pang and Lee, 2005], IMDB [Maas et al., 2011], and SST-2 [Socher et al., 2013], which are widely used datasets tailored for binary sentiment classification. For the natural language inference task, we select MNLI [Williams et al., 2018] and SNLI [Bowman et al., 2015] datasets. |
| Dataset Splits | No | The paper mentions "training sets" and a "test set" (Table 2), fine-tuning "on the training sets of MNLI and SNLI", and pre-training "on datasets whose domain is similar to that of the victim model's training data". However, it does not give explicit train/validation/test splits (e.g., percentages or sample counts) or a cross-validation methodology for its experiments. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU/CPU models, memory, or cloud instance specifications. |
| Software Dependencies | No | The paper mentions software such as BERT, DistilBERT, RoBERTa, BiLSTM, TextAttack, and OpenAttack, but does not specify version numbers for these or for ancillary components such as Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | We set the maximum query number to be 300 for the IMDB dataset and 75 for the other datasets due to the difficulty of the IMDB dataset. For our semantics- and domain-aware language model, we choose pre-trained BERT as the architecture. ... We set a threshold of 0.8 for the cosine similarity between USE-based embeddings of the adversarial example and the original input sentence. ... τ is a hyper-parameter that we set to be 0.05. A sketch of the USE similarity check appears after this table. |
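Because the paper builds its baselines on the open-source TextAttack package but releases no code of its own, TextAttack's public API is the closest reproducible starting point. The sketch below runs a standard black-box baseline (TextFooler) against a BERT IMDB classifier under the 300-query budget quoted in the Experiment Setup row; the recipe choice, model checkpoint, dataset split, and example count are illustrative assumptions, not details taken from the paper.

```python
# A hedged sketch of a baseline attack run with TextAttack's public API.
# Only the 300-query budget for IMDB comes from the paper; everything else
# (TextFooler recipe, checkpoint, num_examples) is an illustrative assumption.
import transformers
from textattack import Attacker, AttackArgs
from textattack.attack_recipes import TextFoolerJin2019
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import HuggingFaceModelWrapper

# Victim model: a BERT sentiment classifier fine-tuned on IMDB.
model = transformers.AutoModelForSequenceClassification.from_pretrained(
    "textattack/bert-base-uncased-imdb")
tokenizer = transformers.AutoTokenizer.from_pretrained(
    "textattack/bert-base-uncased-imdb")
victim = HuggingFaceModelWrapper(model, tokenizer)

# TextFooler stands in here for the paper's black-box baselines.
attack = TextFoolerJin2019.build(victim)
dataset = HuggingFaceDataset("imdb", split="test")

# query_budget=300 mirrors the paper's maximum query number for IMDB.
attacker = Attacker(attack, dataset,
                    AttackArgs(num_examples=100, query_budget=300))
attacker.attack_dataset()
```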
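The USE-based similarity constraint from the Experiment Setup row is concrete enough to sketch independently of the authors' code. Below is a minimal, illustrative check assuming the publicly available Universal Sentence Encoder module on TensorFlow Hub; only the 0.8 threshold comes from the paper, while the helper name `is_semantically_close` and the module version are assumptions.

```python
# A minimal sketch of the USE cosine-similarity constraint: an adversarial
# candidate is kept only if its embedding stays within 0.8 cosine similarity
# of the original sentence (threshold from the paper; structure assumed).
import numpy as np
import tensorflow_hub as hub

# Assumed module: USE v4 from TensorFlow Hub.
encoder = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

def is_semantically_close(original: str, adversarial: str,
                          threshold: float = 0.8) -> bool:
    """Return True if the USE embeddings of the two sentences have
    cosine similarity of at least `threshold`."""
    emb = encoder([original, adversarial]).numpy()
    cos = float(np.dot(emb[0], emb[1])
                / (np.linalg.norm(emb[0]) * np.linalg.norm(emb[1])))
    return cos >= threshold

# Example: a single-word substitution that should preserve semantics.
print(is_semantically_close("The movie was wonderful.",
                            "The movie was marvelous."))
```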