Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Towards Understanding the Robustness Against Evasion Attack on Categorical Data

Authors: Hongyan Bao, Yufei Han, Yujun Zhou, Yun Shen, Xiangliang Zhang

ICLR 2022

Each entry below gives a reproducibility variable, its classified result, and the LLM response quoted as evidence.
Research Type: Experimental. LLM response: "Corroborating these theoretical findings with a substantial experimental study over various real-world categorical datasets, we can empirically assess the impact of the key adversarial risk factors over a targeted learning system with categorical inputs." and "4 EXPERIMENTAL STUDY: We instantiate the study with standard LSTM-based classifiers trained on three multi-class datasets collected from real-world applications of text analysis, cyber security and biomedicine."
Researcher Affiliation: Collaboration. LLM response: "Hongyan Bao, King Abdullah University of Science and Technology, EMAIL; Yufei Han, INRIA, EMAIL; Yujun Zhou, King Abdullah University of Science and Technology, EMAIL; Yun Shen, NetApp, EMAIL; Xiangliang Zhang, University of Notre Dame, EMAIL" and "The author contributed to this work while at NortonLifeLock."
Pseudocode: Yes. LLM response: "We give the pseudo-codes of FSGS and RandGS in Algorithms 1 and 2 in Appendix D." and "The pseudo-code of OMPGS is presented in Algorithm 3, which explains how it is adopted to solve Eq. 4 and Eq. 5."
Open Source Code: Yes. LLM response: "Implementations are available at https://github.com/XYZ211923Y/-Robust XXXXX."
Open Datasets: Yes. LLM response: "Yelp-5 (Yelp) (Asghar, 2016).", "Intrusion Prevention System Dataset (IPS) (Wang et al., 2020).", "Splice-junction Gene Sequences (Splice) (Noordewier et al., 1991).", and "Electronic Health Records (EHR) (Ma et al., 2018)."
Dataset Splits: No. LLM response: "We randomly select 80% of each dataset for training and others for testing." and "We split randomly each dataset into two non-overlapped subsets: 80% of them are used for training and the left 20% form a testing set." The paper explicitly states a training/testing split but does not specify a separate validation split.
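The 80/20 random partition the quotes describe can be sketched as follows; the function name, seed, and toy data are illustrative, not from the paper.

```python
import random

def split_dataset(samples, train_frac=0.8, seed=0):
    """Randomly partition samples into non-overlapping train/test subsets,
    mirroring the 80%/20% split described in the paper (no validation set)."""
    rng = random.Random(seed)  # fixed seed only for reproducibility of this sketch
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    cut = int(len(samples) * train_frac)
    train = [samples[i] for i in indices[:cut]]
    test = [samples[i] for i in indices[cut:]]
    return train, test

train, test = split_dataset(list(range(100)))  # 80 training, 20 testing samples
```

Because no validation split is reported, any hyperparameter tuning would presumably reuse the training or testing portion.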
Hardware Specification: Yes. LLM response: "We implement the empirical study using the Python library PyTorch and conduct all the experiments on a Linux server with 2 GPUs (GeForce 1080Ti) and a 16-core CPU (Intel Xeon)."
Software Dependencies: No. LLM response: "We implement the empirical study using the Python library PyTorch." The paper mentions PyTorch but does not provide version numbers for any software dependency.
Experiment Setup: Yes. LLM response: "The LSTM-based classifiers with ReLU activation function and dropout achieve accuracy scores of 0.61, 0.92 and 0.95 respectively for Yelp, IPS and Splice." and "The tolerance threshold Γ is tested on 0.4 and 0 to assess our proposed assessment method with varied tolerance to adversarial threats in safety-critical applications." and "For RandGS and OMPGS, we empirically set the number of candidate attributes in each iteration of greedy search to be 10 globally for all the datasets."
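To make the greedy-search setting concrete, here is a minimal sketch of one iteration over candidate categorical attributes. Only the idea of evaluating a small candidate set per iteration (10 in the paper) comes from the quotes; the function names, the toy score function, and the candidate list are hypothetical, and this is not the authors' FSGS/RandGS/OMPGS implementation.

```python
def greedy_step(x, candidates, score):
    """One greedy iteration: try substituting each candidate (index, value)
    pair into the categorical input x and keep the single change that most
    increases the attack score. `score` is a hypothetical stand-in for the
    attacker's objective on the target classifier."""
    best_x, best_score = list(x), score(x)
    for idx, val in candidates:
        trial = list(x)
        trial[idx] = val  # flip one categorical attribute
        s = score(trial)
        if s > best_score:
            best_x, best_score = trial, s
    return best_x, best_score

# Toy usage: sum() stands in for the attack objective.
adv, s = greedy_step([0, 0, 0], candidates=[(0, 1), (1, 2)], score=sum)
```

In the paper's setting, `candidates` would hold the top 10 attributes selected per iteration, and the step would repeat until the perturbation budget or tolerance threshold Γ is reached.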