An Efficient Adversarial Attack for Tree Ensembles

Authors: Chong Zhang, Huan Zhang, Cho-Jui Hsieh

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on several large GBDT and RF models with up to hundreds of trees demonstrate that our method can be thousands of times faster than the previous mixed-integer linear programming (MILP) based approach, while also providing smaller (better) adversarial examples than decision-based black-box attacks on general ℓp (p = 1, 2, ∞) norm perturbations. Our code is available at https://github.com/chong-z/tree-ensemble-attack.
Researcher Affiliation | Academia | Chong Zhang, Huan Zhang, Cho-Jui Hsieh; Department of Computer Science, UCLA; chongz@cs.ucla.edu, huan@huan-zhang.com, chohsieh@cs.ucla.edu
Pseudocode | Yes | Algorithm 1: Our proposed LT-Attack for constructing adversarial examples.
Open Source Code | Yes | Our code is available at https://github.com/chong-z/tree-ensemble-attack.
Open Datasets | Yes | We evaluate the proposed algorithm on 9 public datasets (Smith et al., 1988; Lecun et al., 1998; Chang & Lin, 2011; Wang et al., 2012; Baldi et al., 2014; Xiao et al., 2017; Dua & Graff, 2017) with both the standard (natural) GBDT and RF models, and on an additional 10th dataset (Bosch, 2016) with the robustly trained GBDT.
Dataset Splits | No | The paper refers to 'training data size' and 'test examples' but does not provide explicit training/validation/test splits (e.g., percentages or sample counts) needed for reproducibility.
Hardware Specification | No | The paper mentions '20 threads per task' and discusses the runtime of various methods, but it does not give specific hardware details such as CPU/GPU models or memory used for the experiments.
Software Dependencies | No | The paper mentions the XGBoost framework and the Gurobi solver but does not specify their version numbers, which is required for reproducibility.
Experiment Setup | Yes | For the initial point, we draw 20 random initial adversarial examples from a Gaussian distribution, and optimize with a fine-grained binary search before feeding to the proposed algorithm. We return the best adversarial example found among them (see Appendix D.1 for details). ... To escape converged local minima we change each coordinate to a nearby value from a Gaussian distribution with 0.1 probability, and continue the iteration if a better adversarial example was found within 300 trials.
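The Experiment Setup row above describes a restart-and-escape strategy: 20 Gaussian initial points, a fine-grained binary search toward the decision boundary, the main attack iterations, and randomized coordinate perturbations (probability 0.1, up to 300 trials) to escape converged local minima. The following is a minimal sketch of that strategy only, not the paper's implementation: `model_predict` (the victim ensemble's label function), `lt_attack_refine` (a stand-in for the LT-Attack iterations of Algorithm 1), and the perturbation scale `sigma` are hypothetical placeholders.

```python
import numpy as np

def binary_search_boundary(x, x_adv, model_predict, y_true, steps=20):
    """Move x_adv toward x while keeping it adversarial (fine-grained binary search)."""
    lo, hi = 0.0, 1.0  # interpolation coefficient t in x + t * (x_adv - x)
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if model_predict(x + mid * (x_adv - x)) != y_true:
            hi = mid   # still misclassified: tighten toward the original point
        else:
            lo = mid   # back in the correct class: move outward again
    return x + hi * (x_adv - x)

def attack_with_restarts(x, y_true, model_predict, lt_attack_refine,
                         n_init=20, escape_prob=0.1, escape_trials=300,
                         sigma=0.1, p=np.inf):
    """Gaussian restarts -> binary search -> refinement, with random escape moves."""
    best, best_dist = None, np.inf
    for _ in range(n_init):
        x_adv = x + np.random.randn(*x.shape)          # random Gaussian initial point
        if model_predict(x_adv) == y_true:
            continue                                   # not adversarial, skip this restart
        x_adv = binary_search_boundary(x, x_adv, model_predict, y_true)
        while True:
            x_adv = lt_attack_refine(x, x_adv)         # main attack iterations (placeholder)
            improved = False
            for _ in range(escape_trials):             # up to 300 escape trials
                mask = np.random.rand(*x.shape) < escape_prob   # each coordinate w.p. 0.1
                cand = np.where(mask, x_adv + sigma * np.random.randn(*x.shape), x_adv)
                if model_predict(cand) != y_true:
                    cand = lt_attack_refine(x, cand)
                    if np.linalg.norm(cand - x, ord=p) < np.linalg.norm(x_adv - x, ord=p):
                        x_adv, improved = cand, True
                        break                          # better point found: keep iterating
            if not improved:
                break                                  # converged local minimum, stop
        dist = np.linalg.norm(x_adv - x, ord=p)
        if dist < best_dist:
            best, best_dist = x_adv, dist              # keep the best of the restarts
    return best, best_dist
```

The ℓp distance used for "better" above matches the norms reported in the paper (p = 1, 2, ∞); how the escape move interleaves with the main iterations is an assumption of this sketch, since the exact schedule is described in Appendix D.1 of the paper.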