An Efficient Adversarial Attack for Tree Ensembles
Authors: Chong Zhang, Huan Zhang, Cho-Jui Hsieh
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on several large GBDT and RF models with up to hundreds of trees demonstrate that our method can be thousands of times faster than the previous mixed-integer linear programming (MILP) based approach, while also providing smaller (better) adversarial examples than decision-based black-box attacks on general ℓp (p = 1, 2, ∞) norm perturbations. Our code is available at https://github.com/chong-z/tree-ensemble-attack. |
| Researcher Affiliation | Academia | Chong Zhang Huan Zhang Cho-Jui Hsieh Department of Computer Science, UCLA chongz@cs.ucla.edu, huan@huan-zhang.com, chohsieh@cs.ucla.edu |
| Pseudocode | Yes | Algorithm 1: Our proposed LT-Attack for constructing adversarial examples. |
| Open Source Code | Yes | Our code is available at https://github.com/chong-z/tree-ensemble-attack. |
| Open Datasets | Yes | We evaluate the proposed algorithm on 9 public datasets (Smith et al., 1988; Lecun et al., 1998; Chang, Lin, 2011; Wang et al., 2012; Baldi et al., 2014; Xiao et al., 2017; Dua, Graff, 2017) with both the standard (natural) GBDT and RF models, and on an additional 10th dataset (Bosch, 2016) with the robustly trained GBDT. |
| Dataset Splits | No | The paper refers to 'training data size' and 'test examples' but does not explicitly provide specific training/validation/test dataset splits (e.g., percentages or sample counts) for reproducibility. |
| Hardware Specification | No | The paper mentions '20 threads per task' and discusses the runtime of various methods, but it does not provide specific hardware details such as CPU/GPU models or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of 'XGBoost framework' and 'Gurobi Solver' but does not specify their version numbers, which is required for reproducibility. |
| Experiment Setup | Yes | For the initial point, we draw 20 random initial adversarial examples from a Gaussian distribution, and optimize with a fine-grained binary search before feeding to the proposed algorithm. We return the best adversarial example found among them (see Appendix D.1 for details). ... To escape converged local minimums we change each coordinate to a nearby value from Gaussian distribution with 0.1 probability, and continue the iteration if a better adversarial example was found within 300 trials. |
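The initialization procedure quoted above (draw random Gaussian candidates, keep the misclassified ones, and refine each with a fine-grained binary search toward the original input) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the ℓ2 distance choice, and the `predict` interface are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def binary_search_boundary(x, x_adv, predict, y_true, tol=1e-6):
    """Shrink x_adv toward x along the line segment while it stays
    adversarial (hypothetical helper, not from the paper's code)."""
    lo, hi = 0.0, 1.0  # interpolation coefficients: 0 -> x, 1 -> x_adv
    while hi - lo > tol:
        mid = (lo + hi) / 2
        cand = x + mid * (x_adv - x)
        if predict(cand) != y_true:
            hi = mid  # still adversarial: move closer to x
        else:
            lo = mid  # crossed back to the correct class: back off
    return x + hi * (x_adv - x)

def initial_adversarial(x, predict, y_true, n_init=20, scale=1.0):
    """Draw Gaussian candidates around x, refine each misclassified one
    by binary search, and return the closest (here in l2) to x."""
    best, best_dist = None, np.inf
    for _ in range(n_init):
        cand = x + scale * rng.standard_normal(x.shape)
        if predict(cand) == y_true:
            continue  # not adversarial, discard
        refined = binary_search_boundary(x, cand, predict, y_true)
        d = np.linalg.norm(refined - x)
        if d < best_dist:
            best, best_dist = refined, d
    return best
```

The binary search exploits that tree ensembles give piecewise-constant predictions, so any point on the adversarial side of the segment arbitrarily close to the decision boundary remains a valid adversarial example.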