Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Evasion and Hardening of Tree Ensemble Classifiers

Authors: Alex Kantchelian, J. D. Tygar, Anthony D. Joseph

ICML 2016 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On a digit recognition task, we demonstrate that both gradient boosted trees and random forests are extremely susceptible to evasions. Finally, we harden a boosted tree model without loss of predictive accuracy by augmenting the training set of each boosting round with evading instances, a technique we call adversarial boosting. (This idea is sketched below the table.)
Researcher Affiliation | Academia | Alex Kantchelian EMAIL; J. D. Tygar EMAIL; Anthony D. Joseph EMAIL; University of California, Berkeley
Pseudocode | Yes | Algorithm 1: Coordinate Descent for Problem (1). (A generic coordinate-descent sketch appears below the table.)
Open Source Code | No | The paper discusses third-party tools such as XGBoost, scikit-learn, Gurobi, and Theano, but provides no explicit statement about, or link to, the authors' own implementation of the described methodology.
Open Datasets | Yes | We choose digit recognition over the MNIST (LeCun et al.) dataset as our benchmark classification task
Dataset Splits | No | The paper states: 'Our training and testing sets respectively include 11,876 and 1,990 images' and that the authors 'tune the hyper-parameters so as to minimize the error on the testing set directly'. The test set is thus used for tuning, and no separate validation split is specified.
Hardware Specification | Yes | Unlike BDT, BDT-R is extremely challenging to optimally evade using the MILP solver: the branch-and-bound search continues to expand nodes after 1 day on a 6-core Xeon 3.2 GHz machine.
Software Dependencies | Yes | We use the Gurobi (Gurobi Optimization, 2015) solver to compute the optimal evasions for all distances and all models but NN and RBF-SVM. (A minimal gurobipy sketch appears below the table.)
Experiment Setup | Yes | Table 1 summarizes the 7 benchmarked models with their salient hyper-parameters and error rates on the testing set. For example: 'BDT 1,000 trees, depth 4, η = 0.02', 'RF 80 trees, max. depth 22', 'RBF-SVM γ = 0.04, C = 1'. Additionally, 'Here, we use B = 28, the size of the picture diagonal, as our budget.' (These hyper-parameters are mirrored in the sketch below the table.)
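
The adversarial boosting technique quoted in the Research Type row can be illustrated compactly. Below is a minimal sketch, not the authors' implementation: for simplicity it retrains an XGBoost ensemble from scratch with one extra tree per round, whereas the paper augments the training set of each boosting round, and the evasion oracle `find_evasions` is a hypothetical placeholder for the paper's evasion attacks.

```python
# Minimal sketch of adversarial boosting. Assumptions: retraining from scratch
# each round stands in for true per-round augmentation, and `find_evasions`
# is a hypothetical evasion oracle (the paper computes evasions exactly via
# MILP, or approximately via its coordinate-descent attack).
import numpy as np
from xgboost import XGBClassifier

def adversarial_boosting(X, y, rounds, find_evasions):
    """Grow a boosted ensemble, augmenting the data with evading instances.

    find_evasions(model, X, y) should return perturbed inputs that the
    current model misclassifies, paired with their true labels.
    """
    X_aug, y_aug = X.copy(), y.copy()
    model = None
    for t in range(1, rounds + 1):
        model = XGBClassifier(n_estimators=t, max_depth=4, learning_rate=0.02)
        model.fit(X_aug, y_aug)
        X_ev, y_ev = find_evasions(model, X, y)
        if len(X_ev) > 0:
            X_aug = np.vstack([X_aug, X_ev])
            y_aug = np.concatenate([y_aug, y_ev])
    return model
```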
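
Similarly, the Pseudocode row names a coordinate-descent routine (Algorithm 1) for the paper's Problem (1), whose objective is not reproduced in this report. Purely to illustrate the per-coordinate update pattern, here is a generic greedy coordinate descent over a black-box objective; the candidate-value set and stopping rule are assumptions, not the paper's.

```python
# Generic greedy coordinate descent (illustrative only; not the paper's
# Algorithm 1, whose objective and update rule are specific to Problem (1)).
import numpy as np

def coordinate_descent(objective, x0, candidates, n_passes=10):
    """Sweep coordinates, setting each to its best candidate value in turn."""
    x = np.array(x0, dtype=float)
    for _ in range(n_passes):
        improved = False
        for i in range(len(x)):
            best_v, best_obj = x[i], objective(x)
            for v in candidates:
                x[i] = v
                obj = objective(x)
                if obj < best_obj:
                    best_v, best_obj = v, obj
                    improved = True
            x[i] = best_v  # keep the best value found for this coordinate
        if not improved:
            break  # a full pass produced no improvement
    return x
```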
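
The Software Dependencies row indicates that optimal evasions were computed with Gurobi. The paper's full MILP encoding of a tree ensemble is not reproduced here; the gurobipy fragment below only sketches the outer shell, a box-constrained 784-dimensional perturbation with an L1-distance objective, and elides the leaf-indicator binaries and ensemble-score constraints that make the problem a true MILP. All variable names and dimensions are illustrative assumptions.

```python
# Skeleton of an evasion MILP in gurobipy (illustrative; the paper's actual
# encoding of tree-ensemble predictions is elided below).
import gurobipy as gp
from gurobipy import GRB

d = 784                      # assumed input dimension (e.g. 28x28 pixels)
x0 = [0.0] * d               # original instance (placeholder values)

m = gp.Model("evasion")
x = m.addVars(d, lb=0.0, ub=1.0, name="x")        # perturbed input
delta = m.addVars(d, lb=0.0, name="delta")        # per-pixel |x - x0|
for i in range(d):
    m.addConstr(delta[i] >= x[i] - x0[i])
    m.addConstr(delta[i] >= x0[i] - x[i])
# ... binary leaf-indicator variables and constraints forcing the ensemble
#     score to cross the decision boundary would be added here ...
m.setObjective(gp.quicksum(delta[i] for i in range(d)), GRB.MINIMIZE)
m.optimize()
```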
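
Finally, because the paper's stated dependencies include XGBoost and scikit-learn, the Table 1 hyper-parameters quoted in the Experiment Setup row map naturally onto those libraries. The mapping below is an assumption for illustration, not the authors' configuration code.

```python
# Hypothetical instantiation of three of the seven benchmarked models, using
# the hyper-parameters quoted from Table 1.
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from xgboost import XGBClassifier

bdt = XGBClassifier(n_estimators=1000, max_depth=4, learning_rate=0.02)  # BDT: 1,000 trees, depth 4, eta = 0.02
rf = RandomForestClassifier(n_estimators=80, max_depth=22)               # RF: 80 trees, max. depth 22
rbf_svm = SVC(kernel="rbf", gamma=0.04, C=1.0)                           # RBF-SVM: gamma = 0.04, C = 1
```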