Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Evasion and Hardening of Tree Ensemble Classifiers

Authors: Alex Kantchelian, J. D. Tygar, Anthony D. Joseph

ICML 2016 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On a digit recognition task, we demonstrate that both gradient boosted trees and random forests are extremely susceptible to evasions. Finally, we harden a boosted tree model without loss of predictive accuracy by augmenting the training set of each boosting round with evading instances, a technique we call adversarial boosting. (This idea is sketched below the table.)
Researcher Affiliation | Academia | Alex Kantchelian EMAIL; J. D. Tygar EMAIL; Anthony D. Joseph EMAIL; University of California, Berkeley
Pseudocode | Yes | Algorithm 1: Coordinate Descent for Problem (1). (A generic coordinate-descent sketch appears below the table.)
Open Source Code | No | The paper discusses third-party tools such as XGBoost, scikit-learn, Gurobi, and Theano, but provides no explicit statement about, or link to, the authors' own implementation of the described methodology.
Open Datasets | Yes | We choose digit recognition over the MNIST (LeCun et al.) dataset as our benchmark classification task
Dataset Splits | No | The paper states: 'Our training and testing sets respectively include 11,876 and 1,990 images' and that the authors 'tune the hyper-parameters so as to minimize the error on the testing set directly'. The test set is thus used for tuning, and no separate validation split is specified.
Hardware Specification | Yes | Unlike BDT, BDT-R is extremely challenging to optimally evade using the MILP solver: the branch-and-bound search continues to expand nodes after 1 day on a 6-core Xeon 3.2 GHz machine.
Software Dependencies | Yes | We use the Gurobi (Gurobi Optimization, 2015) solver to compute the optimal evasions for all distances and all models but NN and RBF-SVM. (A minimal gurobipy sketch appears below the table.)
Experiment Setup | Yes | Table 1 summarizes the 7 benchmarked models with their salient hyper-parameters and error rates on the testing set. For example: 'BDT 1,000 trees, depth 4, η = 0.02', 'RF 80 trees, max. depth 22', 'RBF-SVM γ = 0.04, C = 1'. Additionally, 'Here, we use B = 28, the size of the picture diagonal, as our budget.' (These hyper-parameters are mirrored in the sketch below the table.)
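
The adversarial boosting technique quoted in the Research Type row can be illustrated compactly. Below is a minimal sketch, not the authors' implementation: for simplicity it retrains an XGBoost ensemble from scratch with one extra tree per round, whereas the paper augments the training set of each boosting round, and the evasion oracle `find_evasions` is a hypothetical placeholder for the paper's evasion attacks.

```python
# Minimal sketch of adversarial boosting. Assumptions: retraining from scratch
# each round stands in for true per-round augmentation, and `find_evasions`
# is a hypothetical evasion oracle (the paper computes evasions exactly via
# MILP, or approximately via its coordinate-descent attack).
import numpy as np
from xgboost import XGBClassifier

def adversarial_boosting(X, y, rounds, find_evasions):
    """Grow a boosted ensemble, augmenting the data with evading instances.

    find_evasions(model, X, y) should return perturbed inputs that the
    current model misclassifies, paired with their true labels.
    """
    X_aug, y_aug = X.copy(), y.copy()
    model = None
    for t in range(1, rounds + 1):
        model = XGBClassifier(n_estimators=t, max_depth=4, learning_rate=0.02)
        model.fit(X_aug, y_aug)
        X_ev, y_ev = find_evasions(model, X, y)
        if len(X_ev) > 0:
            X_aug = np.vstack([X_aug, X_ev])
            y_aug = np.concatenate([y_aug, y_ev])
    return model
```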
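
Similarly, the Pseudocode row names a coordinate-descent routine (Algorithm 1) for the paper's Problem (1), whose objective is not reproduced in this report. Purely to illustrate the per-coordinate update pattern, here is a generic greedy coordinate descent over a black-box objective; the candidate-value set and stopping rule are assumptions, not the paper's.

```python
# Generic greedy coordinate descent (illustrative only; not the paper's
# Algorithm 1, whose objective and update rule are specific to Problem (1)).
import numpy as np

def coordinate_descent(objective, x0, candidates, n_passes=10):
    """Sweep coordinates, setting each to its best candidate value in turn."""
    x = np.array(x0, dtype=float)
    for _ in range(n_passes):
        improved = False
        for i in range(len(x)):
            best_v, best_obj = x[i], objective(x)
            for v in candidates:
                x[i] = v
                obj = objective(x)
                if obj < best_obj:
                    best_v, best_obj = v, obj
                    improved = True
            x[i] = best_v  # keep the best value found for this coordinate
        if not improved:
            break  # a full pass produced no improvement
    return x
```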
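
The Software Dependencies row indicates that optimal evasions were computed with Gurobi. The paper's full MILP encoding of a tree ensemble is not reproduced here; the gurobipy fragment below only sketches the outer shell, a box-constrained 784-dimensional perturbation with an L1-distance objective, and elides the leaf-indicator binaries and ensemble-score constraints that make the problem a true MILP. All variable names and dimensions are illustrative assumptions.

```python
# Skeleton of an evasion MILP in gurobipy (illustrative; the paper's actual
# encoding of tree-ensemble predictions is elided below).
import gurobipy as gp
from gurobipy import GRB

d = 784                      # assumed input dimension (e.g. 28x28 pixels)
x0 = [0.0] * d               # original instance (placeholder values)

m = gp.Model("evasion")
x = m.addVars(d, lb=0.0, ub=1.0, name="x")        # perturbed input
delta = m.addVars(d, lb=0.0, name="delta")        # per-pixel |x - x0|
for i in range(d):
    m.addConstr(delta[i] >= x[i] - x0[i])
    m.addConstr(delta[i] >= x0[i] - x[i])
# ... binary leaf-indicator variables and constraints forcing the ensemble
#     score to cross the decision boundary would be added here ...
m.setObjective(gp.quicksum(delta[i] for i in range(d)), GRB.MINIMIZE)
m.optimize()
```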
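
Finally, because the paper's stated dependencies include XGBoost and scikit-learn, the Table 1 hyper-parameters quoted in the Experiment Setup row map naturally onto those libraries. The mapping below is an assumption for illustration, not the authors' configuration code.

```python
# Hypothetical instantiation of three of the seven benchmarked models, using
# the hyper-parameters quoted from Table 1.
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from xgboost import XGBClassifier

bdt = XGBClassifier(n_estimators=1000, max_depth=4, learning_rate=0.02)  # BDT: 1,000 trees, depth 4, eta = 0.02
rf = RandomForestClassifier(n_estimators=80, max_depth=22)               # RF: 80 trees, max. depth 22
rbf_svm = SVC(kernel="rbf", gamma=0.04, C=1.0)                           # RBF-SVM: gamma = 0.04, C = 1
```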