Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Evasion and Hardening of Tree Ensemble Classifiers
Authors: Alex Kantchelian, J. D. Tygar, Anthony Joseph
ICML 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On a digit recognition task, we demonstrate that both gradient boosted trees and random forests are extremely susceptible to evasions. Finally, we harden a boosted tree model without loss of predictive accuracy by augmenting the training set of each boosting round with evading instances, a technique we call adversarial boosting. |
| Researcher Affiliation | Academia | Alex Kantchelian EMAIL J. D. Tygar EMAIL Anthony D. Joseph EMAIL University of California, Berkeley |
| Pseudocode | Yes | Algorithm 1 Coordinate Descent for Problem (1) |
| Open Source Code | No | The paper discusses the use of third-party tools like XGBoost, scikit-learn, Gurobi, and Theano, but does not provide an explicit statement or link for the authors' own implementation code for the described methodology. |
| Open Datasets | Yes | We choose digit recognition over the MNIST (Le Cun et al.) dataset as our benchmark classification task |
| Dataset Splits | No | The paper states: 'Our training and testing sets respectively include 11,876 and 1,990 images' and 'tune the hyper-parameters so as to minimize the error on the testing set directly', indicating the test set was used for tuning, but does not specify a separate validation dataset split. |
| Hardware Specification | Yes | Unlike BDT, BDT-R is extremely challenging to optimally evade using the MILP solver: the branch-and-bound search continues to expand nodes after 1 day on a 6 core Xeon 3.2GHz machine. |
| Software Dependencies | Yes | We use the Gurobi (Gurobi Optimization, 2015) solver to compute the optimal evasions for all distances and all models but NN and RBF-SVM. |
| Experiment Setup | Yes | Table 1 summarizes the 7 benchmarked models with their salient hyper-parameters and error rates on the testing set. For example: 'BDT 1,000 trees, depth 4, η = 0.02', 'RF 80 trees, max. depth 22', 'RBF-SVM γ = 0.04, C = 1'. Additionally, 'Here, we use B = 28, the size of the picture diagonal, as our budget.' |
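
The hyper-parameters quoted in the Experiment Setup row can be written down as a small configuration, shown here as a minimal sketch rather than the authors' implementation (no code was released). The names `TABLE1_HYPERPARAMS`, `BUDGET_B`, and `build_models` are hypothetical, and the mapping to scikit-learn estimators is an assumption: the summary names XGBoost and scikit-learn as third-party tools but does not say which tool trained which model, so `GradientBoostingClassifier` is used only as a stand-in for the boosted-tree model. Only the three models quoted above are included.

```python
# Hyper-parameters as quoted from the paper's Table 1 in the row above.
# Only three of the seven benchmarked models are quoted, so only those appear.
TABLE1_HYPERPARAMS = {
    "BDT": {"n_trees": 1000, "max_depth": 4, "eta": 0.02},
    "RF": {"n_trees": 80, "max_depth": 22},
    "RBF-SVM": {"gamma": 0.04, "C": 1.0},
}

# Budget B quoted in the same row.
BUDGET_B = 28


def build_models(params):
    """Return untrained scikit-learn estimators for the quoted settings.

    Assumptions (not from the paper): scikit-learn is installed, and
    GradientBoostingClassifier stands in for the boosted-tree model.
    """
    # Imported lazily so the configuration is usable without scikit-learn.
    from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
    from sklearn.svm import SVC

    bdt, rf, svm = params["BDT"], params["RF"], params["RBF-SVM"]
    return {
        "BDT": GradientBoostingClassifier(
            n_estimators=bdt["n_trees"],
            max_depth=bdt["max_depth"],
            learning_rate=bdt["eta"],
        ),
        "RF": RandomForestClassifier(
            n_estimators=rf["n_trees"],
            max_depth=rf["max_depth"],
        ),
        "RBF-SVM": SVC(kernel="rbf", gamma=svm["gamma"], C=svm["C"]),
    }
```

Calling `build_models(TABLE1_HYPERPARAMS)` returns unfitted estimators only; reproducing the reported error rates would also need the paper's data preparation, which this summary does not describe.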