Abstract Interpretation of Decision Tree Ensemble Classifiers

Authors: Francesco Ranzato, Marco Zanella

AAAI 2020, pp. 5478-5486

Reproducibility assessment. Each entry below gives the variable, the extracted result, and the supporting LLM response.
Research Type: Experimental. "Our experimental evaluation on the MNIST dataset shows that silva provides a precise and efficient tool which advances the current state of the art in tree ensembles verification. Our experiments were run on an AMD Ryzen 7 1700X 3.0GHz CPU."
Researcher Affiliation: Academia. "Francesco Ranzato, Marco Zanella, Dipartimento di Matematica, University of Padova, Italy, {ranzato, mzanella}@math.unipd.it"
Pseudocode: Yes. "Algorithm 1 describes in pseudocode our stability verification methodology."
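The pseudocode itself is in the paper; as a rough illustration of the abstract-interpretation idea it builds on, the following Python sketch propagates the box [x - ϵ, x + ϵ] through each tree of an ensemble and answers "stable" or "don't know". This is our own minimal sketch, not the authors' Algorithm 1 (silva is additionally complete, i.e. it refines the input region until it can also prove instability), and all names in it (Node, reachable_labels, is_stable) are hypothetical.

```python
# Minimal sketch (ours, not the paper's Algorithm 1) of interval-based
# stability checking for a decision tree ensemble. Sound but incomplete:
# a False answer only means "don't know".

from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    feature: int = -1               # -1 marks a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None   # branch taken when x[feature] <= threshold
    right: Optional["Node"] = None  # branch taken when x[feature] >  threshold
    label: int = -1                 # class label stored at a leaf

def reachable_labels(node, lo, hi):
    """Return every class label reachable from some point of the box
    {x : lo <= x <= hi} when it is propagated through one tree."""
    if node.feature == -1:
        return {node.label}
    labels = set()
    if lo[node.feature] <= node.threshold:   # part of the box goes left
        labels |= reachable_labels(node.left, lo, hi)
    if hi[node.feature] > node.threshold:    # part of the box goes right
        labels |= reachable_labels(node.right, lo, hi)
    return labels

def is_stable(trees, x, eps):
    """If every tree outputs a single label over the whole box, the vote
    of a max-voting ensemble is constant there, hence provably stable."""
    lo = [v - eps for v in x]
    hi = [v + eps for v in x]
    return all(len(reachable_labels(t, lo, hi)) == 1 for t in trees)
```

For example, is_stable([Node(feature=0, threshold=0.5, left=Node(label=0), right=Node(label=1))], [0.4], 0.05) returns True, while eps = 0.2 makes both leaves reachable and the check conservatively fails.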
Open Source Code: Yes. "We implemented Algorithm 1 in a tool called silva, whose source code in C (about 5K LOC) is available on GitHub (Ranzato and Zanella 2019)."
Open Datasets: Yes. "Our experimental evaluation on the MNIST dataset... The standard training set of MNIST consists of 60000 samples, while its test set T includes the remaining 10000 samples. We also used silva on RFs and GBDTs trained on the Sensorless dataset from the UCI ML Repository."
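Because the split is the standard one, it is easy to reproduce. A minimal sketch, assuming MNIST is fetched from OpenML (the paper does not say which loader the authors used):

```python
from sklearn.datasets import fetch_openml

# The OpenML copy of MNIST keeps the standard ordering: the first 60000
# rows form the training set, the last 10000 rows the test set T.
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X_train, y_train = X[:60000], y[:60000]
X_test, y_test = X[60000:], y[60000:]
```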
Dataset Splits: No. The paper specifies training and test sets for MNIST and Sensorless but does not explicitly mention a validation set or describe how validation was performed.
Hardware Specification: Yes. "Our experiments were run on an AMD Ryzen 7 1700X 3.0GHz CPU."
Software Dependencies: No. "RFs have been trained by scikit-learn while CatBoost has been used for GBDTs. We implemented Algorithm 1 in a tool called silva, whose source code in C (about 5K LOC) is available on GitHub (Ranzato and Zanella 2019)." The paper names the software it used (scikit-learn, CatBoost, C) but does not give version numbers for these dependencies.
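A hedged sketch of the training side, assuming illustrative values for the paper's B and d parameters and a small scikit-learn stand-in dataset (the paper reports neither library versions nor exact hyperparameters):

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from catboost import CatBoostClassifier

# Small MNIST-like stand-in dataset; the paper trains on full MNIST.
X_train, y_train = load_digits(return_X_y=True)

B, d = 25, 10  # illustrative choices for tree count B and maximum depth d

# Random forest via scikit-learn, as stated in the paper.
rf = RandomForestClassifier(n_estimators=B, max_depth=d,
                            criterion="gini", random_state=0)
rf.fit(X_train, y_train)

# Gradient boosted decision trees via CatBoost, as stated in the paper.
gbdt = CatBoostClassifier(iterations=B, depth=d, verbose=False)
gbdt.fit(X_train, y_train)
```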
Experiment Setup: Yes. "Table 1 shows the accuracy and stability percentages and the total verification time on the whole test set of MNIST for different random forest classifiers trained by combining 4 parameters: number B of decision trees, maximum tree depth d, training criterion (Gini and entropy) and voting scheme (max and average). We considered the standard perturbation P∞,ϵ (Carlini and Wagner 2017) with ϵ = 1."
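The experimental grid and the perturbation region can be written down directly. In the sketch below only the four parameter names and ϵ = 1 come from the paper; the concrete grid values are our assumptions:

```python
from itertools import product

# The four parameters varied in Table 1; the value lists are assumed.
grid = product(
    [25, 50, 75, 100],      # B: number of decision trees
    [5, 10, 15],            # d: maximum tree depth
    ["gini", "entropy"],    # training criterion
    ["max", "average"],     # voting scheme
)

EPS = 1.0  # epsilon = 1, as in the paper

def perturbation(x, eps=EPS):
    """P∞,ϵ(x): the L-infinity ball [x - ϵ, x + ϵ] around a sample x."""
    return [v - eps for v in x], [v + eps for v in x]

for B, d, criterion, voting in grid:
    ...  # train a forest with these parameters, then check stability of
         # each test sample over perturbation(x)
```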