Universal consistency and minimax rates for online Mondrian Forests

Authors: Jaouad Mourtada, Stéphane Gaïffas, Erwan Scornet

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We now turn to the empirical evaluation of our algorithm, and examine its predictive performance (test error) as a function of the training size. More precisely, we compare the modified Mondrian Forest algorithm (Algorithm 4) to batch (Breiman RF [Bre01], Extra-Trees-1 [GEW06]) and online (the Mondrian Forest algorithm [LRT14] with fixed lifetime parameter λ) Random Forests algorithms. We compare the prediction accuracy (on the test set) of the aforementioned algorithms trained on varying fractions of the training data from 10% to 100%. Our results are reported in Figure 1.
Researcher Affiliation | Academia | Jaouad Mourtada, Centre de Mathématiques Appliquées, École Polytechnique, Palaiseau, France, jaouad.mourtada@polytechnique.edu; Stéphane Gaïffas, Centre de Mathématiques Appliquées, École Polytechnique, Palaiseau, France, stephane.gaiffas@polytechnique.edu; Erwan Scornet, Centre de Mathématiques Appliquées, École Polytechnique, Palaiseau, France, erwan.scornet@polytechnique.edu
Pseudocode | Yes | Algorithm 1: SampleMondrian(λ, C); Algorithm 2: SplitCell(A, τ, λ); Algorithm 3: ExtendMondrian(M_λ, λ, λ′); Algorithm 4: MondrianForest(K, (λ_n)_{n≥1})
Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described.
Open Datasets | No | The paper mentions evaluating on 'several datasets', including the 'dna dataset', in the experiments section and Figure 1, but it does not provide concrete access information (e.g., links, DOIs, or specific citations with author/year for public availability) for these datasets.
Dataset Splits | No | The paper mentions 'varying fractions of the training data from 10% to 100%' but does not specify the exact dataset split information (e.g., percentages, sample counts, or an explicit cross-validation setup) needed for reproducibility.
Hardware Specification | No | The paper does not provide specific details of the hardware used to run its experiments.
Software Dependencies | No | The paper mentions 'the scikit-learn implementation [PVG+11]' but does not provide specific version numbers for scikit-learn or any other software dependencies.
Experiment Setup | Yes | In the case of online Mondrian Forests, we included our modified Mondrian Forest classifier with an increasing lifetime parameter λ_n = n^{1/(d+2)} tuned according to the theoretical analysis (see Theorem 3), as well as a Mondrian Forest classifier with constant lifetime parameter λ = 2.
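
The experiment setup quoted above compares an increasing lifetime schedule, n^{1/(d+2)}, with a constant lifetime of 2, on training sets ranging from 10% to 100% of the data. The Python sketch below only illustrates that arithmetic. The function names `lifetime` and `training_sizes`, the even 10% steps, and the example values of n and d are assumptions made for illustration; they are not taken from the paper or from any released code.

```python
def lifetime(n: int, d: int) -> float:
    """Increasing lifetime schedule n^(1/(d+2)) for n samples in dimension d."""
    if n < 1 or d < 1:
        raise ValueError("n and d must be positive integers")
    return n ** (1.0 / (d + 2))


# Constant lifetime used by the fixed-lifetime baseline in the quoted setup.
CONSTANT_LIFETIME = 2.0


def training_sizes(n_train: int) -> list[int]:
    """Sample counts for 10%, 20%, ..., 100% of a training set.

    The even 10% steps are an assumption: the source only states that the
    fractions range from 10% to 100%.
    """
    return [k * n_train // 10 for k in range(1, 11)]


if __name__ == "__main__":
    n_train, d = 2000, 10  # hypothetical values, not from the paper
    for n in training_sizes(n_train):
        print(f"n={n:5d}  increasing lifetime={lifetime(n, d):.3f}  "
              f"constant lifetime={CONSTANT_LIFETIME:.3f}")
```

Under this schedule the lifetime grows with n, and more slowly as d increases, whereas the baseline keeps it at 2 regardless of sample size.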