Universal consistency and minimax rates for online Mondrian Forests
Authors: Jaouad Mourtada, Stéphane Gaïffas, Erwan Scornet
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now turn to the empirical evaluation of our algorithm, and examine its predictive performance (test error) as a function of the training size. More precisely, we compare the modified Mondrian Forest algorithm (Algorithm 4) to batch (Breiman RF [Bre01], Extra-Trees-1 [GEW06]) and online (the Mondrian Forest algorithm [LRT14] with fixed lifetime parameter λ) Random Forests algorithms. We compare the prediction accuracy (on the test set) of the aforementioned algorithms trained on varying fractions of the training data from 10% to 100%. Our results are reported in Figure 1. |
| Researcher Affiliation | Academia | Jaouad Mourtada, Centre de Mathématiques Appliquées, École Polytechnique, Palaiseau, France, jaouad.mourtada@polytechnique.edu; Stéphane Gaïffas, Centre de Mathématiques Appliquées, École Polytechnique, Palaiseau, France, stéphane.gaiffas@polytechnique.edu; Erwan Scornet, Centre de Mathématiques Appliquées, École Polytechnique, Palaiseau, France, erwan.scornet@polytechnique.edu |
| Pseudocode | Yes | Algorithm 1 SampleMondrian(λ, C); Algorithm 2 SplitCell(A, τ, λ); Algorithm 3 ExtendMondrian(M_λ, λ, λ′); Algorithm 4 MondrianForest(K, (λ_n)_{n≥1}) |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. |
| Open Datasets | No | The paper mentions evaluating on 'several datasets' including the 'dna dataset' in the experiments section and Figure 1, but it does not provide concrete access information (e.g., links, DOIs, specific citations with author/year for public availability) for these datasets. |
| Dataset Splits | No | The paper mentions 'varying fractions of the training data from 10% to 100%' but does not specify exact dataset split information (e.g., percentages, sample counts, or explicit cross-validation setup) needed for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper mentions 'the scikit-learn implementation [PVG+11]' but does not provide specific version numbers for scikit-learn or any other software dependencies. |
| Experiment Setup | Yes | In the case of online Mondrian Forests, we included our modified Mondrian Forest classifier with an increasing lifetime parameter λ_n = n^{1/(d+2)} tuned according to the theoretical analysis (see Theorem 3), as well as a Mondrian Forest classifier with constant lifetime parameter λ = 2. |
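
The increasing lifetime schedule quoted in the Experiment Setup row can be illustrated with a short Python sketch. It only evaluates the formula λ_n = n^{1/(d+2)} and compares it with the constant baseline λ = 2. The function name `lifetime` and the example values of `n` and `d` are assumptions made for illustration; they do not come from the paper or from any released implementation.

```python
def lifetime(n: int, d: int) -> float:
    """Return the lifetime parameter n^(1/(d+2)) for n samples in dimension d.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if n < 1 or d < 1:
        raise ValueError("n and d must be positive integers")
    return n ** (1.0 / (d + 2))


CONSTANT_LIFETIME = 2.0  # the fixed-lifetime baseline mentioned in the table

if __name__ == "__main__":
    d = 2  # illustrative feature dimension, not taken from the paper
    for n in (10, 100, 1_000, 10_000):
        lam = lifetime(n, d)
        relation = "above" if lam > CONSTANT_LIFETIME else "at or below"
        print(f"n={n:>6}  lifetime={lam:.3f}  ({relation} the constant lifetime of 2)")
```

Because the exponent 1/(d+2) shrinks as d grows, the schedule grows more slowly in higher dimensions, so the sample size at which it overtakes the constant baseline depends strongly on d.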