Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Universal consistency and minimax rates for online Mondrian Forests
Authors: Jaouad Mourtada, Stéphane Gaïffas, Erwan Scornet
NeurIPS 2017 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now turn to the empirical evaluation of our algorithm, and examine its predictive performance (test error) as a function of the training size. More precisely, we compare the modified Mondrian Forest algorithm (Algorithm 4) to batch (Breiman RF [Bre01], Extra-Trees-1 [GEW06]) and online (the Mondrian Forest algorithm [LRT14] with fixed lifetime parameter λ) Random Forests algorithms. We compare the prediction accuracy (on the test set) of the aforementioned algorithms trained on varying fractions of the training data from 10% to 100%. Our results are reported in Figure 1. |
| Researcher Affiliation | Academia | Jaouad Mourtada, Centre de Mathématiques Appliquées, École Polytechnique, Palaiseau, France, EMAIL; Stéphane Gaïffas, Centre de Mathématiques Appliquées, École Polytechnique, Palaiseau, France, stéphaneEMAIL; Erwan Scornet, Centre de Mathématiques Appliquées, École Polytechnique, Palaiseau, France, EMAIL |
| Pseudocode | Yes | Algorithm 1 SampleMondrian($\lambda, C$); Algorithm 2 SplitCell($A, \tau, \lambda$); Algorithm 3 ExtendMondrian($M_\lambda, \lambda, \lambda'$); Algorithm 4 MondrianForest($K, (\lambda_n)_{n \ge 1}$) |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. |
| Open Datasets | No | The paper mentions evaluating on 'several datasets' including the 'dna dataset' in the experiments section and Figure 1, but it does not provide concrete access information (e.g., links, DOIs, specific citations with author/year for public availability) for these datasets. |
| Dataset Splits | No | The paper mentions 'varying fractions of the training data from 10% to 100%' but does not specify exact dataset split information (e.g., percentages, sample counts, or explicit cross-validation setup) needed for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper mentions 'the scikit-learn implementation [PVG+11]' but does not provide specific version numbers for scikit-learn or any other software dependencies. |
| Experiment Setup | Yes | In the case of online Mondrian Forests, we included our modified Mondrian Forest classifier with an increasing lifetime parameter $\lambda_n = n^{1/(d+2)}$ tuned according to the theoretical analysis (see Theorem 3), as well as a Mondrian Forest classifier with constant lifetime parameter λ = 2. |
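
The lifetime schedule quoted in the Experiment Setup row can be illustrated with a short sketch. It only evaluates $\lambda_n = n^{1/(d+2)}$ at training sizes taken as fractions of the training set from 10% to 100%; it is not the authors' code. The function names, the ten equal steps, and the example values `n_train=2000` and `d=10` are hypothetical choices made here for illustration and do not come from the paper.

```python
from typing import List, Tuple

CONSTANT_LIFETIME = 2.0  # fixed-lifetime baseline quoted in the table (lambda = 2)


def lifetime(n: int, d: int) -> float:
    """Return the increasing lifetime lambda_n = n ** (1 / (d + 2)).

    n is the number of samples seen so far, d the feature dimension.
    """
    if n < 1:
        raise ValueError("n must be a positive sample count")
    if d < 1:
        raise ValueError("d must be a positive feature dimension")
    return n ** (1.0 / (d + 2))


def lifetime_schedule(n_train: int, d: int, steps: int = 10) -> List[Tuple[int, float]]:
    """Pair each training size (k/steps of n_train) with its lifetime.

    The equal-step grid is an assumption; the paper only states that
    fractions range from 10% to 100%.
    """
    schedule = []
    for k in range(1, steps + 1):
        n = max(1, (n_train * k) // steps)  # training size for this fraction
        schedule.append((n, lifetime(n, d)))
    return schedule


if __name__ == "__main__":
    # Hypothetical dataset size and dimension, for illustration only.
    for n, lam in lifetime_schedule(n_train=2000, d=10):
        print(f"n={n:5d}  lambda_n={lam:.3f}  constant lambda={CONSTANT_LIFETIME}")
```

Because the exponent is $1/(d+2)$, the lifetime grows slowly with the sample size when the dimension is large, which is worth keeping in mind when comparing against the constant-lifetime baseline.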