Analyzing the tree-layer structure of Deep Forests

Authors: Ludovic Arnould, Claire Boyer, Erwan Scornet

ICML 2021

Reproducibility variables, each with the extracted result and the supporting LLM response:
Research Type: Experimental
"In this paper, our aim is not to benchmark DF performances but to investigate instead their underlying mechanisms. Additionally, DF architecture can be generally simplified into more simple and computationally efficient shallow forests networks. Despite some instability, the latter may outperform standard predictive tree-based methods. We exhibit a theoretical framework in which a shallow tree network is shown to enhance the performance of classical decision trees. In such a setting, we provide tight theoretical lower and upper bounds on its excess risk."
Researcher Affiliation: Academia
"LPSM, Sorbonne Université; CMAP, École Polytechnique."
Pseudocode: No
The paper describes algorithms and procedures in text but does not include any formal pseudocode blocks or sections explicitly labeled 'Algorithm'.
Open Source Code: Yes
"For reproducibility purposes, all codes together with all experimental procedures are to be found in the supplementary materials."
Open Datasets: Yes
"We compare different configurations of DF on six datasets in which the output is binary, multi-class or continuous, see Table 1 for description. All classification datasets belong to the UCI repository, the two regression ones are Kaggle datasets (Housing data and Airbnb Berlin 2020)."
Dataset links:
https://www.kaggle.com/raghavs1003/airbnb-berlin-2020
https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data
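The paper does not show its data-loading code. A minimal sketch of fetching one of the UCI classification datasets (Adult) through OpenML, which mirrors the UCI copy; the dataset name and version here are assumptions, not the authors' pipeline:

```python
# Hypothetical loading sketch: the paper does not specify how the UCI data
# were fetched. OpenML mirrors the UCI "adult" dataset; the name and version
# below are assumptions, not taken from the authors' code.
from sklearn.datasets import fetch_openml

adult = fetch_openml("adult", version=2, as_frame=True)
X, y = adult.data, adult.target  # 14 features, binary income label
print(X.shape, y.value_counts())
```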
Dataset Splits: Yes

Dataset       | Type (Nb of classes) | Train/Val/Test Size    | Dim
Adult         | Class. (2)           | 26048 / 6512 / 16281   | 14
Higgs         | Class. (2)           | 120000 / 28000 / 60000 | 28
Fashion Mnist | Class. (10)          | 24000 / 6000 / 8000    | 260
Letter        | Class. (26)          | 12800 / 3200 / 4000    | 16
Yeast         | Class. (10)          | 830 / 208 / 446        | 8
Airbnb        | Regr.                | 73044 / 18262 / 39132  | 13
Housing       | Regr.                | 817 / 205 / 438        | 61
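The reported sizes are consistent with holding out a fixed test set and splitting the remainder 80/20 into train/validation (e.g., Letter: 16000 × 0.8 = 12800 train, 3200 val). The paper excerpt does not state the exact procedure, so the 80/20 ratio below is inferred from the table, using scikit-learn's train_test_split:

```python
# Sketch of a split consistent with the Table 1 sizes; the exact procedure
# is not stated in the excerpt, so the 80/20 train/val ratio is inferred.
from sklearn.model_selection import train_test_split

def make_splits(X, y, test_size, seed=0):
    """Hold out a test set, then split the rest 80/20 into train/val."""
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed)
    X_train, X_val, y_train, y_val = train_test_split(
        X_rest, y_rest, test_size=0.2, random_state=seed)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)

# Example: Letter has 20000 samples; a 4000-sample test set leaves 16000,
# split into 12800 train / 3200 val as in the table.
```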
Hardware Specification: No
The paper does not describe the specific hardware (e.g., CPU or GPU models, memory) used to run its experiments.
Software Dependencies: No
The only dependency information is a passing mention: "other forest parameters are set to sk-learn (Pedregosa et al., 2011) default values"; no version numbers or full environment specification are given.
Experiment Setup: Yes
"DF hyperparameters. Deep Forests contain an important number of tuning parameters. Apart from the traditional parameters of random forests, DF architecture depends on the number of layers, the number of forests per layer, the type and proportion of random forests to use (Breiman or CRF). In Zhou & Feng (2017), the default configuration is set to 8 forests per layer, 4 CRF and 4 RF, 500 trees per forest (other forest parameters are set to sk-learn (Pedregosa et al., 2011) default values), and layers are added until 3 consecutive layers do not show score improvement."
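The quoted defaults translate directly into a configuration and a layer-growing loop. Below is a minimal sketch, not the authors' implementation: completely-random forests (CRF) are approximated with scikit-learn's ExtraTreesClassifier (the closest off-the-shelf analogue), and the evaluate callback (fit the cascade, return a validation score) is a hypothetical stand-in.

```python
# Hedged sketch of the Zhou & Feng (2017) defaults quoted above: 8 forests
# per layer (4 CRF + 4 Breiman RF), 500 trees per forest, layers added until
# 3 consecutive layers show no validation-score improvement.
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier

def default_layer(n_trees=500, seed=0):
    """One DF layer: 4 CRF (approximated by ExtraTrees) + 4 Breiman RF."""
    crfs = [ExtraTreesClassifier(n_estimators=n_trees, random_state=seed + i)
            for i in range(4)]
    rfs = [RandomForestClassifier(n_estimators=n_trees, random_state=seed + i)
           for i in range(4)]
    return crfs + rfs

def grow_cascade(evaluate, max_layers=20, patience=3):
    """Add layers until `patience` consecutive layers fail to improve.

    `evaluate` is a hypothetical callback that fits the current cascade
    and returns its validation score.
    """
    layers, best, stale = [], float("-inf"), 0
    for depth in range(max_layers):
        layers.append(default_layer(seed=depth))
        score = evaluate(layers)
        if score > best:
            best, stale = score, 0
        else:
            stale += 1
            if stale >= patience:
                break
    # Drop the trailing non-improving layers before returning.
    return layers[:len(layers) - stale], best
```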