reproducibilityindex.ai

GRANDE: Gradient-Based Decision Tree Ensembles for Tabular Data

Authors: Sascha Marton, Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt

ICLR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conducted an extensive evaluation on a predefined benchmark with 19 classification datasets and demonstrate that our method outperforms existing gradient-boosting and deep learning frameworks on most datasets.
Researcher Affiliation	Academia	Sascha Marton, University of Mannheim, Germany, sascha.marton@uni-mannheim.de; Stefan Lüdtke, University of Rostock, Germany, stefan.luedtke@uni-rostock.de; Christian Bartelt, University of Mannheim, Germany, christian.bartelt@uni-mannheim.de; Heiner Stuckenschmidt, University of Mannheim, Germany, heiner.stuckenschmidt@uni-mannheim.de
Pseudocode	No	The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code	Yes	The method is available under: https://github.com/s-marton/GRANDE
Open Datasets	Yes	For our evaluation, we used a predefined collection of datasets that was selected based on objective criteria from OpenML Benchmark Suites and comprises a total of 19 binary classification datasets (see Table 5 for details). The selection process was adopted from Bischl et al. (2021).
Dataset Splits	Yes	Furthermore, we report the mean and standard deviation of the test performance over a 5-fold cross-validation to ensure reliable results.
Hardware Specification	Yes	For all methods, we used a single NVIDIA RTX A6000.
Software Dependencies	No	The paper mentions using Optuna for hyperparameter optimization and frameworks like XGBoost and CatBoost, but does not specify version numbers for these or other software dependencies.
Experiment Setup	Yes	For GRANDE, we used a batch size of 64 and early stopping after 25 epochs. Similar to NODE (Popov et al., 2019), GRANDE uses an Adam optimizer with stochastic weight averaging over 5 checkpoints (Izmailov et al., 2018) and a learning rate schedule that uses a cosine decay with optional warmup (Loshchilov & Hutter, 2016). We optimized the hyperparameters using Optuna (Akiba et al., 2019) with 250 trials...
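
The learning rate schedule named in the Experiment Setup row (cosine decay with optional warmup) can be illustrated with a short, dependency-free sketch. This is not taken from the GRANDE repository. The function name, the linear warmup shape, and all numeric values (`base_lr`, `warmup_steps`, `total_steps`, `min_lr`) are assumptions chosen for illustration, and the paper's actual schedule may differ in its details.

```python
import math


def cosine_decay_with_warmup(step, total_steps, base_lr, warmup_steps=0, min_lr=0.0):
    """Return the learning rate at a 0-indexed training step.

    Hypothetical helper: linear warmup (an assumed shape) followed by a
    single cosine decay from base_lr down to min_lr, without restarts.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if not 0 <= warmup_steps < total_steps:
        raise ValueError("warmup_steps must be in [0, total_steps)")

    # Optional warmup: ramp linearly and reach base_lr on the last warmup step.
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps

    # Cosine decay over the remaining steps; progress is clamped to [0, 1].
    decay_steps = total_steps - warmup_steps
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Illustrative values only; none of these come from the paper.
    for s in (0, 5, 10, 55, 100):
        lr = cosine_decay_with_warmup(s, total_steps=100, base_lr=0.01, warmup_steps=10)
        print(f"step {s:3d}: lr = {lr:.6f}")
```

Setting `warmup_steps=0` disables the warmup phase, which matches the "optional" wording in the quoted setup. In a real training loop, the returned value would be assigned to the optimizer's learning rate before each update.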