GRANDE: Gradient-Based Decision Tree Ensembles for Tabular Data
Authors: Sascha Marton, Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted an extensive evaluation on a predefined benchmark with 19 classification datasets and demonstrate that our method outperforms existing gradient-boosting and deep learning frameworks on most datasets. |
| Researcher Affiliation | Academia | Sascha Marton, University of Mannheim, Germany, sascha.marton@uni-mannheim.de; Stefan Lüdtke, University of Rostock, Germany, stefan.luedtke@uni-rostock.de; Christian Bartelt, University of Mannheim, Germany, christian.bartelt@uni-mannheim.de; Heiner Stuckenschmidt, University of Mannheim, Germany, heiner.stuckenschmidt@uni-mannheim.de |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The method is available under: https://github.com/s-marton/GRANDE |
| Open Datasets | Yes | For our evaluation, we used a predefined collection of datasets that was selected based on objective criteria from Open ML Benchmark Suites and comprises a total of 19 binary classification datasets (see Table 5 for details). The selection process was adopted from Bischl et al. (2021) |
| Dataset Splits | Yes | Furthermore, we report the mean and standard deviation of the test performance over a 5-fold cross-validation to ensure reliable results. |
| Hardware Specification | Yes | For all methods, we used a single NVIDIA RTX A6000. |
| Software Dependencies | No | The paper mentions using Optuna for hyperparameter optimization and frameworks like XGBoost and CatBoost, but does not specify version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For GRANDE, we used a batch size of 64 and early stopping after 25 epochs. Similar to NODE (Popov et al., 2019), GRANDE uses an Adam optimizer with stochastic weight averaging over 5 checkpoints (Izmailov et al., 2018) and a learning rate schedule that uses a cosine decay with optional warmup (Loshchilov & Hutter, 2016). We optimized the hyperparameters using Optuna (Akiba et al., 2019) with 250 trials... |
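
To make the evaluation protocol from the "Open Datasets" and "Dataset Splits" rows concrete, the following is a minimal sketch of fetching a single OpenML dataset and reporting mean and standard deviation of the test score over a 5-fold cross-validation. The dataset ("phoneme"), the metric (macro F1), and the scikit-learn gradient-boosting model standing in for GRANDE are illustrative assumptions, not the authors' actual pipeline.

```python
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold

# "phoneme" stands in for one of the 19 OpenML benchmark datasets (assumption).
X, y = fetch_openml(name="phoneme", version=1, return_X_y=True, as_frame=False)

scores = []
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, test_idx) in enumerate(skf.split(X, y)):
    # Placeholder model; the paper compares GRANDE against gradient-boosting
    # and deep learning baselines on splits like these.
    model = HistGradientBoostingClassifier(random_state=42)
    model.fit(X[train_idx], y[train_idx])
    preds = model.predict(X[test_idx])
    scores.append(f1_score(y[test_idx], preds, average="macro"))
    print(f"fold {fold}: macro F1 = {scores[-1]:.4f}")

# Mean and standard deviation of the test performance over the 5 folds,
# as reported in the paper.
print(f"macro F1: {np.mean(scores):.4f} +/- {np.std(scores):.4f}")
```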
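The training configuration from the "Experiment Setup" row (batch size 64, Adam, cosine-decay learning rate with optional warmup, early stopping) can be sketched in Keras as below. The toy data, model architecture, step counts, and peak learning rate are assumptions; "early stopping after 25 epochs" is read as a patience of 25 epochs; and the stochastic weight averaging over 5 checkpoints is omitted for brevity. Consult the authors' repository for the actual implementation.

```python
import numpy as np
import tensorflow as tf

# Toy stand-in data; the real experiments run on the 19 OpenML datasets.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20)).astype("float32")
y = (X[:, 0] + X[:, 1] > 0).astype("float32")

epochs, batch_size = 1000, 64  # batch size 64 as stated in the paper
steps_per_epoch = int(np.ceil(0.8 * len(X) / batch_size))

# Cosine decay with a linear warmup (the warmup_* arguments require
# TF >= 2.13 / Keras 3); peak LR and step counts are assumptions.
lr_schedule = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-5,
    decay_steps=epochs * steps_per_epoch,
    warmup_target=1e-3,
    warmup_steps=5 * steps_per_epoch,
)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)

# A plain dense network stands in for the GRANDE tree ensemble here.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=optimizer, loss="binary_crossentropy", metrics=["accuracy"])

# "Early stopping after 25 epochs" is read here as a patience of 25 epochs.
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=25, restore_best_weights=True
)

model.fit(
    X, y,
    validation_split=0.2,
    batch_size=batch_size,
    epochs=epochs,
    callbacks=[early_stopping],
    verbose=0,
)
```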
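The hyperparameter optimization with Optuna and 250 trials could look roughly like the following. The tuned model (a scikit-learn gradient-boosting classifier), the search space, and the inner 3-fold validation are placeholders rather than GRANDE's real search configuration.

```python
import optuna
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic binary classification data as a stand-in for a benchmark dataset.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

def objective(trial):
    # Placeholder search space; GRANDE's actual hyperparameters differ.
    model = HistGradientBoostingClassifier(
        learning_rate=trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        max_depth=trial.suggest_int("max_depth", 2, 10),
        l2_regularization=trial.suggest_float("l2_regularization", 1e-8, 1.0, log=True),
        random_state=0,
    )
    # Mean validation macro F1 over an inner 3-fold split (an assumption; the
    # paper's exact validation scheme may differ).
    return cross_val_score(model, X, y, cv=3, scoring="f1_macro").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=250)  # 250 trials, as stated in the paper
print(study.best_params, study.best_value)
```

Presumably one such study is run per dataset, with the best configuration then evaluated under the 5-fold protocol sketched above.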