Revisiting Deep Learning Models for Tabular Data
Authors: Yury Gorishniy, Ivan Rubachev, Valentin Khrulkov, Artem Babenko
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The existing literature on deep learning for tabular data proposes a wide range of novel architectures and reports competitive results on various datasets. However, the proposed models are usually not properly compared to each other and existing works often use different benchmarks and experiment protocols. ... Both models are compared to many existing architectures on a diverse set of tasks under the same training and tuning protocols. |
| Researcher Affiliation | Collaboration | Yandex, Russia; Moscow Institute of Physics and Technology, Russia; National Research University Higher School of Economics, Russia |
| Pseudocode | No | The paper provides mathematical formulations and architectural diagrams of the models, but it does not include explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code is available at https://github.com/yandex-research/rtdl. |
| Open Datasets | Yes | We use a diverse set of eleven public datasets (see supplementary for the detailed description). ... The datasets include: California Housing (CA, real estate data, Kelley Pace and Barry (1997)), Adult (AD, income estimation, Kohavi (1996)), Helena (HE, anonymized dataset, Guyon et al. (2019)), Jannis (JA, anonymized dataset, Guyon et al. (2019)), Higgs (HI, simulated physical particles, Baldi et al. (2014); we use the version with 98K samples available at the OpenML repository (Vanschoren et al., 2014)), ALOI (AL, images, Geusebroek et al. (2005)), Epsilon (EP, simulated physics experiments), Year (YE, audio features, Bertin-Mahieux et al. (2011)), Covertype (CO, forest characteristics, Blackard and Dean (2000)), Yahoo (YA, search queries, Chapelle and Chang (2011)), Microsoft (MI, search queries, Qin and Liu (2013)). |
| Dataset Splits | Yes | The dataset is split into three disjoint subsets: D = D_train ∪ D_val ∪ D_test, where D_train is used for training, D_val is used for early stopping and hyperparameter tuning, and D_test is used for the final evaluation. ... For each dataset, there is exactly one train-validation-test split, so all algorithms use the same splits. |
| Hardware Specification | No | The paper states that information on hardware is provided in the supplementary material, but it does not describe specific hardware details (like GPU/CPU models) in the main text. |
| Software Dependencies | No | The paper mentions several software tools, such as Scikit-learn and Optuna, and the Adam and AdamW optimizers, but it does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | For every dataset, we carefully tune each model's hyperparameters. ... We minimize cross-entropy for classification problems and mean squared error for regression problems. ... For TabNet and GrowNet, we follow the original implementations and use the Adam optimizer ... For all other algorithms, we use the AdamW optimizer ... We do not apply learning rate schedules. For each dataset, we use a predefined batch size for all algorithms unless special instructions on batch sizes are given in the corresponding papers (see supplementary). We continue training until there are patience + 1 consecutive epochs without improvements on the validation set; we set patience = 16 for all algorithms. |
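The early-stopping rule quoted above (train until patience + 1 consecutive epochs without validation improvement, with patience = 16) can be sketched as a small helper. This is a minimal illustration, not the authors' code; the function name and the assumption that higher validation scores are better are ours.

```python
def train_with_early_stopping(validation_scores, patience=16):
    """Return the epoch index of the best validation score.

    Stops once there have been `patience + 1` consecutive epochs
    without improvement, mirroring the protocol described in the paper.
    `validation_scores` stands in for per-epoch validation metrics
    (higher is better); a real run would compute them after each epoch.
    """
    best_score = float("-inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, score in enumerate(validation_scores):
        if score > best_score:
            best_score, best_epoch = score, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement > patience:
                break  # patience + 1 non-improving epochs: stop
    return best_epoch
```

For example, with patience = 2 and scores `[1, 2, 3, 1, 1, 1, 1]`, training stops after the third consecutive non-improving epoch and the best epoch (index 2) is returned for final evaluation on the test split.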