Well-tuned Simple Nets Excel on Tabular Datasets
Authors: Arlind Kadra, Marius Lindauer, Frank Hutter, Josif Grabocka
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically assess the impact of these regularization cocktails for MLPs in a large-scale empirical study comprising 40 tabular datasets and demonstrate that (i) well-regularized plain MLPs significantly outperform recent state-of-the-art specialized neural network architectures, and (ii) they even outperform strong traditional ML methods, such as XGBoost. |
| Researcher Affiliation | Collaboration | Arlind Kadra Department of Computer Science University of Freiburg kadraa@cs.uni-freiburg.de; Marius Lindauer Institute for Information Processing Leibniz University Hannover lindauer@tnt.uni-hannover.de; Frank Hutter Department of Computer Science University of Freiburg & Bosch Center for Artificial Intelligence fh@cs.uni-freiburg.de; Josif Grabocka Department of Computer Science University of Freiburg grabocka@informatik.uni-freiburg.de |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | We provide the code for our implementation at the following link: https://github.com/releaunifreiburg/WellTunedSimpleNets. |
| Open Datasets | Yes | We use a large collection of 40 tabular datasets (listed in Table 9 of Appendix D). This includes 31 datasets from the recent open-source OpenML AutoML Benchmark [16]. In addition, we added 9 popular datasets from UCI [3] and Kaggle that contain roughly 100K+ instances. ... The datasets are retrieved from the OpenML repository [54] using the OpenML-Python connector [14]... |
| Dataset Splits | Yes | The datasets are retrieved from the OpenML repository [54] using the OpenML-Python connector [14] and split as 60% training, 20% validation, and 20% testing sets. (A minimal loading-and-splitting sketch follows the table.) |
| Hardware Specification | Yes | We ran all experiments on a CPU cluster, each node of which contains two Intel Xeon E5-2630v4 CPUs with 20 CPU cores each, running at 2.2GHz and a total memory of 128GB. |
| Software Dependencies | No | The paper mentions using the "PyTorch library [43]" and the "AutoDL framework Auto-PyTorch [39, 62]" but does not specify their version numbers. |
| Experiment Setup | Yes | In order to focus exclusively on investigating the effect of regularization, we fix the neural architecture to a simple multilayer perceptron (MLP) and also fix some hyperparameters of the general training procedure. These fixed hyperparameter values, as specified in Table 4 of Appendix B.1... We use a 9-layer feed-forward neural network with 512 units for each layer... We set a low learning rate of 10^-3... We use AdamW [36]... and cosine annealing with restarts [35] as a learning rate scheduler. For the restarts, we use an initial budget of 15 epochs, with a budget multiplier of 2... (A minimal PyTorch sketch of this fixed training setup follows the table.) |
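
The 60/20/20 split reported in the Dataset Splits row can be reproduced with the OpenML-Python connector and scikit-learn. The sketch below is illustrative, not the authors' exact pipeline: the dataset ID (31, `credit-g`), the stratification, and the random seed are assumptions made for demonstration.

```python
# Minimal sketch (assumed, not the authors' code): fetch an OpenML dataset and
# split it into 60% training, 20% validation, and 20% test, matching the split
# ratios stated in the paper.
import openml
from sklearn.model_selection import train_test_split

dataset = openml.datasets.get_dataset(31)  # hypothetical example: credit-g
X, y, _, _ = dataset.get_data(
    dataset_format="dataframe",
    target=dataset.default_target_attribute,
)

# First carve off 60% for training, then split the remaining 40% in half.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, train_size=0.6, stratify=y, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=0
)

print(len(X_train), len(X_val), len(X_test))  # roughly 60% / 20% / 20%
```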
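The fixed backbone described in the Experiment Setup row (a 9-layer MLP with 512 units per layer, AdamW at a learning rate of 10^-3, cosine annealing with warm restarts starting at 15 epochs with a budget multiplier of 2) can be sketched in plain PyTorch as below. This is not the authors' Auto-PyTorch implementation: the input/output dimensions are placeholders, the regularization cocktail is omitted, and reading "9-layer" as nine 512-unit hidden layers is an assumption.

```python
# Minimal PyTorch sketch (assumed) of the fixed training backbone reported in
# the paper: a 9-layer MLP with 512 units per hidden layer, AdamW (lr = 1e-3),
# and cosine annealing with warm restarts (T_0 = 15 epochs, T_mult = 2).
import torch
import torch.nn as nn

def build_mlp(in_features: int, n_classes: int,
              width: int = 512, n_layers: int = 9) -> nn.Sequential:
    # Interpreting "9-layer" as nine 512-unit hidden layers is an assumption;
    # a final linear head maps to the class logits.
    layers, d = [], in_features
    for _ in range(n_layers):
        layers += [nn.Linear(d, width), nn.ReLU()]
        d = width
    layers.append(nn.Linear(d, n_classes))
    return nn.Sequential(*layers)

model = build_mlp(in_features=54, n_classes=7)  # placeholder dimensions
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
    optimizer, T_0=15, T_mult=2  # 15-epoch initial budget, doubled each restart
)

for epoch in range(105):  # 15 + 30 + 60 epochs = three restart cycles
    # ... one training pass over the data would go here ...
    optimizer.step()   # placeholder step (no gradients, so weights are unchanged)
    scheduler.step()   # advance the cosine schedule once per epoch
```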