Scaling MLPs: A Tale of Inductive Bias
Authors: Gregor Bachmann, Sotiris Anagnostidis, Thomas Hofmann
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that the performance of MLPs drastically improves with scale (95% on CIFAR10, 82% on CIFAR100, 58% on ImageNet ReaL), highlighting that lack of inductive bias can indeed be compensated. |
| Researcher Affiliation | Academia | Gregor Bachmann, Sotiris Anagnostidis, Thomas Hofmann, ETH Zürich, Switzerland |
| Pseudocode | Yes | Appendix D (Inverted Bottleneck MLP Code): We provide PyTorch-style pseudo-code for the inverted bottleneck MLP to highlight its simplicity. A minimal sketch in this spirit is included below the table. |
| Open Source Code | Yes | Code and checkpoints available at https://github.com/gregorbachmann/scaling_mlps |
| Open Datasets | Yes | We study the popular tasks CIFAR10, CIFAR100 (Krizhevsky, 2009), STL10 (Coates et al., 2011), TinyImageNet (Le and Yang, 2015), ImageNet1k for evaluation, as well as ImageNet21k (Deng et al., 2009) for pre-training. |
| Dataset Splits | No | The paper mentions pre-training and fine-tuning on various datasets and evaluates test error, but it does not explicitly specify the splitting methodology or the proportions of the training, validation, and test sets, nor does it use the term 'validation' in the context of dataset splits. |
| Hardware Specification | Yes | All of our experiments were conducted on a single NVIDIA RTX A5000 GPU with 24GB of memory. |
| Software Dependencies | No | The paper mentions using PyTorch, SciPy library, FFCV framework, and the LION optimizer, but it does not specify version numbers for these software dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | All models were trained with the LION optimizer (Chen et al., 2023) with a learning rate η = 5e-5. In order to combat overfitting we use strong label smoothing α = 0.3. We center and normalize all the images and use random flips and crops as well as MixUp (Zhang et al., 2018) as data augmentations. A hedged training-loop sketch follows the table. |
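
The Pseudocode row above refers to the PyTorch-style pseudo-code for the inverted bottleneck MLP in the paper's Appendix D. The sketch below illustrates the general idea under stated assumptions rather than reproducing the authors' exact code: a pre-norm residual block that widens the hidden dimension by an expansion factor (assumed here to be 4) and projects back, stacked on top of a linear embedding of the flattened image. Layer counts, widths, and the class names `InvertedBottleneckBlock` and `MLP` are illustrative choices, not taken from the paper.

```python
import torch
import torch.nn as nn


class InvertedBottleneckBlock(nn.Module):
    """Pre-norm inverted bottleneck MLP block: expand, non-linearity, project, residual."""

    def __init__(self, dim: int, expansion: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.fc_expand = nn.Linear(dim, expansion * dim)   # widen the hidden dimension
        self.fc_project = nn.Linear(expansion * dim, dim)  # project back down
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.fc_project(self.act(self.fc_expand(self.norm(x))))


class MLP(nn.Module):
    """Flattened-image MLP: linear embedding, stacked bottleneck blocks, linear classifier."""

    def __init__(self, in_dim: int, dim: int, depth: int, num_classes: int, expansion: int = 4):
        super().__init__()
        self.embed = nn.Linear(in_dim, dim)
        self.blocks = nn.Sequential(
            *[InvertedBottleneckBlock(dim, expansion) for _ in range(depth)]
        )
        self.norm = nn.LayerNorm(dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x.flatten(1)  # images are flattened; no spatial inductive bias is used
        x = self.embed(x)
        x = self.blocks(x)
        return self.head(self.norm(x))


# Example: a CIFAR10-sized input (3 * 32 * 32 = 3072 input features)
model = MLP(in_dim=3 * 32 * 32, dim=512, depth=6, num_classes=10)
logits = model(torch.randn(8, 3, 32, 32))
print(logits.shape)  # torch.Size([8, 10])
```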
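
The Experiment Setup row quotes the reported training recipe: LION with η = 5e-5, label smoothing α = 0.3, and MixUp. The loop below is a minimal sketch of that recipe, not the authors' training code. It assumes the `model` from the sketch above, a hypothetical `loader` yielding batches of flip/crop-augmented CIFAR10 images, and the third-party `lion-pytorch` package as a stand-in LION implementation (the paper does not name a specific package); the `mixup` helper is likewise illustrative.

```python
import torch
import torch.nn as nn
from lion_pytorch import Lion  # assumed third-party LION implementation, not specified in the paper


def mixup(x: torch.Tensor, y: torch.Tensor, alpha: float = 0.2):
    """Blend random pairs of examples; return mixed inputs, both label sets, and the mixing weight."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    return lam * x + (1 - lam) * x[perm], y, y[perm], lam


criterion = nn.CrossEntropyLoss(label_smoothing=0.3)  # strong label smoothing, α = 0.3
optimizer = Lion(model.parameters(), lr=5e-5)         # LION with learning rate η = 5e-5

for images, labels in loader:  # 'loader' is a hypothetical DataLoader of augmented CIFAR10 batches
    mixed, y_a, y_b, lam = mixup(images, labels)
    logits = model(mixed)
    # MixUp loss: interpolate the losses against both label sets with the same weight
    loss = lam * criterion(logits, y_a) + (1 - lam) * criterion(logits, y_b)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```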