Scaling Down Deep Learning with MNIST-1D

Authors: Samuel James Greydanus, Dmitry Kobak

Venue: ICML 2024

Each entry below gives a reproducibility variable, the assessed result, and the supporting LLM response.
Research Type: Experimental
LLM Response: "Although the dimensionality of MNIST-1D is only 40 and its default training set size only 4000, MNIST-1D can be used to study inductive biases of different deep architectures, find lottery tickets, observe deep double descent, metalearn an activation function, and demonstrate guillotine regularization in self-supervised learning. All these experiments can be conducted on a GPU or often even on a CPU within minutes, allowing for fast prototyping, educational use cases, and cutting-edge research on a low budget."

Researcher Affiliation: Collaboration
LLM Response: Sam Greydanus (Oregon State University, USA; The ML Collective), Dmitry Kobak (University of Tübingen, Germany; Heidelberg University, Germany).

Pseudocode: No
LLM Response: The paper describes procedures and implementations but does not include any formal pseudocode or algorithm blocks.

Open Source Code: Yes
LLM Response: "All our experiments are in Jupyter notebooks and are available at https://github.com/greydanus/mnist1d, with direct links from figure captions. We provide a mnist1d package that can be installed via pip install mnist1d."

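As a quick sanity check of the released package, the snippet below generates the dataset from the default parameters. This is a minimal sketch assuming the make_dataset / get_dataset_args helpers documented in the greydanus/mnist1d README.

    # Minimal sketch: generate MNIST-1D via the pip-installable package.
    # Assumes the make_dataset / get_dataset_args helpers documented in
    # the greydanus/mnist1d README ("pip install mnist1d" first).
    from mnist1d.data import make_dataset, get_dataset_args

    args = get_dataset_args()      # default generation parameters (seed 42, etc.)
    data = make_dataset(args)      # dict of numpy arrays

    print(data['x'].shape)         # expected: (4000, 40) training examples
    print(data['x_test'].shape)    # expected: (1000, 40) test examples
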
Open Datasets: Yes
LLM Response: "Although the dimensionality of MNIST-1D is only 40 and its default training set size only 4000, MNIST-1D can be used to study inductive biases of different deep architectures..." Also: "The deep learning analogue of Drosophila melanogaster is the MNIST dataset." And: "We observed double descent when training a MLP classifier on MNIST-1D..." And: "The frozen dataset with 4000 + 1000 samples can be found on GitHub as mnist1d_data.pkl."

Dataset Splits: Yes
LLM Response: Table 2 (Default parameters for MNIST-1D generation): Train/test split = 4000/1000.

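The split can also be confirmed against the frozen dataset rather than the generator. The sketch below assumes mnist1d_data.pkl sits at the repository root and stores a dict with 'x' and 'x_test' arrays; the raw-file URL is an assumption based on that layout.

    # Sketch: fetch the frozen MNIST-1D dataset and confirm the 4000/1000 split.
    # The raw-file URL and the dict keys ('x', 'x_test') are assumptions based
    # on the repository layout; adjust the path if the file has moved.
    import pickle
    import urllib.request

    url = "https://github.com/greydanus/mnist1d/raw/master/mnist1d_data.pkl"
    with urllib.request.urlopen(url) as f:
        data = pickle.loads(f.read())

    assert data['x'].shape[0] == 4000, "expected 4000 training samples"
    assert data['x_test'].shape[0] == 1000, "expected 1000 test samples"
    print("train:", data['x'].shape, "test:", data['x_test'].shape)
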
Hardware Specification: No
LLM Response: The paper states that experiments can be run "on a GPU or often even on a CPU within minutes" and provides CPU runtimes for various experiments (e.g., "CPU runtime: 10 minutes", "CPU runtime: 60 minutes", "1 minute on a CPU", "1 hour of CPU runtime"). However, it does not specify any exact GPU or CPU models, or other hardware specifications.

Software Dependencies: No
LLM Response: The paper mentions software components such as PyTorch and the Adam optimizer, and indicates that experiments are in Jupyter notebooks. It also mentions pip install mnist1d for their package. However, it does not provide specific version numbers for these software dependencies (e.g., PyTorch 1.x or Python 3.x).

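Since no versions are pinned, anyone reproducing the results may want to record their own environment. A small sketch using only standard introspection; it assumes mnist1d was installed via pip.

    # Record the library versions in use, since the paper does not pin any.
    import sys
    from importlib.metadata import version

    import torch

    print("Python ", sys.version.split()[0])
    print("PyTorch", torch.__version__)
    print("mnist1d", version("mnist1d"))  # assumes a pip-installed package
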
Experiment Setup Yes Table 2: Default parameters for MNIST-1D generation. Parameter Value Train/test split 4000/1000 Template length 12 Padding points 36–60 Max. translation 48 Gaussian filter width 2 Gaussian noise scale 0.25 White noise scale 0.02 Shear scale 0.75 Final seq. length 40 Random seed 42. Also We used PyTorch to implement and train simple logistic, MLP (fully-connected), CNN (with 1D convolutions), and GRU (gated recurrent unit) models. We used the Adam optimizer and early stopping for model selection and evaluation.
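To make the quoted setup concrete, below is a hedged sketch of the fully-connected baseline. Only the model family (MLP), the Adam optimizer, and early stopping come from the paper; the hidden width, learning rate, epoch budget, and patience are illustrative assumptions.

    # Sketch of the quoted setup: an MLP on MNIST-1D, trained with Adam and
    # early stopping. Hidden width, learning rate, epoch budget, and patience
    # are illustrative assumptions, not the paper's exact values.
    import torch
    import torch.nn as nn
    from mnist1d.data import make_dataset, get_dataset_args

    data = make_dataset(get_dataset_args())
    x = torch.tensor(data['x'], dtype=torch.float32)
    y = torch.tensor(data['y'], dtype=torch.long)
    x_test = torch.tensor(data['x_test'], dtype=torch.float32)
    y_test = torch.tensor(data['y_test'], dtype=torch.long)

    # MLP: input length 40, 10 classes; hidden width 100 is assumed.
    model = nn.Sequential(nn.Linear(40, 100), nn.ReLU(),
                          nn.Linear(100, 100), nn.ReLU(),
                          nn.Linear(100, 10))
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)  # assumed learning rate
    loss_fn = nn.CrossEntropyLoss()

    best, bad, patience = float('inf'), 0, 20  # assumed patience
    for epoch in range(500):                   # assumed epoch budget
        opt.zero_grad()
        loss = loss_fn(model(x), y)            # full-batch training
        loss.backward()
        opt.step()
        with torch.no_grad():
            held_out = loss_fn(model(x_test), y_test).item()
        if held_out < best:
            best, bad = held_out, 0
        else:
            bad += 1
            if bad >= patience:
                break                          # early stopping

    with torch.no_grad():
        acc = (model(x_test).argmax(dim=1) == y_test).float().mean().item()
    print(f"stopped after epoch {epoch}, test accuracy {acc:.3f}")

Using the test set both for stopping and evaluation mirrors the quoted "model selection and evaluation" phrasing; a separate validation split would be the stricter choice.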