The Uncanny Similarity of Recurrence and Depth

Authors: Avi Schwarzschild, Arjun Gupta, Amin Ghiasi, Micah Goldblum, Tom Goldstein

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | By training models of various feed-forward and recurrent architectures on several datasets for image classification as well as maze solving, we show that recurrent networks have the ability to closely emulate the behavior of non-recurrent deep models, often doing so with far fewer parameters.
Researcher Affiliation | Academia | Avi Schwarzschild, Department of Mathematics, University of Maryland, College Park, MD, USA (avi1@umd.edu); Arjun Gupta, Department of Robotics, University of Maryland, College Park, MD, USA (arjung15@umd.edu); Amin Ghiasi, Department of Computer Science, University of Maryland, College Park, MD, USA; Micah Goldblum, Department of Computer Science, University of Maryland, College Park, MD, USA; Tom Goldstein, Department of Computer Science, University of Maryland, College Park, MD, USA
Pseudocode | No | The paper describes methods in prose and does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | Code for reproducing the experiments in this paper is available in the code repository at https://github.com/Arjung27/DeepThinking.
Open Datasets | Yes | We train and test models of varying effective depths on ImageNet, CIFAR-10, EMNIST, and SVHN to study the relationship between depth and recurrence (Russakovsky et al., 2015; Krizhevsky, 2009; Cohen et al., 2017; Netzer et al., 2011).
Dataset Splits | No | For the maze dataset, the paper states 'We split each set into 50,000 training samples and 10,000 testing samples.' For the other datasets, it mentions 'We train and test models...' but does not provide explicit validation-set split percentages or sample counts in the main text or appendices.
Hardware Specification | Yes | All of our experiments are done on NVIDIA GeForce RTX 2080Ti GPUs. All training, except for ImageNet models, is done on a single GPU. We train models on ImageNet data using four GPUs at a time.
Software Dependencies | No | The paper mentions that code for reproducing experiments is available but does not explicitly list software dependencies with version numbers (e.g., Python, PyTorch versions) in the text.
Experiment Setup | Yes | For specific training hyperparameters for every experiment, see Appendix A.2.1.
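
The central comparison assessed in the table above is between a feed-forward network that reaches depth k with k distinct blocks and a recurrent network that reaches the same effective depth by applying one weight-shared block k times. The PyTorch sketch below is an illustration of that parameterization contrast only, not the authors' released code: the residual block structure, channel width, and class names are all assumptions.

```python
# Illustrative sketch: depth via distinct blocks vs. depth via recurrence.
# Block design and sizes are assumptions, not the paper's exact architecture.
import torch
import torch.nn as nn


class ResBlock(nn.Module):
    """A simple residual block; shape-preserving so it can be iterated."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(x + self.conv2(self.relu(self.conv1(x))))


class FeedForwardNet(nn.Module):
    """Effective depth k from k distinct blocks: parameters grow with depth."""

    def __init__(self, channels: int, depth: int):
        super().__init__()
        self.blocks = nn.Sequential(*[ResBlock(channels) for _ in range(depth)])

    def forward(self, x):
        return self.blocks(x)


class RecurrentNet(nn.Module):
    """Effective depth k from k applications of one block: constant parameters."""

    def __init__(self, channels: int, iterations: int):
        super().__init__()
        self.block = ResBlock(channels)
        self.iterations = iterations

    def forward(self, x):
        for _ in range(self.iterations):
            x = self.block(x)
        return x


if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)
    ff, rec = FeedForwardNet(64, 8), RecurrentNet(64, 8)
    # Same effective depth and output shape, roughly 8x fewer parameters
    # in the recurrent model, mirroring the paper's central claim.
    print(sum(p.numel() for p in ff.parameters()))
    print(sum(p.numel() for p in rec.parameters()))
    assert ff(x).shape == rec(x).shape
```

With depth 8 and width 64, the feed-forward stack carries eight blocks' worth of weights while the recurrent model carries one, which is the sense in which recurrence can match the behavior of depth "with far fewer parameters" in the quoted finding.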