Learning long-range spatial dependencies with horizontal gated recurrent units

Authors: Drew Linsley, Junkyung Kim, Vijay Veerabadran, Charles Windolf, Thomas Serre

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental
We introduce a visual challenge, Pathfinder, and describe a novel recurrent neural network architecture called the horizontal gated recurrent unit (hGRU) to learn intrinsic horizontal connections both within and across feature columns. We demonstrate that a single hGRU layer matches or outperforms all tested feedforward hierarchical baselines, including state-of-the-art architectures with orders of magnitude more parameters. We performed a large-scale analysis of the effectiveness of feedforward and recurrent computations on the Pathfinder challenge. We tested 6 different recurrent layers... We screened an array of feedforward models... We tested this possibility by training ResNets with 18, 50, and 152 layers on the Pathfinder challenge. All models were trained for two epochs except for the ResNets, which were trained for four.
Researcher Affiliation | Academia
Drew Linsley, Junkyung Kim, Vijay Veerabadran, Charlie Windolf, Thomas Serre; Carney Institute for Brain Science, Department of Cognitive Linguistic & Psychological Sciences, Brown University, Providence, RI 02912; {drew_linsley,junkyung_kim,vijay_veerabadran,thomas_serre}@brown.edu
Pseudocode | No
The paper provides mathematical equations (Eq. 3-9) but does not include any clearly labeled pseudocode or algorithm blocks.
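Because the hGRU is specified only through its update equations, any runnable version is a reconstruction. The PyTorch sketch below is one such reconstruction under stated assumptions, not the authors' code: it keeps the components the paper names (a persistent hidden state, shared 15x15 horizontal kernels W, gain and mix gates applied in an inhibition stage and then an excitation stage) but simplifies away the learned constants and nonlinearity choices of Eq. 3-9. Class and variable names are ours.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HorizontalGRUSketch(nn.Module):
    """Illustrative gated recurrent layer with lateral (horizontal)
    convolutional connections. NOT the authors' hGRU: it keeps the
    ingredients the paper names but drops the learned constants of
    Eq. 3-9."""

    def __init__(self, channels: int = 25, kernel_size: int = 15):
        super().__init__()
        # Shared horizontal connection kernel W, within/across feature columns.
        self.w = nn.Conv2d(channels, channels, kernel_size,
                           padding=kernel_size // 2, bias=False)
        # 1x1 convolutions producing the stage-1 gain and stage-2 mix gates.
        self.gain_gate = nn.Conv2d(channels, channels, kernel_size=1)
        self.mix_gate = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor, steps: int = 8) -> torch.Tensor:
        h = torch.zeros_like(x)  # hidden state, same shape as the drive
        for _ in range(steps):
            # Stage 1: gated horizontal inhibition of the feedforward drive.
            g1 = torch.sigmoid(self.gain_gate(h))
            h1 = F.relu(x - F.relu(self.w(h * g1)))
            # Stage 2: horizontal excitation, mixed into the persistent state.
            g2 = torch.sigmoid(self.mix_gate(h1))
            h_tilde = F.relu(h1 + self.w(h1))
            h = (1 - g2) * h + g2 * h_tilde
        return h

# Example: a 25-channel drive on a 150x150 Pathfinder-sized image.
layer = HorizontalGRUSketch(channels=25)
out = layer(torch.randn(1, 25, 150, 150))
```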
Open Source Code | No
The paper does not include any explicit statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes
The Pathfinder challenge consisted of three datasets, in which path and distractor length was successively increased... each contained 1,000,000 unique images... See Supp. Material for a detailed description of the stimulus generation procedure. We also visualized these patterns after training the hGRU to detect contours in the naturalistic BSDS500 image dataset [1].
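The generation procedure itself lives in the supplementary material, so the following is only a loose illustration of what a Pathfinder-style stimulus involves: one dashed, randomly curving path rendered onto a blank canvas. Segment length, gap size, turn range, and the 150x150 canvas are guesses, and the real generator additionally places a second path and many distractor segments.

```python
import numpy as np

rng = np.random.default_rng(0)

def render_dashed_path(canvas: np.ndarray, start, n_segments: int,
                       seg_len: int = 5, gap: int = 3) -> np.ndarray:
    """Draw one dashed, randomly curving path: short visible segments
    separated by gaps, with a small random turn between segments."""
    y, x = float(start[0]), float(start[1])
    angle = rng.uniform(0, 2 * np.pi)
    for _ in range(n_segments):
        angle += rng.uniform(-0.5, 0.5)  # small random turn per segment
        for _ in range(seg_len):         # visible dash
            yy, xx = int(round(y)), int(round(x))
            if 0 <= yy < canvas.shape[0] and 0 <= xx < canvas.shape[1]:
                canvas[yy, xx] = 1.0
            y += np.sin(angle)
            x += np.cos(angle)
        y += gap * np.sin(angle)         # invisible gap between dashes
        x += gap * np.cos(angle)
    return canvas

img = render_dashed_path(np.zeros((150, 150), np.float32),
                         start=(75, 20), n_segments=9)
```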
Dataset Splits | No
Models were trained on each Pathfinder challenge dataset (Fig. 3d), with 90% of the images used for training (900,000) and the remainder for testing (100,000). While it specifies training and testing, it does not explicitly mention a separate validation split or its size.
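For concreteness, a minimal sketch of the reported 90/10 split, using a dummy dataset as a stand-in; whether the authors randomized the split (and with what seed) is not stated, so the seeded random_split is an assumption.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Dummy stand-in for one Pathfinder dataset (the real ones hold
# 1,000,000 images each).
images = torch.zeros(1000, 1, 150, 150)
labels = torch.zeros(1000, dtype=torch.long)
dataset = TensorDataset(images, labels)

n_train = int(0.9 * len(dataset))  # 90% train, 10% test, per the paper
train_ds, test_ds = random_split(
    dataset, [n_train, len(dataset) - n_train],
    generator=torch.Generator().manual_seed(0),  # assumed: randomized split
)
```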
Hardware Specification | No
The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types, memory) used for running its experiments.
Software Dependencies | No
The paper mentions software like "TensorFlow" [69] and "PsychoPy Psychophysics software in python" [66] via citations, but does not specify version numbers for these or any other ancillary software components used in their implementation.
Experiment Setup | Yes
All models were trained for two epochs except for the ResNets, which were trained for four. Accuracy and ALC were taken from the model that achieved the highest accuracy across 5 separate runs of model training. Recurrent models... had 15x15 horizontal connection kernels (W) with an equal number of channels as their input layer (25 channels). The number of kernels given to each model was varied so that the number of parameters in each model configuration was equal to each other and the hGRU (36, 16, and 9 kernels). Each model was trained from scratch with standard weight initialization [49]... Encoders were the VGG16 [50], and each model was trained from scratch with Xavier initialized weights.
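As a concrete reading of the initialization detail, the sketch below applies Xavier (Glorot) initialization to a placeholder model; the architecture, optimizer, and learning rate are assumptions, since the row above quotes only the initialization scheme and epoch counts.

```python
import torch.nn as nn

# Placeholder model (not a paper architecture): a 25-channel layer
# feeding a 2-way readout (path connected / not connected).
model = nn.Sequential(
    nn.Conv2d(1, 25, kernel_size=7, padding=3),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(25, 2),
)

def xavier_init(m: nn.Module) -> None:
    """Xavier ("Glorot") initialization, as reported for the from-scratch models."""
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.xavier_uniform_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

model.apply(xavier_init)

# Reported schedule: 2 epochs for most models, 4 for the ResNets; the best
# of 5 training runs was kept. Optimizer and learning rate are not quoted.
```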