Stable and expressive recurrent vision models

Authors: Drew Linsley, Alekh Karkada Ashok, Lakshmi Narasimhan Govindarajan, Rex Liu, Thomas Serre

NeurIPS 2020

Reproducibility Variable: Research Type
Result: Experimental
LLM Response: "We demonstrate that recurrent vision models trained with C-RBP can detect long-range spatial dependencies in a synthetic contour tracing task that BPTT-trained models cannot. We further show that recurrent vision models trained with C-RBP to solve the large-scale Panoptic Segmentation MS-COCO challenge outperform the leading feedforward approach, with fewer free parameters."

Reproducibility Variable: Researcher Affiliation
Result: Academia
LLM Response: "Drew Linsley, Alekh K. Ashok, Lakshmi N. Govindarajan, Rex Liu, Thomas Serre. Carney Institute for Brain Science, Department of Cognitive, Linguistic & Psychological Sciences, Brown University, Providence, RI 02912. {drew_linsley,alekh_ashok,lakshmi_govindarajan,rex_liu,thomas_serre}@brown.edu"

Reproducibility Variable: Pseudocode
Result: No
LLM Response: Information insufficient. The paper includes no explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps in a code-like format.

Reproducibility Variable: Open Source Code
Result: Yes
LLM Response: "Code and data are available at https://github.com/c-rbp."

Reproducibility Variable: Open Datasets
Result: Yes
LLM Response: "We further show that recurrent vision models trained with C-RBP to solve the large-scale Panoptic Segmentation MS-COCO challenge outperform the leading feedforward approach, with fewer free parameters. C-RBP is a general-purpose learning algorithm for any application that can benefit from expansive recurrent dynamics. Code and data are available at https://github.com/c-rbp." "Training lasted 20 epochs on a dataset of 200,000 images."

Reproducibility Variable: Dataset Splits
Result: Yes
LLM Response: "Model predictions on the COCO validation set were scored with Panoptic Quality (PQ), which is the product of metrics for semantic (IoU) and instance (F1 score) segmentation [41]."

Reproducibility Variable: Hardware Specification
Result: Yes
LLM Response: "We began by testing four versions of the hGRU: one trained with BPTT for 6 steps, which was the most that could fit into the 12GB memory of the NVIDIA Titan X GPUs used for this experiment; and versions trained with RBP for 6, 20, and 30 steps." "Models were trained to optimize both of these objectives with SGD+Momentum, a learning rate of 5e-2, and batches of 40 images across 24GB NVIDIA GTX GPUs (10 total)."

Reproducibility Variable: Software Dependencies
Result: No
LLM Response: Information insufficient. The paper mentions training with 'Adam [35]' but does not specify software dependencies such as deep learning frameworks (e.g., PyTorch, TensorFlow) or other libraries, with version numbers.

Reproducibility Variable: Experiment Setup
Result: Yes
LLM Response: "The models were trained with Adam [35] and a learning rate of 3e-4 to minimize average per-pixel cross entropy on batches of 32 images. Training lasted 20 epochs on a dataset of 200,000 images." "Models were trained to optimize both of these objectives with SGD+Momentum, a learning rate of 5e-2, and batches of 40 images across 24GB NVIDIA GTX GPUs (10 total)."
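The evidence above repeatedly contrasts BPTT with RBP/C-RBP training of recurrent vision models. As a companion illustration only (not the authors' implementation), the sketch below shows, in NumPy, the equilibrium-based recurrent back-propagation gradient that C-RBP builds on: run the recurrent dynamics to a fixed point, then recover the loss adjoint with a constant-memory fixed-point (Neumann) iteration. The iteration converges when the dynamics are contractive, which is the stability property C-RBP's penalty encourages during training; here contraction is simply imposed by scaling the toy weights, and all dimensions and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8  # toy hidden-state size, purely illustrative

# Small random recurrent weights, scaled so the map is a contraction
W = 0.3 * rng.standard_normal((n, n)) / np.sqrt(n)
U = rng.standard_normal((n, n)) / np.sqrt(n)
x = rng.standard_normal(n)  # fixed input drive
y = rng.standard_normal(n)  # toy target

def F(h):
    """One step of the recurrent dynamics h_{t+1} = F(h_t)."""
    return np.tanh(W @ h + U @ x)

# 1) Run the dynamics to an (approximate) equilibrium h* = F(h*).
h = np.zeros(n)
for _ in range(200):
    h = F(h)

# 2) Jacobian of F at the equilibrium: J = diag(1 - h*^2) @ W.
J = (1.0 - h ** 2)[:, None] * W

# 3) RBP adjoint: solve g = J^T g + dL/dh* by fixed-point iteration.
#    For L = 0.5 * ||h* - y||^2, dL/dh* = h* - y. Memory cost is
#    constant in the number of recurrent steps, unlike BPTT.
dL_dh = h - y
g = np.zeros(n)
for _ in range(200):
    g = J.T @ g + dL_dh

# Reference: direct linear solve of (I - J^T) g = dL/dh*.
g_exact = np.linalg.solve(np.eye(n) - J.T, dL_dh)
print(np.allclose(g, g_exact, atol=1e-8))  # True once the iteration converges
```

The gradient with respect to the weights then follows by back-propagating `g` through a single application of `F`. If `W` were scaled so that the Jacobian's spectral norm exceeded 1, both loops above would diverge, which is the failure mode the paper's contraction constraint is designed to prevent.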