Stable and expressive recurrent vision models
Authors: Drew Linsley, Alekh Karkada Ashok, Lakshmi Narasimhan Govindarajan, Rex Liu, Thomas Serre
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that recurrent vision models trained with C-RBP can detect long-range spatial dependencies in a synthetic contour tracing task that BPTT-trained models cannot. We further show that recurrent vision models trained with C-RBP to solve the large-scale Panoptic Segmentation MS-COCO challenge outperform the leading feedforward approach, with fewer free parameters. |
| Researcher Affiliation | Academia | Drew Linsley, Alekh K Ashok, Lakshmi N Govindarajan, Rex Liu, Thomas Serre; Carney Institute for Brain Science, Department of Cognitive, Linguistic & Psychological Sciences, Brown University, Providence, RI 02912. {drew_linsley,alekh_ashok,lakshmi_govindarajan,rex_liu,thomas_serre}@brown.edu |
| Pseudocode | No | Information insufficient. The paper contains no explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present its method as structured, code-like steps. |
| Open Source Code | Yes | Code and data are available at https://github.com/c-rbp. |
| Open Datasets | Yes | We further show that recurrent vision models trained with C-RBP to solve the large-scale Panoptic Segmentation MS-COCO challenge outperform the leading feedforward approach, with fewer free parameters. C-RBP is a general-purpose learning algorithm for any application that can benefit from expansive recurrent dynamics. Code and data are available at https://github.com/c-rbp. Training lasted 20 epochs on a dataset of 200,000 images. |
| Dataset Splits | Yes | Model predictions on the COCO validation set were scored with Panoptic Quality (PQ), which is the product of metrics for semantic (IoU) and instance (F1 score) segmentation [41]. |
| Hardware Specification | Yes | We began by testing four versions of the h GRU: one trained with BPTT for 6 steps, which was the most that could fit into the 12GB memory of the NVIDIA Titan X GPUs used for this experiment; and versions trained with RBP for 6, 20, and 30 steps. Models were trained to optimize both of these objectives with SGD+Momentum, a learning rate of 5e-2, and batches of 40 images across 24GB NVIDIA GTX GPUs (10 total). |
| Software Dependencies | No | Information insufficient. The paper mentions training with 'Adam [35]' but does not name a deep learning framework (e.g., PyTorch, TensorFlow) or any other library, let alone version numbers. |
| Experiment Setup | Yes | The models were trained with Adam [35] and a learning rate of 3e-4 to minimize average per-pixel cross entropy on batches of 32 images. Training lasted 20 epochs on a dataset of 200,000 images. Models were trained to optimize both of these objectives with SGD+Momentum, a learning rate of 5e-2, and batches of 40 images across 24GB NVIDIA GTX GPUs (10 total). |
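For reference, the training hyperparameters quoted in the table can be consolidated into a single sketch. This is a hedged reconstruction from the quoted text only: the dictionary names and keys are illustrative, not taken from the authors' code, and no framework-specific details are implied since the paper names none.

```python
# Hedged consolidation of the hyperparameters reported in the paper.
# Names below are illustrative placeholders, not the authors' identifiers;
# only the numeric values are taken from the quoted text.

# Synthetic contour-tracing experiments.
contour_task_config = {
    "optimizer": "Adam",                    # cited as Adam [35] in the paper
    "learning_rate": 3e-4,
    "loss": "average per-pixel cross entropy",
    "batch_size": 32,
    "epochs": 20,
    "dataset_size": 200_000,                # images in the training set
}

# MS-COCO panoptic segmentation experiments.
coco_panoptic_config = {
    "optimizer": "SGD+Momentum",
    "learning_rate": 5e-2,
    "batch_size": 40,
    "num_gpus": 10,                         # 24GB NVIDIA GTX GPUs
    "eval_metric": "Panoptic Quality (PQ)", # product of IoU and F1 terms
}
```

Note the two regimes use different optimizers and learning rates; a reproduction attempt would need to keep these configurations separate rather than sharing one training script.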