Task-Driven Convolutional Recurrent Models of the Visual System
Authors: Aran Nayebi, Daniel Bear, Jonas Kubilius, Kohitij Kar, Surya Ganguli, David Sussillo, James J. DiCarlo, Daniel L. Yamins
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We found that standard forms of recurrence (vanilla RNNs and LSTMs) do not perform well within deep CNNs on the ImageNet task. In contrast, novel cells that incorporated two structural features, bypassing and gating, were able to boost task accuracy substantially (a sketch of such a cell follows the table). |
| Researcher Affiliation | Collaboration | (1) Neurosciences PhD Program, Stanford University, Stanford, CA 94305; (2) Department of Psychology, Stanford University, Stanford, CA 94305; (3) Department of Computer Science, Stanford University, Stanford, CA 94305; (4) Department of Applied Physics, Stanford University, Stanford, CA 94305; (5) McGovern Institute for Brain Research, MIT, Cambridge, MA 02139; (6) Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139; (7) Brain and Cognition, KU Leuven, Leuven, Belgium; (8) Google Brain, Google, Inc., Mountain View, CA 94043; (9) Wu Tsai Neurosciences Institute, Stanford, CA 94305 |
| Pseudocode | No | The paper describes the architecture and update rules for the network in text and mathematical formulas, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | This choice required an unrolling scheme different from that used in the standard TensorFlow RNN library, the code for which (and for all of our models) can be found at https://github.com/neuroailab/tnn (a sketch of the unrolling idea follows the table). |
| Open Datasets | Yes | CNNs trained to recognize objects in the ImageNet dataset predict the time-averaged neural responses of cortical neurons better than any other model class. |
| Dataset Splits | Yes | This median model reached a final Top-1 ImageNet accuracy nearly equal to that of a ResNet-34 model with nearly twice as many layers, even though the ConvRNN used only 75% as many parameters (ResNet-34: 21.8M parameters, 73.1% Validation Top-1; Median ConvRNN: 15.5M parameters, 72.9% Validation Top-1). |
| Hardware Specification | Yes | To test whether the LSTM models underperformed for this reason, we searched over training hyperparameters and common structural variants of the LSTM to better adapt this local structure to deep convolutional networks, using hundreds of second-generation Google Cloud Tensor Processing Units (TPUv2s). |
| Software Dependencies | No | The paper mentions using the TensorFlow library but does not specify a version number or provide version details for any other software dependencies. |
| Experiment Setup | Yes | We searched over learning hyperparameters (e.g. gradient clip values, learning rate) as well as structural hyperparameters (e.g. gate convolution filter sizes, channel depth, whether or not to use peephole connections, etc.); a toy search-space sketch follows the table. |
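
The "bypassing and gating" finding quoted in the Research Type row is the paper's central architectural claim. Below is a minimal PyTorch sketch of what such a recurrent convolutional cell could look like; the class name, wiring, and shapes are illustrative assumptions on our part, not the authors' released implementation (which lives in the `tnn` repository linked above).

```python
import torch
import torch.nn as nn

class GatedBypassConvCell(nn.Module):
    """Hypothetical cell combining the paper's two structural features:
    gating (multiplicative control over how much of the hidden state is
    overwritten) and bypassing (a skip path that lets the feedforward
    input reach the output without passing through the recurrence)."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        # Gate and candidate state are computed from input and state jointly.
        self.gate = nn.Conv2d(2 * channels, channels, kernel_size, padding=pad)
        self.cand = nn.Conv2d(2 * channels, channels, kernel_size, padding=pad)

    def forward(self, x, h):
        # x: feedforward input at this layer; h: hidden state from t - 1.
        xh = torch.cat([x, h], dim=1)
        u = torch.sigmoid(self.gate(xh))                  # gating
        h_new = (1.0 - u) * h + u * torch.tanh(self.cand(xh))
        return x + h_new, h_new                           # bypass on the output

cell = GatedBypassConvCell(channels=64)
x = torch.randn(1, 64, 28, 28)            # one feature map
out, h = cell(x, torch.zeros_like(x))     # state starts at zero
```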
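The Open Source Code row mentions an unrolling scheme unlike the standard TensorFlow RNN library's. A hedged sketch of the idea as we read it: the whole layered network graph is stepped through global time, so each layer consumes the activations its input layer produced at the previous timestep, rather than a single cell being stepped over a sequence at fixed depth. Function and variable names here are hypothetical.

```python
import torch

def unroll_network(cells, image, n_timesteps):
    """Full-network unroll: a new image takes one timestep per layer to
    reach the top, and every layer keeps updating its own recurrent state.
    `cells` is a list of cells like GatedBypassConvCell above; channel
    counts are assumed equal across layers for simplicity."""
    states = [None] * len(cells)
    acts = [None] * len(cells)
    for _ in range(n_timesteps):
        prev = list(acts)  # activations from timestep t - 1
        for i, cell in enumerate(cells):
            inp = image if i == 0 else prev[i - 1]
            if inp is None:
                continue  # the wavefront has not reached this layer yet
            if states[i] is None:
                states[i] = torch.zeros_like(inp)
            acts[i], states[i] = cell(inp, states[i])
    return acts[-1]

net = [GatedBypassConvCell(channels=64) for _ in range(3)]
top = unroll_network(net, torch.randn(1, 64, 28, 28), n_timesteps=5)
```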
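For the Experiment Setup row, here is a toy sketch of the kind of search space the quoted sentence describes; the keys match the hyperparameters the paper names, but the candidate values are invented for illustration.

```python
import itertools
import random

# Illustrative search space only; the paper does not publish its exact grid.
search_space = {
    "learning_rate":    [0.1, 0.01, 0.001],
    "gradient_clip":    [1.0, 5.0, 10.0],
    "gate_filter_size": [1, 3, 5],
    "channel_depth":    [64, 128, 256],
    "use_peephole":     [True, False],
}

# Random search: sample configurations instead of exhausting the grid.
grid = [dict(zip(search_space, vals))
        for vals in itertools.product(*search_space.values())]
for config in random.sample(grid, k=10):
    pass  # train and evaluate a model with `config` here
```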