Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures

Authors: Sergey Bartunov, Adam Santoro, Blake Richards, Luke Marris, Geoffrey E. Hinton, Timothy Lillicrap

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present results on the MNIST, CIFAR-10, and ImageNet datasets, explore variants of target-propagation (TP) and feedback alignment (FA) algorithms, and examine performance in both fully- and locally-connected architectures.
Researcher Affiliation | Collaboration | Sergey Bartunov (DeepMind), Adam Santoro (DeepMind), Blake A. Richards (University of Toronto), Luke Marris (DeepMind), Geoffrey E. Hinton (Google Brain), Timothy P. Lillicrap (DeepMind, University College London)
Pseudocode | Yes | Algorithm 1: Simplified Difference Target Propagation (see the sketch after this table).
Open Source Code | No | The paper includes a footnote linking to a GitHub repository (https://github.com/donghyunlee/dtp/blob/master/conti_dtp.py), but this is explicitly described as the 'original implementation of DTP' by Lee et al. [21], not the authors' own code for the work presented in this paper. No statement is made about the availability of the authors' source code.
Open Datasets | Yes | We present results on the MNIST, CIFAR-10, and ImageNet datasets... MNIST dataset, consisting of 28×28 gray-scale images of hand-drawn digits. CIFAR-10 is a more challenging dataset introduced by Krizhevsky [17]. ImageNet dataset [33], a large-scale benchmark...
Dataset Splits | No | The paper refers to a 'training set' and a 'test set' but does not give specific split details (percentages or sample counts) for training, validation, or testing, and does not mention the use of a distinct validation set.
Hardware Specification | Yes | on a GPU with 16GB of onboard memory, we encountered out-of-memory errors when trying to initialize and train these networks using a Tensorflow implementation.
Software Dependencies | No | The paper mentions a 'Tensorflow implementation' but does not specify a version for TensorFlow or any other software dependency.
Experiment Setup | Yes | For optimization we use Adam [15], with different hyper-parameters for forward and inverse models in the case of target propagation. All layers are initialized using the method suggested by Glorot & Bengio [10]. In all networks we used the hyperbolic tangent as a nonlinearity between layers... Then we fixed these architectures for BP and FA variants and ran independent hyperparameter searches for each learning method. Finally, we report best errors achieved in 500 epochs. For additional details see Tables 3 and 4 in the Appendix.
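
The Pseudocode row refers to the paper's Algorithm 1, Simplified Difference Target Propagation. As a rough illustration of the backward target-computation step that family of algorithms uses, here is a minimal sketch of propagating a target through learned approximate inverses with the difference correction; the function and variable names are illustrative assumptions, not the authors' code, and in the simplified variant the output target is set directly (e.g. from the label) rather than via a gradient step.

```python
def propagate_targets(activations, output_target, inverses):
    """Compute layer-wise targets by propagating the output target backwards
    through learned approximate inverses, using the difference correction
    of (simplified) difference target propagation.

    activations  : list [h_0, ..., h_L] of forward activations (h_L is the output)
    output_target: target for the final layer (e.g. the label-derived target in SDTP)
    inverses     : list of callables; inverses[l] approximately maps layer l+1
                   activations back to layer l's space
    """
    L = len(activations) - 1            # index of the output layer
    targets = [None] * len(activations)
    targets[L] = output_target
    # Walk backwards: each hidden layer's target is its own activation plus a
    # correction obtained by inverting the next layer's target and activation.
    for l in range(L - 1, 0, -1):
        g = inverses[l]
        targets[l] = activations[l] + g(targets[l + 1]) - g(activations[l + 1])
    return targets
```

Each layer's forward weights would then be trained to move its activation toward its target (a local loss such as a squared error), and each inverse would be trained as a reconstruction of the layer below, which is what removes the need for end-to-end backpropagated gradients.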
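The Experiment Setup row quotes the paper's use of Adam with separate hyper-parameters for the forward and inverse models, Glorot & Bengio initialization, and hyperbolic-tangent nonlinearities. A minimal TensorFlow/Keras sketch of such a configuration is given below; the layer sizes and learning rates are placeholders rather than the paper's searched values (those are in its Tables 3 and 4), and this is not the authors' implementation.

```python
import tensorflow as tf

# Placeholder hidden-layer sizes; the paper's actual architectures are in its appendix.
hidden_sizes = [256, 256]

# Glorot (Xavier) initialization and tanh nonlinearities, as described in the setup.
init = tf.keras.initializers.GlorotUniform()
layers = [tf.keras.layers.Dense(n, activation="tanh", kernel_initializer=init)
          for n in hidden_sizes]
layers.append(tf.keras.layers.Dense(10, kernel_initializer=init))  # e.g. 10 classes for MNIST/CIFAR-10
model = tf.keras.Sequential(layers)

# Separate Adam optimizers for the forward and inverse models, each with its own
# learning rate found by hyperparameter search (values here are placeholders).
forward_opt = tf.keras.optimizers.Adam(learning_rate=1e-3)
inverse_opt = tf.keras.optimizers.Adam(learning_rate=3e-4)
```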