Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures
Authors: Sergey Bartunov, Adam Santoro, Blake Richards, Luke Marris, Geoffrey E. Hinton, Timothy Lillicrap
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present results on the MNIST, CIFAR-10, and ImageNet datasets, explore variants of target-propagation (TP) and feedback alignment (FA) algorithms, and examine performance in both fully- and locally-connected architectures. |
| Researcher Affiliation | Collaboration | Sergey Bartunov (DeepMind), Adam Santoro (DeepMind), Blake A. Richards (University of Toronto), Luke Marris (DeepMind), Geoffrey E. Hinton (Google Brain), Timothy P. Lillicrap (DeepMind, University College London) |
| Pseudocode | Yes | Algorithm 1 Simplified Difference Target Propagation |
| Open Source Code | No | The paper includes a footnote linking to a GitHub repository (https://github.com/donghyunlee/dtp/blob/master/conti_dtp.py) but explicitly states this is for the 'original implementation of DTP' by Lee et al. [21], not the authors' own code for the work presented in this paper. No statement is made about the availability of the authors' source code. |
| Open Datasets | Yes | We present results on the MNIST, CIFAR-10, and ImageNet datasets... MNIST dataset, consisting of 28×28 gray-scale images of hand-drawn digits. CIFAR-10 is a more challenging dataset introduced by Krizhevsky [17]. ImageNet dataset [33], a large-scale benchmark... |
| Dataset Splits | No | The paper mentions 'training set' and 'test set' but does not explicitly provide details about specific splits (percentages, sample counts) for training, validation, or testing, nor does it explicitly mention the use of a distinct 'validation' set or its split. |
| Hardware Specification | Yes | on a GPU with 16GB of onboard memory, we encountered out-of-memory errors when trying to initialize and train these networks using a Tensorflow implementation. |
| Software Dependencies | No | The paper mentions 'Tensorflow implementation' but does not specify a version number for TensorFlow or any other software dependencies with version numbers. |
| Experiment Setup | Yes | For optimization we use Adam [15], with different hyper-parameters for forward and inverse models in the case of target propagation. All layers are initialized using the method suggested by Glorot & Bengio [10]. In all networks we used the hyperbolic tangent as a nonlinearity between layers... Then we fixed these architectures for BP and FA variants and ran independent hyperparameter searches for each learning method. Finally, we report best errors achieved in 500 epochs. For additional details see Tables 3 and 4 in the Appendix. (An illustrative sketch of this setup appears below the table.) |
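
The Pseudocode and Experiment Setup rows reference Algorithm 1 (Simplified Difference Target Propagation), Glorot initialization, tanh nonlinearities, and Adam. The sketch below illustrates the difference target propagation target computation following the general formulation of Lee et al. [21], with Glorot-initialized, tanh fully-connected layers. The layer sizes, target step size, squared-error output loss, and all names are illustrative assumptions, not the authors' exact Simplified DTP or hyperparameters.

```python
# Minimal NumPy sketch of difference target propagation (DTP) targets,
# under assumed layer sizes and an assumed squared-error output loss.
import numpy as np

rng = np.random.default_rng(0)

def glorot(n_in, n_out):
    # Glorot & Bengio [10] uniform initialization.
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_in, n_out))

sizes = [784, 256, 256, 10]                                  # toy fully-connected architecture
W = [glorot(a, b) for a, b in zip(sizes[:-1], sizes[1:])]    # forward weights
V = [glorot(b, a) for a, b in zip(sizes[:-1], sizes[1:])]    # learned inverse (decoder) weights

def f(l, h):   # forward mapping of layer l
    return np.tanh(h @ W[l])

def g(l, h):   # approximate inverse of layer l (maps layer l+1 back to layer l)
    return np.tanh(h @ V[l])

x = rng.standard_normal((32, 784))       # fake mini-batch
y = rng.integers(0, 10, size=32)
one_hot = np.eye(10)[y]

# Forward pass, keeping every activation.
hs = [x]
for l in range(len(W)):
    hs.append(f(l, hs[-1]))

# Top target: a small step that reduces the (assumed) squared-error output loss.
targets = [None] * len(hs)
targets[-1] = hs[-1] - 0.1 * (hs[-1] - one_hot)

# Propagate targets downward through the learned inverses,
# with the DTP difference-correction term.
for l in range(len(W) - 1, 0, -1):
    targets[l] = hs[l] + g(l, targets[l + 1]) - g(l, hs[l + 1])

# Each layer's forward weights would then be trained (e.g. with Adam, using
# separate hyper-parameters for forward and inverse models as the paper
# describes) to minimize its local loss ||f_l(h_{l-1}) - target_l||^2.
local_losses = [np.mean((f(l, hs[l]) - targets[l + 1]) ** 2) for l in range(len(W))]
print(local_losses)
```

The inverse weights V would additionally be trained with a layer-wise reconstruction loss (typically with noise injection), which is omitted here for brevity.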