Kickback Cuts Backprop’s Red-Tape: Biologically Plausible Credit Assignment in Neural Networks

Authors: David Balduzzi, Hastagiri Vanchinathan, Joachim Buhmann

Venue: AAAI 2015

Reproducibility variables (each entry gives the variable, the assessed result, and the supporting LLM response):

Research Type: Experimental
LLM Response: "Finally, we present experiments demonstrating that Kickback matches Backprop's performance on standard benchmark datasets."; "Experiments were performed on a 5-layer network with 2 output nodes, 10, 100 and 200 nodes in three hidden layers, and with the input layer directly drawn from the data."; "We report normalized mean-squared errors."

Researcher Affiliation: Academia
LLM Response: "David Balduzzi (david.balduzzi@vuw.ac.nz), Victoria University of Wellington; Hastagiri Vanchinathan (hastagiri@inf.ethz.ch), ETH Zurich; Joachim Buhmann (jbuhmann@inf.ethz.ch), ETH Zurich"

Pseudocode: Yes
LLM Response: "Algorithm 1 (Kickback). The truncated feedback ε_j at node j is ε_j := β · τ_j = (global error) × (influence of node j)." A runnable sketch of this update rule appears after the table.

Open Source Code: No
LLM Response: The paper does not provide any links to open-source code for the described methodology, nor does it state that the code is publicly available.

Open Datasets: Yes
LLM Response: "We present results on two robotics datasets, SARCOS¹ and Barrett WAM²." Footnotes: ¹ taken from www.gaussianprocess.org/gpml/data/; ² taken from http://www.ias.tu-darmstadt.de/Miscellaneous/Miscellaneous.

Dataset Splits: Yes
LLM Response: Each SARCOS dataset consists of 44,484 training and 4,449 test points; each Barrett dataset is split into 12,000 training and 3,000 test points. Parameters were tuned via grid search with 5-fold cross-validation. (A sketch of this tuning protocol appears after the table.)

Hardware Specification: No
LLM Response: The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used to run the experiments; it only mentions that the experiments were implemented in Theano.

Software Dependencies: No
LLM Response: The paper states "Experiments were implemented in Theano (Bergstra et al. 2010)" but does not give a version number for Theano or for any other software dependency.

Experiment Setup: Yes
LLM Response: "Experiments were performed on a 5-layer network with 2 output nodes, 10, 100 and 200 nodes in three hidden layers, and with the input layer directly drawn from the data. ... Training was performed in batch sizes of 20. ... No pretraining was used. We consider two network initializations. The first is uniform: draw weights uniformly at random from an interval symmetric about 0... The second initialization, signed, is taken from Example 1: draw weights uniformly, then change their signs..." (A sketch of both initializations appears after the table.)
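
Based on the quoted rule ε_j := β · τ_j, here is a minimal NumPy sketch of a Kickback-style update; it is not the authors' code. It assumes ReLU hidden units and a scalar squared-error readout, so the global error is a single number β, and it computes each node's influence τ_j by passing ones backwards through the active units using only the feedforward weights. All names (kickback_step, lr, the summed readout) are my own.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def kickback_step(weights, x, y, lr=1e-2):
    """One Kickback-style update (illustrative sketch, not the paper's code)."""
    # Forward pass, caching each layer's input and its ReLU activity mask.
    a, masks = [x], []
    for W in weights:
        z = a[-1] @ W
        masks.append((z > 0).astype(z.dtype))
        a.append(relu(z))
    y_hat = a[-1].sum()          # scalar readout of the top layer (assumption)
    beta = y_hat - y             # global scalar error for squared loss

    # Influence pass: tau flows backwards through active units only, reusing
    # the feedforward weights; no per-unit error signal is backpropagated.
    tau = np.ones(weights[-1].shape[1])
    grads = []
    for i in reversed(range(len(weights))):
        tau = tau * masks[i]                       # rectifier gating
        grads.append(beta * np.outer(a[i], tau))   # dW_ij ~ eps_j * a_i, eps_j = beta * tau_j
        tau = weights[i] @ tau                     # pass influence to the layer below

    for W, g in zip(weights, reversed(grads)):
        W -= lr * g
    return beta
```

For a network shaped like the paper's (three hidden layers and 2 output nodes), `weights` would be a list of four matrices, e.g. `[rng.normal(scale=0.1, size=s) for s in [(21, 200), (200, 100), (100, 10), (10, 2)]]`.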
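
The paper's tuning protocol (grid search with 5-fold cross-validation, scored by normalized mean-squared error) can be sketched as below. Since the paper's network-training code is not available, ridge regression stands in for the model, the data is synthetic, and the grid values are hypothetical; only the protocol itself follows the paper.

```python
from itertools import product

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 21))                    # stand-in data, not SARCOS itself
y = X @ rng.normal(size=21) + 0.1 * rng.normal(size=500)

grid = {"alpha": [0.1, 1.0, 10.0]}                # hypothetical grid; the paper lists none

def cv_nmse(params, X, y, n_splits=5):
    """Mean normalized MSE over 5 folds (the paper reports normalized MSE)."""
    scores = []
    for tr, va in KFold(n_splits=n_splits, shuffle=True, random_state=0).split(X):
        model = Ridge(**params).fit(X[tr], y[tr])
        err = model.predict(X[va]) - y[va]
        scores.append(np.mean(err ** 2) / np.var(y[va]))
    return float(np.mean(scores))

best = min((dict(zip(grid, vals)) for vals in product(*grid.values())),
           key=lambda p: cv_nmse(p, X, y))
print("best hyperparameters:", best)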
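
The two quoted initializations can be sketched as follows. The "uniform" scheme is stated in full; the "signed" scheme's exact sign rule is the paper's Example 1, which the quote elides, so the rule used here (one shared random sign per unit's outgoing weights, keeping each unit's influence sign-consistent) is my assumption, as is the interval width.

```python
import numpy as np

rng = np.random.default_rng(0)

def uniform_init(n_in, n_out, scale=0.1):
    """'uniform': weights drawn uniformly at random from an interval
    symmetric about 0 (the width 'scale' is my assumption)."""
    return rng.uniform(-scale, scale, size=(n_in, n_out))

def signed_init(n_in, n_out, scale=0.1):
    """'signed' (sketch): draw weights uniformly, then change their signs.
    Illustrative rule only; the paper's Example 1 gives the exact one."""
    w = rng.uniform(0.0, scale, size=(n_in, n_out))
    signs = rng.choice([-1.0, 1.0], size=(n_in, 1))   # one sign per unit
    return w * signs

# Quoted architecture: three hidden layers of 200, 100 and 10 units
# (order assumed) and 2 output nodes; 21 is the SARCOS input dimension.
sizes = [21, 200, 100, 10, 2]
weights = [signed_init(m, n) for m, n in zip(sizes, sizes[1:])]
```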