A Theoretical Framework for Target Propagation

Authors: Alexander Meulemans, Francesco Carzaniga, Johan Suykens, João Sacramento, Benjamin F. Grewe

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our theory is corroborated by experimental results that show significant improvements in performance and in the alignment of forward weight updates with loss gradients, compared to DTP.
Researcher Affiliation | Academia | ¹Institute of Neuroinformatics, University of Zürich and ETH Zürich; ²ESAT-STADIUS, KU Leuven
Pseudocode | No | Section 4 of the paper refers to 'algorithms available in SM' (supplementary materials); pseudocode or algorithm blocks are not included in the main text of the paper.
Open Source Code | Yes | A PyTorch implementation of all methods is available at github.com/meulemansalex/theoretical_framework_for_target_propagation
Open Datasets | Yes | We evaluate the new DTP variants on a set of standard image classification datasets: MNIST (LeCun, 1998), Fashion-MNIST (Xiao et al., 2017) and CIFAR10 (Krizhevsky et al., 2014).
Dataset Splits | Yes | For the hyperparameter searches, we used a validation set of 5000 samples from the training set for all datasets. (A minimal loading-and-split sketch is given after the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU/CPU models or memory specifications.
Software Dependencies | No | The paper mentions a 'PyTorch implementation' and that models were 'optimized by Adam (Kingma and Ba, 2014)', but it does not specify version numbers for PyTorch or any other software libraries or dependencies.
Experiment Setup | Yes | We used fully connected networks with tanh nonlinearities, with a softmax output layer and cross-entropy loss, optimized by Adam (Kingma and Ba, 2014) [...] We used targets to train all layers in DTP and its variants [...] We report the test errors corresponding to the epoch with the best validation errors [...] (5x256 fully connected (FC) hidden layers for MNIST and Fashion-MNIST, 3x FC1024 for CIFAR10) [...] a small CNN with tanh nonlinearities (Conv5x5x32; Maxpool3x3; Conv5x5x64; Maxpool3x3; FC512; FC10) [...] training of 100 epochs. (Both architectures are sketched after the table.)
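
The Open Datasets and Dataset Splits rows can be illustrated with a minimal PyTorch/torchvision sketch. This is not the authors' data pipeline (that lives in the linked repository); the dataset root path, the ToTensor-only preprocessing, and the fixed random seed are assumptions made here for illustration, while the 5000-sample validation hold-out follows the quote above.

    import torch
    from torch.utils.data import random_split
    from torchvision import datasets, transforms

    # Illustrative only: load one of the cited datasets (MNIST here) and hold out
    # 5000 training samples for hyperparameter validation, as described in the table.
    transform = transforms.ToTensor()  # assumed preprocessing
    train_full = datasets.MNIST(root="./data", train=True, download=True, transform=transform)
    test_set = datasets.MNIST(root="./data", train=False, download=True, transform=transform)

    val_size = 5000
    train_set, val_set = random_split(
        train_full,
        [len(train_full) - val_size, val_size],
        generator=torch.Generator().manual_seed(0),  # assumed seed, for a reproducible split
    )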
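
Similarly, the Experiment Setup row can be made concrete with a sketch of the two quoted architectures as plain feedforward modules; the target-propagation training logic itself is in the authors' repository, not here. Convolution padding and pooling stride are not stated in the excerpt and are assumptions, and the softmax is folded into nn.CrossEntropyLoss at training time.

    import torch
    import torch.nn as nn

    def fc_net(in_features=784, hidden=256, n_hidden=5, n_classes=10):
        # 5x256 tanh hidden layers for MNIST / Fashion-MNIST; the CIFAR10 variant
        # quoted above would use in_features=3072, hidden=1024, n_hidden=3.
        layers, width = [nn.Flatten()], in_features
        for _ in range(n_hidden):
            layers += [nn.Linear(width, hidden), nn.Tanh()]
            width = hidden
        layers.append(nn.Linear(width, n_classes))  # logits; softmax lives in the loss
        return nn.Sequential(*layers)

    # Small CNN for CIFAR10: Conv5x5x32; Maxpool3x3; Conv5x5x64; Maxpool3x3; FC512; FC10.
    small_cnn = nn.Sequential(
        nn.Conv2d(3, 32, kernel_size=5, padding=2), nn.Tanh(),  # padding is an assumption
        nn.MaxPool2d(kernel_size=3),                            # stride defaults to the kernel size
        nn.Conv2d(32, 64, kernel_size=5, padding=2), nn.Tanh(),
        nn.MaxPool2d(kernel_size=3),
        nn.Flatten(),
        nn.LazyLinear(512), nn.Tanh(),   # infers the flattened size at the first forward pass
        nn.Linear(512, 10),
    )

    loss_fn = nn.CrossEntropyLoss()                        # cross-entropy over softmax logits
    optimizer = torch.optim.Adam(small_cnn.parameters())   # Adam, as quoted; default hyperparameters assumed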