A Theoretical Framework for Target Propagation
Authors: Alexander Meulemans, Francesco Carzaniga, Johan Suykens, João Sacramento, Benjamin F. Grewe
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our theory is corroborated by experimental results that show significant improvements in performance and in the alignment of forward weight updates with loss gradients, compared to DTP. |
| Researcher Affiliation | Academia | Institute of Neuroinformatics, University of Zürich and ETH Zürich; ESAT-STADIUS, KU Leuven |
| Pseudocode | No | The paper states 'algorithms available in SM' (supplementary materials) in Section 4. Pseudocode or algorithm blocks are not included in the main text of the paper. |
| Open Source Code | Yes | PyTorch implementation of all methods is available on github.com/meulemansalex/theoretical_framework_for_target_propagation |
| Open Datasets | Yes | We evaluate the new DTP variants on a set of standard image classification datasets: MNIST (LeCun, 1998), Fashion-MNIST (Xiao et al., 2017) and CIFAR10 (Krizhevsky et al., 2014). |
| Dataset Splits | Yes | For the hyperparameter searches, we used a validation set of 5000 samples from the training set for all datasets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU/CPU models or memory specifications. |
| Software Dependencies | No | The paper mentions 'PyTorch implementation' and that models were 'optimized by Adam (Kingma and Ba, 2014)', but it does not specify version numbers for PyTorch or any other software libraries or dependencies. |
| Experiment Setup | Yes | We used fully connected networks with tanh nonlinearities, with a softmax output layer and cross-entropy loss, optimized by Adam (Kingma and Ba, 2014) [...] We used targets to train all layers in DTP and its variants [...] We report the test errors corresponding to the epoch with the best validation errors [...] (5x256 fully connected (FC) hidden layers for MNIST and Fashion-MNIST, 3x FC1024 for CIFAR10) [...] a small CNN with tanh nonlinearities (Conv5x5x32; Maxpool3x3; Conv5x5x64; Maxpool3x3; FC512; FC10) [...] training of 100 epochs. |
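The Dataset Splits row above states that a validation set of 5000 samples was taken from the training set. The paper does not describe the exact splitting procedure or random seed, so the following is only a minimal sketch of one plausible way to reproduce such a split with PyTorch; the seed and use of `random_split` are assumptions.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Hypothetical reproduction of "a validation set of 5000 samples from the
# training set": split the MNIST training set into train/validation parts.
train_full = datasets.MNIST("data", train=True, download=True,
                            transform=transforms.ToTensor())
train_set, val_set = random_split(
    train_full,
    [len(train_full) - 5000, 5000],
    generator=torch.Generator().manual_seed(0),  # seed is an assumption
)
```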
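The Experiment Setup row describes the network architectures used in the paper: fully connected tanh networks (5x256 hidden layers for MNIST/Fashion-MNIST, 3x1024 for CIFAR10) and a small tanh CNN (Conv5x5x32; Maxpool3x3; Conv5x5x64; Maxpool3x3; FC512; FC10), trained with cross-entropy over a softmax output and Adam. The sketch below assembles those shapes from standard PyTorch layers purely for illustration; the function name `fully_connected_net`, the padding/stride conventions, and the optimizer settings beyond the quote are assumptions, and the sketch does not implement the DTP target-based training procedure from the authors' repository.

```python
import torch
import torch.nn as nn

def fully_connected_net(in_features, hidden_size, num_hidden, num_classes=10):
    """Tanh MLP, e.g. 5x256 hidden layers for MNIST, 3x1024 for CIFAR10."""
    layers = [nn.Flatten()]
    prev = in_features
    for _ in range(num_hidden):
        layers += [nn.Linear(prev, hidden_size), nn.Tanh()]
        prev = hidden_size
    layers.append(nn.Linear(prev, num_classes))  # logits; softmax is folded into the loss
    return nn.Sequential(*layers)

# Small CNN for CIFAR10 (3x32x32 inputs), following the quoted layer list.
small_cnn = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=5), nn.Tanh(),   # Conv5x5x32
    nn.MaxPool2d(3),                              # Maxpool3x3
    nn.Conv2d(32, 64, kernel_size=5), nn.Tanh(),  # Conv5x5x64
    nn.MaxPool2d(3),                              # Maxpool3x3
    nn.Flatten(),
    nn.LazyLinear(512), nn.Tanh(),                # FC512 (input size inferred; padding not stated)
    nn.Linear(512, 10),                           # FC10
)

model = fully_connected_net(28 * 28, 256, 5)      # MNIST-style network
criterion = nn.CrossEntropyLoss()                 # cross-entropy over softmax outputs
optimizer = torch.optim.Adam(model.parameters())  # Adam, as stated in the setup
```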