Two Routes to Scalable Credit Assignment without Weight Symmetry

Authors: Daniel Kunin, Aran Nayebi, Javier Sagastuy-Brena, Surya Ganguli, Jonathan Bloom, Daniel Yamins

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Here, we investigate a recently proposed local learning rule that yields competitive performance with backpropagation and find that it is highly sensitive to metaparameter choices, requiring laborious tuning that does not transfer across network architectures. Our analysis indicates the underlying mathematical reason for this instability, allowing us to identify a more robust local learning rule that better transfers without metaparameter tuning. Nonetheless, we find a performance and stability gap between this local rule and backpropagation that widens with increasing model depth. We then investigate several non-local learning rules that relax the need for instantaneous weight transport into a more biologically-plausible weight estimation process, showing that these rules match state-of-the-art performance on deep networks and operate effectively in the presence of noisy updates.
Researcher Affiliation | Collaboration | 1 Institute for Computational and Mathematical Engineering, Stanford University; 2 Neurosciences PhD Program, Stanford University; 3 Department of Applied Physics, Stanford University; 4 Broad Institute of MIT and Harvard, Cambridge, MA; 5 Cellarity, Cambridge, MA; 6 Department of Psychology, Stanford University; 7 Department of Computer Science, Stanford University; 8 Wu Tsai Neurosciences Institute, Stanford University.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | To aid in this exploration going forward, we have written an open-source TensorFlow library (https://github.com/neuroailab/neuralalignment), allowing others to train arbitrary network architectures and learning rules at scale, distributed across multiple GPU or TPU workers. (See the distributed-training sketch below the table.)
Open Datasets | Yes | ImageNet categorization with deep convolutional networks.
Dataset Splits | Yes | ImageNet top-1 validation accuracy … jointly optimizing these parameters for ImageNet validation set performance using a Bayesian Tree-structured Parzen Estimator (TPE) (Bergstra et al., 2011). This search optimized for top-1 ImageNet validation performance with the ResNet-18 architecture, comprising a total of 628 distinct settings.
Hardware Specification | Yes | Distributed across multiple GPU or TPU workers. We thank the Google TensorFlow Research Cloud (TFRC) team for providing TPU resources for this project.
Software Dependencies | No | The paper mentions using an 'open-source TensorFlow library' but does not provide specific version numbers for TensorFlow or any other software dependencies.
Experiment Setup | Yes | We thus performed a large-scale metaparameter search over the continuous α, β, and the standard deviation σ of the Gaussian input noise, jointly optimizing these parameters for ImageNet validation set performance using a Bayesian Tree-structured Parzen Estimator (TPE) (Bergstra et al., 2011). (See the search sketch below the table.)
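
The Open Source Code row above points to a TensorFlow library for training arbitrary architectures and learning rules across multiple GPU or TPU workers. Below is a minimal sketch of what such distributed training looks like in TensorFlow 2.x; it is illustrative only and does not reproduce the neuralalignment repository's actual API. The model choice, optimizer settings, and the `make_imagenet_dataset` helper are placeholder assumptions.

```python
import tensorflow as tf

# Minimal multi-GPU training sketch (illustrative; not the
# neuralalignment repository's actual API). On a TPU worker,
# tf.distribute.TPUStrategy would replace MirroredStrategy.
strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    # Placeholder architecture: tf.keras ships ResNet-50, not the
    # ResNet-18 used in the paper's metaparameter search.
    model = tf.keras.applications.ResNet50(weights=None, classes=1000)
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=0.1, momentum=0.9),
        loss="sparse_categorical_crossentropy",
        metrics=["sparse_categorical_accuracy"],  # top-1 accuracy
    )

# make_imagenet_dataset is a hypothetical helper returning a tf.data
# pipeline of (image, label) batches; the strategy splits each batch
# across replicas automatically.
# dataset = make_imagenet_dataset(batch_size=256)
# model.fit(dataset, epochs=90)
```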
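
The Experiment Setup row describes a Bayesian TPE search over the continuous α, β, and the input-noise standard deviation σ. A minimal sketch of such a search, using the hyperopt library (which implements the TPE of Bergstra et al., 2011), might look as follows; the search-space bounds and the `train_and_evaluate` helper are assumptions for illustration, not values from the paper.

```python
from hyperopt import Trials, fmin, hp, tpe

# Search space for the three continuous metaparameters named in the
# paper; these bounds and priors are illustrative assumptions.
space = {
    "alpha": hp.uniform("alpha", 0.0, 1.0),
    "beta": hp.uniform("beta", 0.0, 1.0),
    "sigma": hp.loguniform("sigma", -6.0, 0.0),  # std of Gaussian input noise
}

def objective(params):
    # train_and_evaluate is a hypothetical helper that trains ResNet-18
    # with the given metaparameters and returns top-1 ImageNet
    # validation accuracy; hyperopt minimizes, so negate it.
    val_top1 = train_and_evaluate(**params)
    return -val_top1

trials = Trials()
best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=628,  # matches the 628 distinct settings reported
            trials=trials)
print("best metaparameters:", best)
```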