Biologically-Plausible Learning Algorithms Can Scale to Large Datasets

Authors: Will Xiao, Honglin Chen, Qianli Liao, Tomaso Poggio

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Here, we additionally evaluate the sign-symmetry (SS) algorithm (Liao et al., 2016b)... We examined the performance of sign-symmetry and feedback alignment on ImageNet and MS COCO datasets using different network architectures (ResNet-18 and AlexNet for ImageNet; RetinaNet for MS COCO).
Researcher Affiliation | Academia | Will Xiao, Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA (xiaow@fas.harvard.edu); Honglin Chen, Department of Mathematics, University of California, Los Angeles, Los Angeles, CA 90095, USA (chenhonglin@g.ucla.edu); Qianli Liao and Tomaso Poggio, Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139, USA (lql@mit.edu, tp@csail.mit.edu)
Pseudocode | No | The paper describes equations (Equations 1 and 2) for the computation but does not include structured pseudocode or algorithm blocks (a hedged sketch of the update-rule family is given after the table).
Open Source Code | Yes | We implement both algorithms in PyTorch for convolutional and fully-connected layers and post the code at https://github.com/willwx/sign-symmetry. (An illustrative PyTorch sketch follows the table.)
Open Datasets | Yes | We examined the performance of sign-symmetry and feedback alignment on ImageNet and MS COCO datasets using different network architectures (ResNet-18 and AlexNet for ImageNet; RetinaNet for MS COCO).
Dataset Splits | Yes | Figure 1: a, Top-1 and b, top-5 validation error on ImageNet for ResNet-18 and AlexNet trained with different learning algorithms. Dashed lines, ResNet-18 reference performance (Johnson et al., 2016).
Hardware Specification | Yes | We gratefully acknowledge the support of NVIDIA Corporation with the donation of the DGX-1 used for this research.
Software Dependencies | No | The paper mentions PyTorch as the implementation framework but does not specify its version number or any other software dependencies with version details.
Experiment Setup | Yes | For backpropagation, standard training parameters were used (SGD with learning rate 0.1, momentum 0.9, and weight decay 10⁻⁴). For ResNet-18 with other learning algorithms, we used SGD with learning rate 0.053, while momentum and weight decay remain unchanged. For AlexNet with all learning algorithms, standard training parameters were used (SGD with learning rate 0.01, momentum 0.9, and weight decay 5×10⁻⁴). We used a version of AlexNet (Krizhevsky, 2014, as used in torchvision) which we slightly modified to add batch normalization (Ioffe & Szegedy, 2015) before every nonlinearity and consequently removed dropout. For all experiments, we used a batch size of 256, a learning rate decay of 10-fold every 10 epochs, and trained for 50 epochs. (An equivalent PyTorch configuration is sketched after the table.)
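As noted in the Pseudocode row, the paper states the learning rules only as equations. For orientation, the general form these rules take in the feedback-alignment literature can be sketched as follows; this is a reconstruction from the surrounding description, not a verbatim copy of the paper's Equations 1 and 2:

    \delta_i = \bigl(B_{i+1}\,\delta_{i+1}\bigr) \odot f'(y_i), \qquad \Delta W_i \propto -\,\delta_i\, x_{i-1}^{\top}

where choosing the feedback matrix B_{i+1} = W_{i+1}^{\top} recovers backpropagation, a fixed random B_{i+1} gives feedback alignment, and B_{i+1} = \mathrm{sign}(W_{i+1}^{\top}) gives sign-symmetry.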
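The released code at https://github.com/willwx/sign-symmetry implements both algorithms for convolutional and fully-connected layers. The repository itself was not reviewed here; the snippet below is only a minimal, hypothetical sketch of how a sign-symmetric fully-connected layer could be written in PyTorch with a custom autograd Function, assuming the backward pass propagates the error through sign(W) while the weight gradient is computed as usual:

    import torch
    from torch import nn

    class SignSymmetricLinearFn(torch.autograd.Function):
        # Hypothetical sketch, not the authors' released implementation.
        @staticmethod
        def forward(ctx, x, weight, bias):
            ctx.save_for_backward(x, weight)
            return x @ weight.t() + bias

        @staticmethod
        def backward(ctx, grad_out):
            x, weight = ctx.saved_tensors
            # Error signal is propagated through the sign of the forward weights.
            grad_x = grad_out @ torch.sign(weight)
            # Weight and bias gradients use the incoming error, as in backprop.
            grad_w = grad_out.t() @ x
            grad_b = grad_out.sum(dim=0)
            return grad_x, grad_w, grad_b

    class SignSymmetricLinear(nn.Module):
        def __init__(self, in_features, out_features):
            super().__init__()
            self.weight = nn.Parameter(0.01 * torch.randn(out_features, in_features))
            self.bias = nn.Parameter(torch.zeros(out_features))

        def forward(self, x):
            return SignSymmetricLinearFn.apply(x, self.weight, self.bias)

A feedback-alignment variant would instead register a fixed random feedback matrix and use it in place of torch.sign(weight) in the backward pass.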
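The Experiment Setup row quotes the optimization hyperparameters explicitly. Below is a minimal sketch of an equivalent PyTorch configuration for the AlexNet runs (SGD, learning rate 0.01, momentum 0.9, weight decay 5×10⁻⁴, 10-fold learning-rate decay every 10 epochs, 50 epochs, batch size 256); the model and data loader are stand-ins, since the paper's AlexNet variant adds batch normalization before every nonlinearity and removes dropout, which torchvision's stock model does not reflect:

    import torch
    from torch import nn, optim
    from torchvision import models

    # Stand-in model: torchvision's AlexNet (the paper's variant additionally
    # inserts BatchNorm before every nonlinearity and removes dropout).
    model = models.alexnet(num_classes=1000)

    # Hyperparameters quoted in the paper for the AlexNet runs.
    optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)
    # 10-fold learning-rate decay every 10 epochs; training runs for 50 epochs.
    scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
    criterion = nn.CrossEntropyLoss()

    def train(train_loader, epochs=50):
        # train_loader is assumed to yield ImageNet batches of size 256.
        model.train()
        for _ in range(epochs):
            for images, labels in train_loader:
                optimizer.zero_grad()
                loss = criterion(model(images), labels)
                loss.backward()
                optimizer.step()
            scheduler.step()

The ResNet-18 backpropagation baseline would differ only in the optimizer settings quoted above (learning rate 0.1, weight decay 10⁻⁴).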