Implicit Regularization in Feedback Alignment Learning Mechanisms for Neural Networks

Authors: Zachary Robertson, Sanmi Koyejo

ICML 2024

Reproducibility assessment: each variable is listed below with its result and the LLM's response.

Research Type: Experimental
LLM Response: In this section we empirically evaluate feedback alignment (FA) mechanisms in neural networks and their performance in multi-class classification tasks. We test two hypotheses: (1) the conservation of learning dynamics as per Theorem 5.2, Lemma 5.4, and Lemma 5.3 holds under practical training conditions; (2) alignment with the true gradient enhances multi-class classification performance. Our theory makes direct contact with experiments, confirming key quantitative predictions and consistency with previous results.
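
For orientation, feedback alignment replaces the transpose of the forward weights in the backward pass with a fixed random feedback matrix. The sketch below is a minimal, hypothetical PyTorch implementation of one such layer; it is not the package the authors use (Sanfiz & Akrout, 2021), and the names FALinearFunction and FALinear are illustrative only.

```python
import torch
import torch.nn as nn

class FALinearFunction(torch.autograd.Function):
    """Linear layer whose backward pass uses a fixed random feedback matrix."""

    @staticmethod
    def forward(ctx, x, weight, bias, feedback):
        # Standard affine forward pass: y = x W^T + b.
        ctx.save_for_backward(x, feedback)
        return x @ weight.t() + bias

    @staticmethod
    def backward(ctx, grad_out):
        x, feedback = ctx.saved_tensors
        # Feedback alignment: the error is routed backwards through the fixed
        # matrix B instead of W^T, so the forward weights never appear here.
        grad_x = grad_out @ feedback
        grad_w = grad_out.t() @ x        # local weight gradient, as in backprop
        grad_b = grad_out.sum(dim=0)
        return grad_x, grad_w, grad_b, None

class FALinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        # Fixed random feedback matrix B: never trained, never tied to W.
        self.register_buffer("feedback", 0.01 * torch.randn(out_features, in_features))

    def forward(self, x):
        return FALinearFunction.apply(x, self.weight, self.bias, self.feedback)
```

Stacking such layers and training them with ordinary SGD produces the FA dynamics whose alignment with the true backprop gradient the two hypotheses above concern.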

Researcher Affiliation: Academia
LLM Response: Zachary Robertson and Oluwasanmi Koyejo, Department of Computer Science, Stanford, California, United States.

Pseudocode: No
LLM Response: The paper does not contain any structured pseudocode or algorithm blocks.

Open Source Code: No
LLM Response: The paper states 'We use a pre-existing python package for FA implementations (Sanfiz & Akrout, 2021)' but does not provide a link to the authors' own source code or state that their code is being released.

Open Datasets: Yes
LLM Response: Our experiments involve the widely-used datasets MNIST, CIFAR-100, and Tiny ImageNet (LeCun, 1998; Krizhevsky et al., 2009; Le & Yang, 2015).
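
These are standard public datasets; a hedged loading sketch with torchvision is shown below for concreteness (the paper does not describe its data pipeline, the paths are placeholders, and Tiny ImageNet is not bundled with torchvision, so it is read from a locally downloaded copy with ImageFolder).

```python
from torchvision import datasets, transforms

tfm = transforms.ToTensor()
mnist = datasets.MNIST("./data", train=True, download=True, transform=tfm)
cifar100 = datasets.CIFAR100("./data", train=True, download=True, transform=tfm)
# Tiny ImageNet is downloaded separately; its usual layout is
# tiny-imagenet-200/train/<wnid>/images/*.JPEG, which ImageFolder can walk.
tiny_imagenet = datasets.ImageFolder("./data/tiny-imagenet-200/train", transform=tfm)
```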

Dataset Splits: No
LLM Response: The paper mentions using subsets of standard datasets (e.g., 'random 4k subset of the MNIST dataset, with 20% label noise' and 'randomly sub-sample n-class subsets from these datasets') but does not specify exact train/validation/test split percentages or methodology for these specific subsets that would allow exact reproduction of the data partitioning.
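
To make the missing detail concrete, here is one plausible way (an assumption, not the authors' procedure) to draw a random 4k MNIST subset with 20% uniform label noise; the paper does not specify how the subset is sampled or how the noisy labels are generated.

```python
import torch
from torch.utils.data import TensorDataset
from torchvision import datasets

def noisy_mnist_subset(root="./data", n=4000, noise_rate=0.2, num_classes=10, seed=0):
    """Hypothetical helper: random n-example MNIST subset with noisy labels."""
    g = torch.Generator().manual_seed(seed)
    mnist = datasets.MNIST(root, train=True, download=True)
    idx = torch.randperm(len(mnist), generator=g)[:n]          # random 4k subset
    images = mnist.data[idx].float().div(255.0).unsqueeze(1)   # (n, 1, 28, 28)
    labels = mnist.targets[idx].clone()
    # Relabel ~noise_rate of the examples uniformly at random; a small fraction
    # of the flipped labels may coincide with the true class by chance.
    flip = torch.rand(n, generator=g) < noise_rate
    labels[flip] = torch.randint(0, num_classes, (int(flip.sum()),), generator=g)
    return TensorDataset(images, labels)
```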

Hardware Specification: Yes
LLM Response: Training is conducted on two A100 GPUs.

Software Dependencies: No
LLM Response: The paper mentions 'We use a pre-existing python package for FA implementations (Sanfiz & Akrout, 2021)' but does not provide specific version numbers for this or any other software dependencies (e.g., Python, PyTorch/TensorFlow versions).

Experiment Setup: Yes
LLM Response: The paper gives two distinct training configurations. One: 'Our protocol involves a 6,000-epoch training schedule with adaptive learning rates. We have the initial learning rate set to 0.05, no weight decay, and a momentum of 0.05. The learning rate was scheduled to decrease by a factor of ten every 1,000 epochs.' The other: 'We use a batch size of 1024, and train for 500 epochs. The initial learning rate is 0.01, weight decay is 0.0001, and momentum is 0.9. The learning rate was scheduled to decrease by half at the 100th and 250th epochs.'
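
The two quoted configurations map directly onto standard SGD settings; a minimal sketch is given below, assuming PyTorch-style optimizers and schedulers (the model is a placeholder, since the paper does not release training code).

```python
import torch

model = torch.nn.Linear(784, 10)  # placeholder model

# Setup 1: 6,000 epochs, initial lr 0.05, momentum 0.05, no weight decay,
# lr divided by 10 every 1,000 epochs.
opt1 = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.05, weight_decay=0.0)
sched1 = torch.optim.lr_scheduler.StepLR(opt1, step_size=1000, gamma=0.1)

# Setup 2: 500 epochs at batch size 1024, initial lr 0.01, weight decay 1e-4,
# momentum 0.9, lr halved at epochs 100 and 250.
opt2 = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4)
sched2 = torch.optim.lr_scheduler.MultiStepLR(opt2, milestones=[100, 250], gamma=0.5)
# In both cases, scheduler.step() would be called once per epoch.
```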