Implicit Regularization in Feedback Alignment Learning Mechanisms for Neural Networks
Authors: Zachary Robertson, Sanmi Koyejo
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we empirically evaluate feedback alignment (FA) mechanisms in neural networks and their performance in multi-class classification tasks. We test two hypotheses: 1. The conservation of learning dynamics as per Theorem 5.2, Lemma 5.4, and Lemma 5.3 holds under practical training conditions. 2. Alignment with the true gradient enhances multi-class classification performance. Our theory makes direct contact with experiments, confirming key quantitative predictions and consistency with previous results. A minimal sketch of the FA mechanism and an alignment probe is given after this table. |
| Researcher Affiliation | Academia | Zachary Robertson¹, Oluwasanmi Koyejo¹ (¹Department of Computer Science, Stanford, California, United States). |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states 'We use a pre-existing python package for FA implementations (Sanfiz & Akrout, 2021)' but does not provide a link to their own source code or state that their code is being released. |
| Open Datasets | Yes | Our experiments involve widely-used datasets MNIST, CIFAR-100, Tiny ImageNet (LeCun, 1998; Krizhevsky et al., 2009; Le & Yang, 2015). |
| Dataset Splits | No | The paper mentions using subsets of standard datasets (e.g., a random 4k subset of the MNIST dataset with 20% label noise, and randomly sub-sampled n-class subsets of the larger datasets) but does not specify exact train/validation/test split percentages or a partitioning methodology for these subsets that would allow exact reproduction. An illustrative subset-construction sketch is given after this table. |
| Hardware Specification | Yes | Training is conducted on two A100 GPUs. |
| Software Dependencies | No | The paper mentions 'We use a pre-existing python package for FA implementations (Sanfiz & Akrout, 2021)' but does not provide specific version numbers for this or any other software dependencies (e.g., Python, PyTorch/TensorFlow versions). |
| Experiment Setup | Yes | The paper describes two training setups. Setup 1: a 6,000-epoch training schedule with adaptive learning rates; initial learning rate 0.05, no weight decay, momentum 0.05; the learning rate is decreased by a factor of ten every 1,000 epochs. Setup 2: batch size 1024, trained for 500 epochs; initial learning rate 0.01, weight decay 0.0001, momentum 0.9; the learning rate is halved at the 100th and 250th epochs. Both configurations are sketched in code after this table. |
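
Feedback alignment replaces the transpose of the forward weights with a fixed random feedback matrix in the backward pass. Below is a minimal sketch, assuming a PyTorch custom autograd implementation and a cosine-similarity probe between the feedback matrix and the forward weights as a proxy for gradient alignment; it is not the package used by the authors (Sanfiz & Akrout, 2021), and the layer sizes are illustrative.

```python
# Minimal feedback-alignment (FA) linear layer: standard forward pass,
# but the backward pass routes the error through a fixed random matrix B
# instead of W^T. Illustrative sketch, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FALinearFunction(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, weight, bias, feedback):
        ctx.save_for_backward(x, weight, bias, feedback)
        return F.linear(x, weight, bias)

    @staticmethod
    def backward(ctx, grad_output):
        x, weight, bias, feedback = ctx.saved_tensors
        grad_input = grad_output @ feedback      # FA: use fixed B, not weight.T
        grad_weight = grad_output.t() @ x        # same local update rule as backprop
        grad_bias = grad_output.sum(dim=0)
        return grad_input, grad_weight, grad_bias, None


class FALinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        # Fixed random feedback matrix; registered as a buffer so it is never updated.
        self.register_buffer("feedback", torch.randn(out_features, in_features) * 0.01)

    def forward(self, x):
        return FALinearFunction.apply(x, self.weight, self.bias, self.feedback)


def alignment(layer):
    """Cosine similarity between the fixed feedback matrix and the forward weights,
    a standard proxy for how well FA updates align with the true gradient."""
    return F.cosine_similarity(
        layer.feedback.flatten(), layer.weight.detach().flatten(), dim=0
    )


if __name__ == "__main__":
    layer = FALinear(784, 10)
    x = torch.randn(32, 784)
    y = torch.randint(0, 10, (32,))
    loss = F.cross_entropy(layer(x), y)
    loss.backward()  # gradients for weight/bias come from the FA backward rule
    print("weight/feedback alignment:", alignment(layer).item())
```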
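
The Dataset Splits row quotes a random 4k MNIST subset with 20% label noise. The following is a minimal sketch of that preparation; the seed, transform, and data path are illustrative assumptions, since the paper does not specify them.

```python
# Illustrative data preparation: random 4,000-example MNIST subset with 20% label noise.
import torch
from torch.utils.data import Subset
from torchvision import datasets, transforms

g = torch.Generator().manual_seed(0)  # assumed seed; not specified in the paper
mnist = datasets.MNIST("./data", train=True, download=True,
                       transform=transforms.ToTensor())

# Random 4,000-example subset.
idx = torch.randperm(len(mnist), generator=g)[:4000]
subset = Subset(mnist, idx.tolist())

# Corrupt 20% of the subset's labels with random classes
# (a corrupted label may coincide with the true one).
noisy_mask = torch.rand(len(idx), generator=g) < 0.20
new_labels = torch.randint(0, 10, (int(noisy_mask.sum()),), generator=g)
mnist.targets[idx[noisy_mask]] = new_labels
```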
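
The two training configurations quoted under Experiment Setup could be expressed as follows. The use of PyTorch SGD with StepLR/MultiStepLR schedulers and the placeholder model are assumptions; the numeric hyperparameters come from the quoted text.

```python
# Sketch of the two quoted training configurations.
import torch

model = torch.nn.Linear(784, 10)  # placeholder; the paper's architectures are not reproduced here

# Setup 1: 6,000 epochs; lr 0.05, momentum 0.05, no weight decay;
# lr divided by 10 every 1,000 epochs.
opt_1 = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.05, weight_decay=0.0)
sched_1 = torch.optim.lr_scheduler.StepLR(opt_1, step_size=1000, gamma=0.1)

# Setup 2: 500 epochs, batch size 1024; lr 0.01, momentum 0.9, weight decay 1e-4;
# lr halved at epochs 100 and 250.
opt_2 = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4)
sched_2 = torch.optim.lr_scheduler.MultiStepLR(opt_2, milestones=[100, 250], gamma=0.5)

# In a training loop, scheduler.step() would be called once per epoch.
```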