Align, then memorise: the dynamics of learning with feedback alignment
Authors: Maria Refinetti, Stéphane D’Ascoli, Ruben Ohana, Sebastian Goldt
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments on MNIST and CIFAR10 clearly demonstrate degeneracy breaking in deep non-linear networks and show that the align-then-memorize process occurs sequentially from the bottom layers of the network to the top. |
| Researcher Affiliation | Collaboration | ¹Department of Physics, École Normale Supérieure, Paris, France; ²IdePHICS laboratory, EPFL; ³Facebook AI Research, Paris, France; ⁴LightOn, Paris, France; ⁵International School of Advanced Studies (SISSA), Trieste, Italy. |
| Pseudocode | No | The paper does not include pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | The paper's Reproducibility statement reads: "We host all the code to reproduce our experiments online at https://github.com/sdascoli/dfa-dynamics." |
| Open Datasets | Yes | The numerical experiments are run on MNIST and CIFAR10, both publicly available benchmark datasets. |
| Dataset Splits | No | The paper does not explicitly specify dataset splits (e.g., percentages for train/validation/test sets) or how these splits were obtained. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | We explore to what extent degeneracy breaking occurs in deep nonlinear networks by training 4-layer multi-layer perceptrons (MLPs) with 100 nodes per layer for 1000 epochs with both BP and DFA, on the MNIST and CIFAR10 datasets, with Tanh and ReLU nonlinearities (cf. App. E.2 for further experimental details). Parameters: N = 500, L = 2, M = 2, η = 0.1, σ₀ = 10⁻². A minimal DFA training sketch follows the table. |
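To make the reported setup concrete, below is a minimal NumPy sketch of training a 4-layer, 100-unit Tanh MLP with Direct Feedback Alignment (DFA). The toy random data, squared loss, linear readout, and full-batch updates are illustrative assumptions, not the paper's exact protocol; the authors' actual MNIST/CIFAR10 experiments live in the linked repository.

```python
# Minimal DFA training sketch, assuming the 4-layer, 100-node Tanh MLP
# described in the experiment setup. Data, loss, and readout are toy
# stand-ins; learning rate and weight scale reuse the listed eta and sigma_0.
import numpy as np

rng = np.random.default_rng(0)

# Toy data standing in for MNIST (784 inputs, 10 one-hot classes).
X = rng.standard_normal((256, 784))
Y = np.eye(10)[rng.integers(0, 10, size=256)]

sizes = [784, 100, 100, 100, 10]          # 4 weight layers, 100 nodes each
lr, sigma0 = 0.1, 1e-2                    # eta = 0.1, sigma_0 = 10^-2
W = [sigma0 * rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
# Fixed random feedback matrices: one per hidden layer, projecting the output
# error straight back instead of using the transposed forward weights.
B = [rng.standard_normal((m, sizes[-1])) for m in sizes[1:-1]]

def dtanh(a):
    return 1.0 - np.tanh(a) ** 2

for epoch in range(1000):
    # Forward pass, keeping pre-activations a[l] and activations h[l].
    h, a = [X], []
    for l, w in enumerate(W):
        a.append(h[-1] @ w.T)
        h.append(np.tanh(a[-1]) if l < len(W) - 1 else a[-1])  # linear readout
    e = h[-1] - Y                         # error signal of the squared loss

    # DFA update: hidden layer l uses (e @ B[l].T) * f'(a[l]) in place of the
    # backpropagated delta; the readout layer is trained exactly as in BP.
    deltas = [(e @ B[l].T) * dtanh(a[l]) for l in range(len(B))] + [e]
    for l, d in enumerate(deltas):
        W[l] -= lr * d.T @ h[l] / len(X)
```

The only departure from backpropagation is the `deltas` line: each hidden layer receives the output error through its own fixed random matrix `B[l]` rather than the transposed forward weights, which is what makes the "align, then memorise" dynamics studied in the paper possible.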