Can Forward Gradient Match Backpropagation?

Authors: Louis Fournier, Stephane Rivaud, Eugene Belilovsky, Michael Eickenberg, Edouard Oyallon

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We now describe how we implemented our models, training procedure, and the implementations of Gradient Targets and Guesses to study the accuracy of a given model under the variations of those parameters.
Researcher Affiliation | Academia | 1 Sorbonne Université, CNRS, ISIR, Paris, France; 2 CCM, Flatiron Institute, New York, USA; 3 MILA, Concordia University, Montréal, Canada.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our source code is available at: github.com/streethagore/ForwardLocalGradient.
Open Datasets | Yes | We considered the CIFAR-10 and ImageNet32 datasets, used with standard data augmentation. Chrabaszcz et al. (2017) has demonstrated that, in general, the conclusions drawn from the ImageNet32 dataset are also applicable to the full-resolution ImageNet dataset.
Dataset Splits | No | The paper mentions using 'validation accuracy' and 'cross-validate' but does not provide specific details on the dataset splits (percentages or counts) or cite a specific predefined split that defines these.
Hardware Specification | No | The paper mentions using 'AI resources of IDRIS' and 'resources from Compute Canada and Calcul Quebec' but does not specify exact hardware models such as GPU or CPU types.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers, such as library or framework versions.
Experiment Setup | Yes | We followed a standard training procedure: SGD with a momentum of 0.9 and weight decay of 5×10⁻⁴. For CIFAR-10, we train the model for 100 epochs, with a learning rate decayed by 0.2 every 30 epochs. For ImageNet32, we also first try a shorter training of 70 epochs, decaying the learning rate by 0.1 every 20 epochs. The initial learning rate was chosen among {0.05, 0.01, 0.005} for CIFAR-10 and {0.1, 0.05, 0.01, 0.005, 0.0001} for ImageNet32.
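
The Research Type row quotes the paper's description of its study of Gradient Targets and Guesses. For context, the estimator at the heart of the paper is the forward gradient: the true gradient is replaced by its projection onto a random "guess" direction, computed with forward-mode automatic differentiation. The sketch below shows a basic weight-perturbed variant in PyTorch (assuming >= 2.0 with torch.func); the toy model, function name, and hyperparameters are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of a weight-perturbed forward-gradient step (assumes
# PyTorch >= 2.0 with torch.func). The toy model, function name, and
# hyperparameters are illustrative, not the authors' released code.
import torch
import torch.nn as nn
from torch.func import functional_call, jvp

model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))
loss_fn = nn.CrossEntropyLoss()

def forward_gradient_step(model, x, y, lr=0.05):
    params = dict(model.named_parameters())

    # Loss as a function of the parameters, so forward-mode AD can be applied.
    def loss_of_params(p):
        return loss_fn(functional_call(model, p, (x,)), y)

    # Random "guess" direction v (one tangent tensor per parameter).
    tangents = {k: torch.randn_like(v) for k, v in params.items()}

    # A single forward pass yields the loss and the directional derivative <grad L, v>.
    loss, dir_deriv = jvp(loss_of_params, (params,), (tangents,))

    # Forward-gradient estimate: <grad L, v> * v is an unbiased estimate of grad L.
    with torch.no_grad():
        for k, p in params.items():
            p -= lr * dir_deriv * tangents[k]
    return loss.item()

# Example usage on a dummy CIFAR-10-shaped batch.
x = torch.randn(8, 3, 32, 32)
y = torch.randint(0, 10, (8,))
print(forward_gradient_step(model, x, y))
```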
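
The Open Datasets row only says "standard data augmentation" without specifying the transforms. A common CIFAR-10 pipeline, given here purely as an assumption about what "standard" means, is random cropping with padding, horizontal flips, and per-channel normalization, e.g. with torchvision:

```python
# A commonly used CIFAR-10 augmentation pipeline (random crop with padding,
# horizontal flip, per-channel normalization). This is an assumption about
# what "standard data augmentation" means; the paper does not spell it out.
import torchvision
import torchvision.transforms as T

train_transform = T.Compose([
    T.RandomCrop(32, padding=4),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    T.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=train_transform
)
```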
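
The Experiment Setup row translates directly into an optimizer and schedule. Continuing from the two sketches above, the following reflects the reported CIFAR-10 settings (SGD, momentum 0.9, weight decay 5×10⁻⁴, 100 epochs, learning rate decayed by 0.2 every 30 epochs, initial learning rate 0.05 from the cross-validated grid); the batch size and the backprop-based loop body are assumptions.

```python
# Optimizer, schedule, and loop skeleton matching the reported CIFAR-10 setup:
# SGD, momentum 0.9, weight decay 5e-4, 100 epochs, learning rate decayed by
# 0.2 every 30 epochs, initial learning rate 0.05 (one of the cross-validated
# values). The batch size and the backprop-based loop body are assumptions;
# `model`, `loss_fn`, and `train_set` come from the sketches above.
import torch
from torch.utils.data import DataLoader

train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=4)

optimizer = torch.optim.SGD(model.parameters(), lr=0.05,
                            momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.2)

for epoch in range(100):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()        # or replace with a forward-gradient update
        optimizer.step()
    scheduler.step()
```

For ImageNet32, the quoted shorter schedule would instead run 70 epochs with StepLR(step_size=20, gamma=0.1) and an initial learning rate chosen from the larger grid.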