Two Tales of Single-Phase Contrastive Hebbian Learning

Authors: Rasmus Høier, Christopher Zach

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our numerical experiments aim to verify two claims: first, in Section 6.1 we demonstrate that choosing different values for α makes a significant difference in the learned weight matrices. Second, Section 6.2 validates that the proposed DP method is efficient enough to enable successful training beyond toy-sized DNNs for arbitrary choices of α.
Researcher Affiliation | Academia | Department of Electrical Engineering, Chalmers University of Technology, Sweden. Correspondence to: Rasmus Høier <hier@chalmers.se>.
Pseudocode | No | The paper describes algorithms and derivations through equations and textual descriptions, but it does not include a formally labeled "Pseudocode" or "Algorithm" block.
Open Source Code | Yes | The code used in our experiments is available at: github.com/Rasmuskh/dualprop_icml_2024
Open Datasets | Yes | We ran experiments on MNIST and Fashion MNIST using a 784-512(2)-10 MLP with ReLU activation functions... We also train a 16 layer VGG network using DP with a cross-entropy classification loss on the CIFAR10 and CIFAR100 datasets... We restrict the more compute intensive ImageNet32x32 experiments to the setting α = 1/2 and β = 0.01.
Dataset Splits | Yes | 10% of the training data is held out as a validation set for model selection. The hyperparameters are listed in Section B. ... We use 5% of the training data for validation and model selection, and use the public validation dataset to evaluate the selected model.
Hardware Specification | No | The experiments were enabled by the supercomputing resource Berzelius provided by National Supercomputer Centre at Linköping University and the Knut and Alice Wallenberg foundation.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies used in the experiments (e.g., PyTorch 1.x, TensorFlow 2.x, Python 3.x).
Experiment Setup | Yes | We ran experiments on MNIST and Fashion MNIST using a 784-512(2)-10 MLP with ReLU activation functions, and Table 1 reports test accuracy and Lipschitz estimates... after 20 epochs of ADAM-based training (with learning rate 0.001 and default parameters otherwise). ... We employ standard data augmentation (random crops and horizontal flips) and carry out all experiments with 3 random seeds and report mean and std. deviation. ... The experiments of Section 6.2 were carried out with a VGG16 network and the hyperparameters listed in Table 4 (Epochs, Learning rate, Momentum, Weight-decay, Batchsize).
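
The MLP configuration quoted under Experiment Setup can be illustrated with a minimal sketch, assuming a standard PyTorch formulation: the 784-512(2)-10 layer sizes, ReLU activations, and Adam optimizer with learning rate 0.001 come from the quoted text, while the dual propagation (DP) learning rule itself, data loading, and the 10% validation split are omitted here and would need to follow the authors' released code.

    import torch
    import torch.nn as nn

    # 784-512(2)-10 MLP: two hidden layers of 512 units with ReLU activations,
    # matching the architecture quoted in the Experiment Setup row.
    model = nn.Sequential(
        nn.Flatten(),         # 28x28 MNIST / Fashion-MNIST images -> 784 inputs
        nn.Linear(784, 512), nn.ReLU(),
        nn.Linear(512, 512), nn.ReLU(),
        nn.Linear(512, 10),   # 10 output classes
    )

    # ADAM-based training with learning rate 0.001 and otherwise default
    # parameters, run for 20 epochs according to the quoted setup.
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

This sketch only reproduces the architecture and optimizer settings stated in the table; it does not implement the paper's DP updates.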