Two Tales of Single-Phase Contrastive Hebbian Learning
Authors: Rasmus Høier, Christopher Zach
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our numerical experiments aim to verify two claims: first, in Section 6.1 we demonstrate that choosing different values for α makes a significant difference in the learned weight matrices; second, Section 6.2 validates that the proposed DP method is efficient enough to enable successful training beyond toy-sized DNNs for arbitrary choices of α. |
| Researcher Affiliation | Academia | Department of Electrical Engineering, Chalmers University of Technology, Sweden. Correspondence to: Rasmus Høier <hier@chalmers.se>. |
| Pseudocode | No | The paper describes algorithms and derivations through equations and textual descriptions, but it does not include a formally labeled "Pseudocode" or "Algorithm" block. |
| Open Source Code | Yes | The code used in our experiments is available at: github.com/Rasmuskh/dualprop_icml_2024 |
| Open Datasets | Yes | We ran experiments on MNIST and Fashion MNIST using a 784-512(×2)-10 MLP with ReLU activation functions... We also train a 16-layer VGG network using DP with a cross-entropy classification loss on the CIFAR10 and CIFAR100 datasets... We restrict the more compute-intensive ImageNet 32x32 experiments to the setting α = 1/2 and β = 0.01. |
| Dataset Splits | Yes | 10% of the training data is held out as a validation set for model selection. The hyperparameters are listed in Section B. ... We use 5% of the training data for validation and model selection, and use the public validation dataset to evaluate the selected model. (A data-split sketch follows the table.) |
| Hardware Specification | No | The experiments were enabled by the supercomputing resource Berzelius provided by the National Supercomputer Centre at Linköping University and the Knut and Alice Wallenberg Foundation. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies used in the experiments (e.g., PyTorch 1.x, TensorFlow 2.x, Python 3.x). |
| Experiment Setup | Yes | We ran experiments on MNIST and Fashion MNIST using a 784-512(×2)-10 MLP with ReLU activation functions, and Table 1 reports test accuracy and Lipschitz estimates... after 20 epochs of ADAM-based training (with learning rate 0.001 and default parameters otherwise). ... We employ standard data augmentation (random crops and horizontal flips) and carry out all experiments with 3 random seeds and report mean and std. deviation. ... The experiments of Section 6.2 were carried out with a VGG16 network and the hyperparameters listed in Table 4 (epochs, learning rate, momentum, weight decay, batch size). (A minimal configuration sketch follows the table.) |
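
For concreteness, here is a minimal sketch of the dataset split quoted in the Dataset Splits row: MNIST with 10% of the training data held out for model selection. Only the dataset name and the 10% fraction come from the paper; the use of PyTorch/torchvision, the `random_split` helper, and the fixed seed are illustrative assumptions, not the authors' actual data pipeline.

```python
# Hypothetical data-split sketch: MNIST with a 10% validation holdout,
# as quoted in the Dataset Splits row. Library choice and seed are assumptions.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

full_train = datasets.MNIST(root="data", train=True, download=True,
                            transform=transforms.ToTensor())
n_val = int(0.10 * len(full_train))           # 10% held out for model selection
n_train = len(full_train) - n_val
train_set, val_set = random_split(
    full_train, [n_train, n_val],
    generator=torch.Generator().manual_seed(0)  # fixed seed so the split is reproducible
)
test_set = datasets.MNIST(root="data", train=False, download=True,
                          transform=transforms.ToTensor())
```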
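And this is a hedged sketch of the configuration quoted in the Experiment Setup row: a 784-512(×2)-10 ReLU MLP trained for 20 epochs with Adam at learning rate 0.001 and default parameters otherwise. It mirrors only the architecture and optimizer settings; the paper's dual-propagation (DP) learning rule is not reproduced here, and the PyTorch module/optimizer names are assumptions for illustration.

```python
# Hypothetical configuration sketch of the 784-512(x2)-10 ReLU MLP and the
# quoted Adam settings (lr = 0.001, 20 epochs, defaults otherwise).
# This does NOT implement the paper's dual-propagation updates.
import torch.nn as nn
import torch.optim as optim

mlp = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 512), nn.ReLU(),   # first hidden layer
    nn.Linear(512, 512), nn.ReLU(),   # second hidden layer
    nn.Linear(512, 10),               # 10-way output
)
optimizer = optim.Adam(mlp.parameters(), lr=1e-3)  # default betas/eps otherwise
criterion = nn.CrossEntropyLoss()
num_epochs = 20
```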