Equivariant Deep Weight Space Alignment

Authors: Aviv Navon, Aviv Shamsian, Ethan Fetaya, Gal Chechik, Nadav Dym, Haggai Maron

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental results indicate that a feed-forward pass with DEEP-ALIGN produces better or equivalent alignments compared to those produced by current optimization algorithms. Additionally, our alignments can be used as an effective initialization for other methods, leading to improved solutions with a significant speedup in convergence.
Researcher Affiliation | Collaboration | 1 Bar-Ilan University, 2 NVIDIA Research, 3 Technion.
Pseudocode | No | The paper describes the architecture and its components textually and with a diagram (Figure 2), but does not provide structured pseudocode or an algorithm block.
Open Source Code | Yes | To support future research and the reproducibility of our results, we made our source code and datasets publicly available at: https://github.com/AvivNavon/deep-align.
Open Datasets | Yes | We use six network datasets. Two datasets consist of MLP classifiers for MNIST and CIFAR10, and four datasets consist of CNN classifiers trained using CIFAR10 and STL10.
Dataset Splits | Yes | When using image datasets, we use the standard train-test split and allocate 10% of the training data for validation.
Hardware Specification | Yes | We compare DEEP-ALIGN to baselines by measuring the time required to align a pair of models in the CIFAR10 CNN and VGG classifiers datasets, and report the averaged alignment time using 1000 random pairs on a single A100 Nvidia GPU.
Software Dependencies | No | The paper mentions using the AdamW optimizer and various datasets but does not specify versions for programming languages, libraries, or other software dependencies.
Experiment Setup | Yes | We use a 4-hidden-layer DEEP-ALIGN network with a hidden dimension of 64 and an output dimension of 128 from the F_DWS block. We optimize our method with a learning rate of 5e-4 using the AdamW (Loshchilov & Hutter, 2017) optimizer.
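The Dataset Splits row quotes a standard train-test split with 10% of the training data held out for validation. A minimal sketch of such a split, assuming a PyTorch/torchvision data pipeline (not taken from the authors' code), is:

```python
# Sketch: standard train/test split with 10% of the training data held out
# for validation. CIFAR10 is used here only as an example image dataset.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.ToTensor()
train_full = datasets.CIFAR10(root="data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10(root="data", train=False, download=True, transform=transform)

# Allocate 10% of the training data for validation.
n_val = int(0.1 * len(train_full))
n_train = len(train_full) - n_val
train_set, val_set = random_split(
    train_full, [n_train, n_val], generator=torch.Generator().manual_seed(0)
)
```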
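The Hardware Specification row reports alignment time averaged over 1000 random pairs on a single A100 GPU. One way such a measurement could be taken is sketched below; `align_fn` and the `models` list are hypothetical stand-ins, with `align_fn` assumed to wrap one feed-forward pass of the alignment network.

```python
# Sketch: average wall-clock alignment time over random pairs of trained
# networks on a GPU, synchronizing before and after each timed call.
import random
import time

import torch

def average_alignment_time(models, align_fn, n_pairs=1000):
    times = []
    for _ in range(n_pairs):
        net_a, net_b = random.sample(models, 2)    # pick a random pair of trained networks
        torch.cuda.synchronize()                   # ensure prior GPU work has finished
        start = time.perf_counter()
        align_fn(net_a, net_b)                     # e.g. one feed-forward pass of the aligner
        torch.cuda.synchronize()                   # wait for the alignment to complete
        times.append(time.perf_counter() - start)
    return sum(times) / len(times)
```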
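The Experiment Setup row lists a 4-hidden-layer network with hidden dimension 64, a 128-dimensional output from the F_DWS block, and AdamW with learning rate 5e-4. A sketch of that optimizer configuration follows; `DeepAlignStub` is a placeholder module standing in for the actual equivariant architecture, which is available in the linked repository.

```python
# Sketch: optimizer setup matching the reported hyperparameters
# (4 hidden layers, hidden dim 64, 128-dim features, AdamW, lr 5e-4).
import torch
import torch.nn as nn

class DeepAlignStub(nn.Module):
    """Placeholder for the DEEP-ALIGN network; not the authors' architecture."""
    def __init__(self, in_dim=128, hidden_dim=64, out_dim=128, num_hidden=4):
        super().__init__()
        layers, d = [], in_dim
        for _ in range(num_hidden):
            layers += [nn.Linear(d, hidden_dim), nn.ReLU()]
            d = hidden_dim
        layers.append(nn.Linear(d, out_dim))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

model = DeepAlignStub()
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)  # learning rate 5e-4 as reported
```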