Improved Generalization of Weight Space Networks via Augmentations
Authors: Aviv Shamsian, Aviv Navon, David W. Zhang, Yan Zhang, Ethan Fetaya, Gal Chechik, Haggai Maron
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on three types of INR datasets: grayscale images (FMNIST), color images (CIFAR10), and 3D shapes (ModelNet40). Our results indicate that data augmentation schemes, and specifically our proposed weight space MixUp variants, can enhance the accuracy of weight space models by up to 18%, equivalent to using 10 times more training data. (A hedged sketch of weight-space MixUp appears after this table.) |
| Researcher Affiliation | Collaboration | Bar-Ilan University; University of Amsterdam; Samsung SAIT AI Lab, Montreal; NVIDIA Research; Technion. |
| Pseudocode | No | The paper describes methods and equations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | To support future research and the reproducibility of our results, we made our source code and datasets publicly available at: https://github.com/AvivSham/deep-weight-space-augmentations. |
| Open Datasets | Yes | To address this issue, we present new INR classification benchmarks based on ModelNet40 (Wu et al., 2015), Fashion-MNIST (Xiao et al., 2017), and CIFAR10 (Krizhevsky et al., 2009) datasets. |
| Dataset Splits | Yes | We split the INRs dataset into train, validation, and test sets of sizes 55K, 5K, and 10K respectively. Additionally, we utilize the validation set for early stopping, i.e., selecting the best model w.r.t. validation accuracy. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions specific optimizers like AdamW (Loshchilov & Hutter, 2017) and uses the SIREN (Sitzmann et al., 2020) architecture, but it does not list general software dependencies (e.g., Python, PyTorch, CUDA) with specific version numbers. |
| Experiment Setup | Yes | In all experiments, we use a DWS (Navon et al., 2023b) network with 4 hidden layers and a hidden dimension of 128. We optimized the network using a 5e-3 learning rate with the AdamW (Loshchilov & Hutter, 2017) optimizer. For the GNN, we use the version of the Relational Transformer presented in (Zhang et al., 2023) with 4 hidden layers, a node dimension of 64, and an edge dimension of 32. We optimized the network using a 1e-3 learning rate with the AdamW (Loshchilov & Hutter, 2017) optimizer and a 1000-step warmup schedule. We optimized the weight space architecture for 250 epochs for ModelNet40, and 300 epochs for the FMNIST and CIFAR10 INRs datasets. Additionally, we utilize the validation set for early stopping, i.e., selecting the best model w.r.t. validation accuracy. (A hedged configuration sketch of these settings appears after this table.) |
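
The quoted abstract refers to weight-space MixUp variants. Below is a minimal sketch of what a "direct" weight-space MixUp of two INRs could look like, assuming each INR is stored as a list of per-layer weight and bias tensors with matching shapes. The `weight_space_mixup` helper and the toy SIREN-like shapes are illustrative assumptions; the paper's actual variants (including alignment-based MixUp) are implemented in the linked repository.

```python
import torch

def weight_space_mixup(params_a, params_b, labels_a, labels_b, alpha=0.2):
    """Direct weight-space MixUp sketch: convexly combine two INRs' parameters
    (and their one-hot labels) using a single Beta-sampled coefficient."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    mixed_params = [lam * pa + (1 - lam) * pb for pa, pb in zip(params_a, params_b)]
    mixed_label = lam * labels_a + (1 - lam) * labels_b
    return mixed_params, mixed_label

# Toy usage with two 2-layer "INRs" (input dim 2, hidden dim 32, scalar output).
inr_a = [torch.randn(32, 2), torch.randn(32), torch.randn(1, 32), torch.randn(1)]
inr_b = [torch.randn(32, 2), torch.randn(32), torch.randn(1, 32), torch.randn(1)]
y_a, y_b = torch.eye(10)[3], torch.eye(10)[7]  # one-hot class labels
mixed_inr, mixed_y = weight_space_mixup(inr_a, inr_b, y_a, y_b)
```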
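
The reported optimization settings can also be summarized as a short configuration sketch. This is an assumption-laden reconstruction, not the authors' code: the `model` argument stands in for the DWS network or Relational Transformer cited above, and the linear warmup shape is a guess, since the paper only states the warmup length (1000 steps).

```python
import torch

def configure_optimization(model, use_gnn=False, warmup_steps=1000):
    """Sketch of the reported settings: AdamW with lr 5e-3 for the DWS network;
    lr 1e-3 plus a 1000-step warmup for the Relational Transformer (GNN)."""
    if use_gnn:
        optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
        # The warmup schedule's shape is an assumption; only its length is reported.
        scheduler = torch.optim.lr_scheduler.LinearLR(
            optimizer, start_factor=1e-2, end_factor=1.0, total_iters=warmup_steps
        )
    else:
        optimizer = torch.optim.AdamW(model.parameters(), lr=5e-3)
        scheduler = None
    return optimizer, scheduler

# Epoch budgets reported per INR dataset.
EPOCHS = {"modelnet40": 250, "fmnist": 300, "cifar10": 300}

# Example usage with a placeholder model.
opt, sched = configure_optimization(torch.nn.Linear(4, 2), use_gnn=True)
```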