The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof

Authors: Derek Lim, Theo Putterman, Robin Walters, Haggai Maron, Stefanie Jegelka

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We conduct a comprehensive experimental study consisting of multiple tasks aimed at assessing the effect of removing parameter symmetries. |
| Researcher Affiliation | Collaboration | Derek Lim (MIT CSAIL, dereklim@mit.edu); Theo (Moe) Putterman (UC Berkeley, moeputterman@berkeley.edu); Robin Walters (Northeastern University); Haggai Maron (Technion, NVIDIA); Stefanie Jegelka (TU Munich, MIT) |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Our code is available at https://github.com/cptq/asymmetric-networks. |
| Open Datasets | Yes | The datasets we use are MNIST [38], CIFAR-10 [33], CIFAR-100 [33], and ogbn-arXiv [27], which are all widely used in machine learning research. |
| Dataset Splits | Yes | On the other hand, σ-Asym GNNs have 70.8%/70.1% train/validation accuracy, while W-Asym GNNs have 70.7%/70.06% train/validation accuracy. |
| Hardware Specification | Yes | Training all 20,000 classifiers takes just under 400 GPU hours (about 2 GPU-weeks) on NVIDIA RTX 2080 Ti GPUs. |
| Software Dependencies | No | No specific version numbers for software packages were provided. The paper mentions PyTorch, FFCV, and PyTorch Geometric, but without version details. |
| Experiment Setup | Yes | We use a batch size of 128 and a learning rate that warms up from .0001 to .01 over 20 epochs. In the width 8 multiplier case we train for 50 epochs, and in the width 1 multiplier case we train for 100. For σ-Asymmetric ResNets, we warm up to a learning rate of .001 instead of .01 due to training instability. |
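The warm-up quoted under Experiment Setup (learning rate ramping from .0001 to .01 over 20 epochs) can be sketched as a small helper. This is a minimal sketch, not the authors' code: the excerpt only gives the two endpoints and the 20-epoch duration, so a linear ramp is assumed, and the function name `warmup_lr` is hypothetical.

```python
def warmup_lr(epoch, start=1e-4, peak=1e-2, warmup_epochs=20):
    """Linearly ramp the learning rate from `start` to `peak` over
    `warmup_epochs` epochs, then hold it at `peak`.

    The ramp shape (linear) is an assumption; the paper excerpt only
    states the endpoints (.0001 -> .01) and the duration (20 epochs).
    """
    if epoch >= warmup_epochs:
        return peak
    return start + (peak - start) * epoch / warmup_epochs

# Endpoints match the quoted setup:
print(warmup_lr(0))   # 0.0001 at the first epoch
print(warmup_lr(20))  # 0.01 from epoch 20 onward
```

For the σ-Asymmetric ResNet variant the same helper would be called with `peak=1e-3`, matching the lower target rate the authors use to avoid training instability.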