The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof
Authors: Derek Lim, Theo Putterman, Robin Walters, Haggai Maron, Stefanie Jegelka
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct a comprehensive experimental study consisting of multiple tasks aimed at assessing the effect of removing parameter symmetries. |
| Researcher Affiliation | Collaboration | Derek Lim (MIT CSAIL, dereklim@mit.edu), Theo (Moe) Putterman (UC Berkeley, moeputterman@berkeley.edu), Robin Walters (Northeastern University), Haggai Maron (Technion, NVIDIA), Stefanie Jegelka (TU Munich, MIT) |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Our code is available at https://github.com/cptq/asymmetric-networks. |
| Open Datasets | Yes | The datasets we use are MNIST [38], CIFAR-10 [33], CIFAR-100 [33], and ogbn-arXiv [27], which are all widely used in machine learning research. |
| Dataset Splits | Yes | On the other hand, σ-Asym GNNs have 70.8%/70.1% train/validation accuracy, while W-Asym GNNs have 70.7%/70.06% train/validation accuracy. |
| Hardware Specification | Yes | training all 20,000 classifiers takes just under 400 GPU hours (about 2 GPU-weeks) on NVIDIA RTX 2080 Ti GPUs. |
| Software Dependencies | No | No specific version numbers for software packages were provided. The paper mentions 'PyTorch', 'FFCV', and 'PyTorch Geometric' but without version details. |
| Experiment Setup | Yes | We use a batch size of 128 and a learning rate that warms up from 0.0001 to 0.01 over 20 epochs. In the width 8 multiplier case we train for 50 epochs, and in the width 1 multiplier case we train for 100. For σ-Asymmetric ResNets, we warm up to a learning rate of 0.001 instead of 0.01 due to training instability. |
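
For reference, the Experiment Setup row corresponds to a training loop roughly like the sketch below. Only the batch size, the learning-rate endpoints, the warmup length, and the epoch counts are taken from the paper; the linear warmup shape, the SGD optimizer, and the placeholder model are assumptions made for illustration.

```python
# Minimal PyTorch sketch of the reported training configuration.
# Stated values: batch size 128, LR warmup from 1e-4 to 1e-2 over 20 epochs,
# 50 epochs (width multiplier 8) or 100 epochs (width multiplier 1).
# Assumptions: SGD optimizer, linear warmup shape, placeholder model.
import torch

def warmup_lr(epoch, start_lr=1e-4, peak_lr=1e-2, warmup_epochs=20):
    """Linearly ramp the learning rate during warmup, then hold the peak value."""
    if epoch >= warmup_epochs:
        return peak_lr
    return start_lr + (peak_lr - start_lr) * epoch / warmup_epochs

model = torch.nn.Linear(784, 10)  # placeholder for the actual classifier
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)  # optimizer choice is an assumption
num_epochs = 50  # 100 for the width-1-multiplier case

for epoch in range(num_epochs):
    lr = warmup_lr(epoch)
    for group in optimizer.param_groups:
        group["lr"] = lr
    # ... run one epoch of training with batch size 128 ...
```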