Equivariance Through Parameter-Sharing

Authors: Siamak Ravanbakhsh, Jeff Schneider, Barnabás Póczos

ICML 2017

Reproducibility Variable Result LLM Response
Research Type | Theoretical | The novelty of this work is its focus on model symmetry as a gateway to equivariance, which yields new theoretical guarantees for a strict notion of equivariance in neural networks. The core idea is simple: consider a colored bipartite graph Ω representing a neural network layer, where edges of the same color represent tied parameters. The layer, viewed as a function, is equivariant to the actions of a given group G (and nothing more) iff the action of G is the symmetry group of Ω; i.e., there is a simple bijection between parameter symmetries and equivariances of the corresponding neural network. The problem then boils down to designing colored bipartite graphs with given symmetries, which constitutes a major part of this paper. (A minimal code sketch of this idea appears after the table.)
Researcher Affiliation | Academia | School of Computer Science, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA, USA 15217. Correspondence to: Siamak Ravanbakhsh <mravanba@cs.cmu.edu>.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | No | The paper is theoretical and does not involve empirical experiments with datasets, so no dataset access information for training is provided.
Dataset Splits | No | The paper is theoretical and does not describe empirical experiments with datasets, so no dataset split information (e.g., training, validation, test splits) is provided.
Hardware Specification | No | The paper is theoretical and does not describe any empirical experiments or the hardware used to run them.
Software Dependencies | No | The paper is theoretical and does not describe any empirical experiments or software dependencies with specific version numbers.
Experiment Setup | No | The paper is theoretical and does not describe any empirical experimental setup details, such as hyperparameters or training configurations.
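
To make the parameter-sharing idea concrete, here is a minimal NumPy sketch in the spirit of the paper's framework (not code from the paper; the function name, NumPy usage, and parameter values are illustrative assumptions). It builds a weight matrix from two edge colors, one tied parameter for all diagonal edges and one for all off-diagonal edges, so the symmetry group of the corresponding colored bipartite graph is the full symmetric group S_n, and the layer is permutation-equivariant.

import numpy as np

def tied_layer(x, lam=0.7, gam=0.3):
    # Two edge "colors": lam ties all diagonal edges, gam ties all
    # off-diagonal edges. The colored bipartite graph of this layer
    # has the symmetric group S_n as its symmetry group.
    n = x.shape[0]
    W = gam * np.ones((n, n)) + (lam - gam) * np.eye(n)
    return np.tanh(W @ x)

rng = np.random.default_rng(0)
x = rng.normal(size=5)
P = np.eye(5)[rng.permutation(5)]  # random 5x5 permutation matrix
# Equivariance check: permuting the input permutes the output.
assert np.allclose(tied_layer(P @ x), P @ tied_layer(x))

Untying any single off-diagonal weight would change the color symmetry of the graph and, by the iff characterization summarized above, break equivariance to some permutation.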