Bispectral Neural Networks

Authors: Sophia Sanborn, Christian A. Shewmake, Bruno Olshausen, Christopher J. Hillar

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We now test the capacity of Bispectral Networks to learn a group by learning to separate orbit classes. We conduct four experiments to analyze the properties of the architecture: learning Fourier transforms on groups (Section 4.1), group-invariant classification (Section 4.2), model completeness (Section 4.3), and extracting the Cayley table (Section 4.4).
Researcher Affiliation | Collaboration | (1) Redwood Center for Theoretical Neuroscience, (2) University of California, Berkeley, (3) Awecom, Inc.
Pseudocode | Yes | Below we list the algorithms used to recover Cayley tables from a group’s irreducible representations. Algorithm 1: GETCAYLEYFROMIRREPS. (See the first sketch below the table.)
Open Source Code | Yes | The code to implement all models and experiments in this paper can be found at github.com/sophiaas/bispectral-networks.
Open Datasets | Yes | Models are trained on datasets consisting of 100 randomly sampled (16, 16) natural image patches from the van Hateren dataset [27] that have been transformed by the two groups to generate orbit classes.
Dataset Splits | Yes | A random 20% of each dataset is set aside for model validation and is used to tune hyperparameters. (See the data sketch below the table.)
Hardware Specification | No | The paper does not specify any particular hardware used for training or inference, such as GPU or CPU models, or details about compute resources.
Software Dependencies | No | All networks were implemented and trained in PyTorch [37]. We additionally made use of the open-source CplxModule library [38], which provides implementations of the various complex-valued operations and initializations used in this work. We also used the PyTorch Metric Learning library [40]. (This mentions libraries but lacks explicit version numbers for PyTorch and PyTorch Metric Learning, which are key components.)
Experiment Setup | Yes | Weights were initialized using the complex orthogonal weight initialization method proposed in [39]. Each parameter vector in W was normalized to unit length after every gradient step. We used a batch sampler from the PyTorch Metric Learning library [40] to load batches with M random examples per class, with M = 10 and a batch size of 100. Networks were trained with the Adam optimizer [41] until convergence, using an initial learning rate of 0.002 and a cyclic learning rate scheduler [42], with 0.0001 and 0.005 as the lower and upper bounds, respectively, of the cycle. (See the training sketch below the table.)
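
To make the Pseudocode row concrete, here is a minimal sketch of recovering a Cayley table from a group's irreducible representations, in the spirit of Algorithm 1 (GETCAYLEYFROMIRREPS). The function name cayley_from_irreps and its input format (a list of arrays, one per irrep, each of shape (n_elements, d_k, d_k)) are assumptions for illustration; the paper's actual algorithm may differ in detail.

import numpy as np

def cayley_from_irreps(irreps, tol=1e-6):
    """irreps: list of arrays, one per irrep, each of shape (n_elements, d_k, d_k)."""
    n_elements = irreps[0].shape[0]
    cayley = np.zeros((n_elements, n_elements), dtype=int)
    for i in range(n_elements):
        for j in range(n_elements):
            # Multiply rho_k(g_i) @ rho_k(g_j) in every irrep.
            prods = [rho[i] @ rho[j] for rho in irreps]
            # Entry (i, j) is the element g_m whose irrep matrices match the
            # product in every irrep simultaneously.
            for m in range(n_elements):
                if all(np.allclose(p, rho[m], atol=tol) for p, rho in zip(prods, irreps)):
                    cayley[i, j] = m
                    break
    return cayley

# Example: the cyclic group Z/4, whose irreps are the 1-D characters exp(2*pi*i*k*n/4).
n = 4
irreps = [np.exp(2j * np.pi * k * np.arange(n) / n).reshape(n, 1, 1) for k in range(n)]
print(cayley_from_irreps(irreps))  # recovers the addition table modulo 4

Because a complete set of irreducible representations separates group elements, the matching element m in the inner loop is unique.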
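
For the Open Datasets and Dataset Splits rows, here is a minimal sketch of how orbit-class data and the 20% validation split could be constructed. The random patches stand in for the 100 (16, 16) van Hateren patches, and the cyclic 2-D translation group is used for illustration; the paper applies two group actions and may sample orbits differently.

import numpy as np

rng = np.random.default_rng(0)
patches = rng.standard_normal((100, 16, 16))  # stand-in for the van Hateren patches

def translation_orbit(patch):
    # All cyclic 2-D translations of a patch form one orbit class.
    h, w = patch.shape
    return np.stack([np.roll(patch, (dy, dx), axis=(0, 1))
                     for dy in range(h) for dx in range(w)])

data, labels = [], []
for cls, patch in enumerate(patches):
    orbit = translation_orbit(patch)
    data.append(orbit)
    labels.append(np.full(len(orbit), cls))
data, labels = np.concatenate(data), np.concatenate(labels)

# Hold out a random 20% of the examples for validation and hyperparameter tuning.
perm = rng.permutation(len(data))
n_val = int(0.2 * len(data))
val_idx, train_idx = perm[:n_val], perm[n_val:]
train_data, train_labels = data[train_idx], labels[train_idx]
val_data, val_labels = data[val_idx], labels[val_idx]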
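
For the Experiment Setup row, the following is a minimal training-loop sketch under the stated hyperparameters. BispectralNetwork, OrbitDataset, model.W, and loss_fn are hypothetical placeholders for the paper's actual model, data, and metric-learning loss; only the per-class batch sampler, optimizer, cyclic learning-rate schedule, and weight renormalization follow the text.

import torch
from torch.utils.data import DataLoader
from pytorch_metric_learning.samplers import MPerClassSampler

model = BispectralNetwork()   # hypothetical: the paper's model
dataset = OrbitDataset()      # hypothetical: orbit-class dataset with a .labels array

# Batches with M = 10 random examples per class and a batch size of 100.
sampler = MPerClassSampler(dataset.labels, m=10, batch_size=100,
                           length_before_new_iter=len(dataset))
loader = DataLoader(dataset, batch_size=100, sampler=sampler)

# Adam with an initial learning rate of 0.002 and a cyclic schedule bounded by [0.0001, 0.005].
optimizer = torch.optim.Adam(model.parameters(), lr=0.002)
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=0.0001, max_lr=0.005, cycle_momentum=False)

for epoch in range(500):      # "until convergence" in the paper; fixed here for illustration
    for x, y in loader:
        loss = loss_fn(model(x), y)   # hypothetical metric-learning / orbit-separation loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        scheduler.step()
        with torch.no_grad():
            # Normalize each parameter vector in W to unit length after every gradient step.
            model.W.div_(model.W.norm(dim=-1, keepdim=True))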