Equivariance-aware Architectural Optimization of Neural Networks
Authors: Kaitlin Maile, Dennis George Wilson, Patrick Forré
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments across a variety of datasets show the benefit of dynamically constrained equivariance to find effective architectures with approximate equivariance. |
| Researcher Affiliation | Academia | Kaitlin Maile, IRIT, University of Toulouse (kaitlin.maile@irit.fr); Dennis G. Wilson, ISAE-SUPAERO, University of Toulouse (dennis.wilson@isae-supaero.fr); Patrick Forré, University of Amsterdam (p.d.forre@uva.nl) |
| Pseudocode | Yes | Algorithm 1 (Evolutionary equivariance-aware neural architecture search): procedure EquiNAS_E(initial symmetry group G); Algorithm 2 (Differentiable equivariance-aware neural architecture search): procedure EquiNAS_D(set of groups [G]). A hedged sketch of the evolutionary search loop is given below the table. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | The Rotated MNIST dataset (Larochelle et al., 2007, Rot MNIST) is a version of the MNIST handwritten digit dataset but with the images rotated by any angle. The Galaxy10 DECals dataset (Leung & Bovy, 2019, Galaxy10) contains galaxy images in 10 broad categories. The ISIC 2019 dataset (Codella et al., 2018; Tschandl et al., 2018; Combalia et al., 2019, ISIC) contains dermoscopic images of 8 types of skin cancer plus a null class. |
| Dataset Splits | Yes | For Rot MNIST and MNIST, we use the standard training and test splits with a batch size of 64, reserving 10% of the training data as the validation set. For Galaxy10, we set aside 10% of the dataset as the test set, reserving 10% of the remaining training data as the validation set. For ISIC, we set aside 10% of the available training dataset as the test set, reserving 10% of the remaining data as the validation set and the rest as training data. (One possible implementation of these splits is sketched below the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running experiments. |
| Software Dependencies | No | The paper mentions using SGD and Adam optimizers but does not provide specific version numbers for any key software components or libraries (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | The learning rates were selected by grid search over baselines on Rot MNIST. For all experiments in Section 6.1, we use a simple SGD optimizer with learning rate 0.1 to avoid confounding effects such as momentum during the morphism. For EquiNAS_E, the parent selection size is 5, the training time per generation is 0.5 epochs, and the number of generations is 50 for all tasks. ... For all experiments in Section 6.2, we use separate Adam optimizers for Ψ and Z, each with a learning rate of 0.01 and otherwise default settings. The total training time is 100 epochs for Rot MNIST and 50 epochs for Galaxy10 and ISIC. (The corresponding optimizer configuration is sketched below the table.) |
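
To make the quoted pseudocode headers more concrete, the following is a minimal Python sketch of an evolutionary equivariance-aware search loop in the spirit of Algorithm 1 (EquiNAS_E): start fully constrained to the initial symmetry group, briefly train and score each candidate, keep the best parents, and spawn children by relaxing one layer's constraint to a subgroup. Every name here (`Candidate`, `relax_equivariance`, `short_train_and_evaluate`) is a hypothetical placeholder rather than the authors' code, and the training step is stubbed out.

```python
# Minimal sketch of an evolutionary equivariance-aware search loop in the
# spirit of Algorithm 1 (EquiNAS_E). Names are hypothetical placeholders and
# the training/evaluation step is stubbed out with a random score.
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    # Per-layer symmetry constraints, e.g. cyclic rotation groups C8 > C4 > C2 > C1.
    layer_groups: list
    fitness: float = 0.0


# Group-to-subgroup relaxation map (C1 is the trivial group, so it stays put).
SUBGROUP = {"C8": "C4", "C4": "C2", "C2": "C1", "C1": "C1"}


def relax_equivariance(parent: Candidate) -> Candidate:
    """Spawn a child by relaxing one layer's constraint to a subgroup."""
    child = Candidate(layer_groups=list(parent.layer_groups))
    i = random.randrange(len(child.layer_groups))
    child.layer_groups[i] = SUBGROUP[child.layer_groups[i]]
    return child


def short_train_and_evaluate(candidate: Candidate) -> float:
    """Placeholder for 0.5 epochs of training followed by validation accuracy."""
    return random.random()  # stand-in for real training and evaluation


def equinas_e(initial_group="C8", n_layers=4, parents_kept=5, generations=50):
    # Start from a single network fully constrained to the initial group.
    population = [Candidate(layer_groups=[initial_group] * n_layers)]
    for _ in range(generations):
        for candidate in population:
            candidate.fitness = short_train_and_evaluate(candidate)
        # Keep the best candidates as parents and spawn relaxed children.
        population.sort(key=lambda c: c.fitness, reverse=True)
        parents = population[:parents_kept]
        children = [relax_equivariance(p) for p in parents]
        population = parents + children
    return max(population, key=lambda c: c.fitness)


if __name__ == "__main__":
    best = equinas_e()
    print("Best layer-wise groups found:", best.layer_groups)
```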
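
The dataset-split fractions quoted in the table (standard test split plus a 10% validation holdout for Rot MNIST and MNIST; 10% test and 10% validation holdouts for Galaxy10 and ISIC, batch size 64) could be reproduced with standard PyTorch utilities along the following lines. This is one plausible reading of the paper's text, not the authors' preprocessing code; the fixed `seed` is an added assumption.

```python
# One possible implementation of the quoted splits using PyTorch utilities.
# Dataset objects are assumed to be already constructed; this is an
# illustrative sketch, not the authors' pipeline.
import torch
from torch.utils.data import DataLoader, random_split


def split_with_holdout_test(full_dataset, test_frac=0.10, val_frac=0.10,
                            batch_size=64, seed=0):
    """Galaxy10 / ISIC style: carve out a test set, then a validation set."""
    g = torch.Generator().manual_seed(seed)
    n = len(full_dataset)
    n_test = int(test_frac * n)
    trainval, test = random_split(full_dataset, [n - n_test, n_test], generator=g)
    n_val = int(val_frac * len(trainval))
    train, val = random_split(trainval, [len(trainval) - n_val, n_val], generator=g)
    return (DataLoader(train, batch_size=batch_size, shuffle=True),
            DataLoader(val, batch_size=batch_size),
            DataLoader(test, batch_size=batch_size))


def split_standard_test(train_dataset, test_dataset, val_frac=0.10,
                        batch_size=64, seed=0):
    """Rot MNIST / MNIST style: keep the standard test split, hold out 10% for validation."""
    g = torch.Generator().manual_seed(seed)
    n_val = int(val_frac * len(train_dataset))
    train, val = random_split(train_dataset,
                              [len(train_dataset) - n_val, n_val], generator=g)
    return (DataLoader(train, batch_size=batch_size, shuffle=True),
            DataLoader(val, batch_size=batch_size),
            DataLoader(test_dataset, batch_size=batch_size))
```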
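
Finally, the optimizer settings quoted in the Experiment Setup row map onto standard PyTorch calls as sketched below: plain SGD with learning rate 0.1 and no momentum for the Section 6.1 experiments, and two separate Adam optimizers with learning rate 0.01 and otherwise default settings for the two parameter sets (called Ψ and Z in the paper) in Section 6.2. The function and parameter-group names are illustrative assumptions, not the authors' API.

```python
# Sketch of the quoted optimizer settings, assuming a PyTorch implementation.
# Function and parameter-group names are illustrative, not the authors' code.
import torch


def make_sgd_optimizer(model_params):
    """Section 6.1 experiments: plain SGD, lr 0.1, no momentum, so that the
    equivariance-relaxing morphisms are not confounded by optimizer state."""
    return torch.optim.SGD(model_params, lr=0.1)


def make_adam_optimizers(psi_params, z_params):
    """Section 6.2 experiments: separate Adam optimizers for the two parameter
    sets (Psi and Z in the paper), each with lr 0.01 and default settings."""
    opt_psi = torch.optim.Adam(psi_params, lr=0.01)
    opt_z = torch.optim.Adam(z_params, lr=0.01)
    return opt_psi, opt_z
```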