Exploiting Redundancy: Separable Group Convolutional Networks on Lie Groups

Authors: David M. Knigge, David W. Romero, Erik J. Bekkers

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our approach across several vision datasets, and show that our weight sharing leads to improved performance and computational efficiency. In many settings, separable G-CNNs outperform their nonseparable counterpart, while only using a fraction of their training time. In addition, thanks to the increase in computational efficiency, we are able to implement G-CNNs equivariant to the Sim(2) group; the group of dilations, rotations and translations of the plane. Sim(2)-equivariance further improves performance on all tasks considered, and achieves state-of-the-art performance on rotated MNIST. (See the kernel-factorisation sketch below the table.)
Researcher Affiliation | Academia | ¹University of Amsterdam, The Netherlands; ²Vrije Universiteit Amsterdam, The Netherlands. Correspondence to: David M. Knigge <d.m.knigge@uva.nl>.
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. It provides mathematical derivations and conceptual diagrams.
Open Source Code | Yes | Code is available on GitHub.
Open Datasets | Yes | Rotated MNIST: The 62.000 MNIST images (LeCun et al., 1998) are split into a training, validation and test set of 10.000, 2.000 and 50.000 images respectively, and randomly rotated to orientations in [0, 2π). ... CIFAR10: We evaluate our models on the CIFAR10 dataset, containing 60.000 32×32 color images in 10 balanced classes (Krizhevsky et al., 2009).
Dataset Splits | Yes | Rotated MNIST: The 62.000 MNIST images (LeCun et al., 1998) are split into a training, validation and test set of 10.000, 2.000 and 50.000 images respectively, and randomly rotated to orientations in [0, 2π). (See the split sketch below the table.)
Hardware Specification | Yes | All models are trained on a single Titan V.
Software Dependencies | No | The paper mentions 'Pytorch conv2d' but does not specify version numbers for PyTorch or any other software dependencies, which are necessary for full reproducibility.
Experiment Setup | Yes | All architectures are trained with Adam optimisation (Kingma & Ba, 2014), and 1e-4 weight decay. All models trained on rotated MNIST, except for the state-of-the-art runs detailed in B.2, are trained for 200 epochs with a batch size of 128 and a learning rate of 1 × 10⁻⁴. ... For the SIREN, we used an architecture of two hidden layers of 64 units. We found a value for ω0 of 10 to work well in all our experiments. (A hedged sketch of this setup appears below the table.)
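The "Research Type" row above summarises the paper's central idea: sharing one spatial kernel across the group dimension instead of learning a full kernel per group element. The snippet below is a minimal PyTorch sketch of that factorisation only, not the authors' implementation; the class name SeparableGroupKernel, the tensor shapes and the parameter counts are illustrative assumptions, and the equivariant transformation of the kernel under each group element (the part that makes a G-CNN equivariant) is deliberately omitted.

```python
import torch
import torch.nn as nn

class SeparableGroupKernel(nn.Module):
    """Illustrative sketch (not the paper's code): factorise a group-convolution
    kernel k(h, x) into a per-group-element weight k_H(h) and a spatial kernel
    k_xy(x), so the spatial profile is shared across the group axis."""

    def __init__(self, out_channels, in_channels, group_size, kernel_size):
        super().__init__()
        # One scalar weight per (out-channel, in-channel, group-element) triple.
        self.k_group = nn.Parameter(
            torch.randn(out_channels, in_channels, group_size, 1, 1))
        # One spatial kernel per (out-channel, in-channel) pair, shared over h.
        self.k_spatial = nn.Parameter(
            torch.randn(out_channels, in_channels, 1, kernel_size, kernel_size))

    def forward(self):
        # Broadcasting reconstructs a full kernel of shape
        # (out, in, group, k, k) from far fewer free parameters.
        return self.k_group * self.k_spatial


# Rough parameter comparison for 16 -> 32 channels, 8 group elements, 5x5 kernels.
full_params = 32 * 16 * 8 * 5 * 5                            # non-separable: 102,400
separable = SeparableGroupKernel(32, 16, 8, 5)
sep_params = sum(p.numel() for p in separable.parameters())  # 4,096 + 12,800 = 16,896
```

This kind of parameter and compute reduction is what the quoted abstract credits for making larger groups such as Sim(2) practical to implement.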
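The "Dataset Splits" row describes a 10.000 / 2.000 / 50.000 train/validation/test split of 62.000 rotated MNIST images. The original rotated-MNIST benchmark ships as fixed, pre-rotated files, so the following is only a hedged approximation that rebuilds a comparable split from torchvision's standard MNIST with on-the-fly rotations; the random seed and the 8.000 discarded images are assumptions of this sketch, not the benchmark's recipe.

```python
import torch
from torch.utils.data import ConcatDataset, random_split
from torchvision import datasets, transforms

# Uniform rotations over the full circle, matching "orientations in [0, 2π)".
rotate = transforms.Compose([
    transforms.RandomRotation(degrees=(0.0, 360.0)),
    transforms.ToTensor(),
])

# Standard MNIST has 60,000 train + 10,000 test images; pool them and carve out
# the 62,000 images of the rotated-MNIST split (10k/2k/50k), dropping the
# remaining 8,000 (an assumption of this sketch).
pool = ConcatDataset([
    datasets.MNIST("./data", train=True, download=True, transform=rotate),
    datasets.MNIST("./data", train=False, download=True, transform=rotate),
])
train_set, val_set, test_set, _ = random_split(
    pool, [10_000, 2_000, 50_000, 8_000],
    generator=torch.Generator().manual_seed(0),
)
```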
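Finally, the "Experiment Setup" row fixes the optimiser (Adam with 1e-4 weight decay), the schedule (200 epochs, batch size 128, learning rate 1 × 10⁻⁴ on rotated MNIST), and a SIREN kernel parameterisation with two hidden layers of 64 units and ω0 = 10. The sketch below wires those numbers together under my own assumptions: the Sine module, the make_siren helper and the input/output sizes are placeholders rather than the authors' code, and the SIREN-specific weight initialisation is omitted.

```python
import torch
import torch.nn as nn

class Sine(nn.Module):
    """Sine nonlinearity used in SIREN layers: x -> sin(omega_0 * x)."""
    def __init__(self, omega_0: float = 10.0):
        super().__init__()
        self.omega_0 = omega_0

    def forward(self, x):
        return torch.sin(self.omega_0 * x)


def make_siren(in_features: int, out_features: int,
               hidden: int = 64, omega_0: float = 10.0) -> nn.Sequential:
    """Two hidden layers of 64 units with sine activations, per the quoted setup."""
    return nn.Sequential(
        nn.Linear(in_features, hidden), Sine(omega_0),
        nn.Linear(hidden, hidden), Sine(omega_0),
        nn.Linear(hidden, out_features),
    )


# Placeholder sizes: e.g. group-element coordinates in, kernel values out.
kernel_net = make_siren(in_features=3, out_features=16)
optimizer = torch.optim.Adam(kernel_net.parameters(), lr=1e-4, weight_decay=1e-4)

EPOCHS, BATCH_SIZE = 200, 128   # quoted rotated-MNIST schedule
```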