Deep Symmetry Networks
Authors: Robert Gens, Pedro M. Domingos
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we introduce deep symmetry networks (symnets), a generalization of convnets that forms feature maps over arbitrary symmetry groups. Symnets use kernel-based interpolation to tractably tie parameters and pool over symmetry spaces of any dimension. Like convnets, they are trained with backpropagation. The composition of feature transformations through the layers of a symnet provides a new approach to deep learning. Experiments on NORB and MNIST-rot show that symnets over the affine group greatly reduce sample complexity relative to convnets by better capturing the symmetries in the data. |
| Researcher Affiliation | Academia | Robert Gens, Pedro Domingos; Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195-2350, U.S.A.; {rcg,pedrod}@cs.washington.edu |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | 6.1 MNIST-rot [15] consists of 28×28 pixel greyscale images: 10^4 for training, 2×10^3 for validation, and 5×10^4 for testing. The images are sampled from the MNIST digit recognition dataset and each is rotated by a random angle in the uniform distribution [0, 2π]. |
| Dataset Splits | Yes | 6.1 MNIST-rot consists of 28×28 pixel greyscale images: 10^4 for training, 2×10^3 for validation, and 5×10^4 for testing. |
| Hardware Specification | No | No specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running experiments are provided in the paper. Only general funding information is mentioned. |
| Software Dependencies | No | We modified the Theano [5] implementation of convolutional networks so that the network consisted of a single layer of convolution and maxpooling followed by a hidden layer of 500 units and then softmax classification. The version number for Theano is not provided. |
| Experiment Setup | Yes | Both networks were trained with 50 epochs of mini-batch gradient descent with momentum, and test results are reported on the network with lowest error on the validation set (footnote 2). The convnet did best with small 5 × 5 filters and the symnet with large 20 × 20 filters. Footnote 2: Grid search over learning rate {.1, .2}, mini-batch size {10, 50, 100}, filter size {5, 10, 15, 20, 25}, number of filters {20, 50, 80}, pooling size (convnet) {2, 3, 4}, and number of control points (symnet) {5, 10, 20}. |
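
To make the abstract quoted under Research Type more concrete, the snippet below is a minimal Python sketch of evaluating one filter over a sampled affine symmetry space and pooling the responses with Gaussian-kernel weights. It is not the authors' implementation: the bilinear warping via SciPy, the Gaussian kernel, the max over weighted samples, and every function name are assumptions made for illustration.

```python
import numpy as np
from scipy.ndimage import affine_transform

def affine_responses(image, filt, transforms):
    """Filter response at each sampled affine transformation.

    Each transform is a 2x3 matrix [A | t]; the image is warped into the
    filter's frame with bilinear interpolation and dotted with the filter.
    """
    fh, fw = filt.shape
    out = []
    for T in transforms:
        warped = affine_transform(image, T[:, :2], offset=T[:, 2],
                                  output_shape=(fh, fw), order=1)
        out.append(float(np.sum(warped * filt)))
    return np.asarray(out)

def kernel_pool(responses, transforms, query, bandwidth=1.0):
    """Gaussian-kernel interpolation of the sampled responses at a query
    point in transformation space, then a max over the weighted samples
    (a stand-in for pooling over the symmetry space)."""
    flat = transforms.reshape(len(transforms), -1)
    dist = np.linalg.norm(flat - query.reshape(1, -1), axis=1)
    weights = np.exp(-0.5 * (dist / bandwidth) ** 2)
    return np.max(weights * responses)

# Toy usage: a random 28x28 image, a random 20x20 filter, and eight
# sampled rotations as the affine transformations.
rng = np.random.default_rng(0)
image = rng.standard_normal((28, 28))
filt = rng.standard_normal((20, 20))
angles = np.linspace(0, 2 * np.pi, 8, endpoint=False)
transforms = np.stack([np.array([[np.cos(a), -np.sin(a), 0.0],
                                 [np.sin(a),  np.cos(a), 0.0]]) for a in angles])
pooled = kernel_pool(affine_responses(image, filt, transforms),
                     transforms, query=transforms[0])
```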
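
The Software Dependencies row quotes the convnet baseline: a modified Theano network with one convolution-and-maxpooling layer, a 500-unit hidden layer, and softmax classification, with no Theano version given. The forward pass below is only a shape-level NumPy sketch of that architecture; the filter count, filter size, tanh nonlinearity, and initialization are placeholders rather than the paper's settings.

```python
import numpy as np

def conv_valid(x, filters):
    """Naive 'valid' 2-D convolution: x is (H, W), filters is (K, fh, fw)."""
    K, fh, fw = filters.shape
    H, W = x.shape
    out = np.empty((K, H - fh + 1, W - fw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            out[:, i, j] = np.tensordot(filters, x[i:i + fh, j:j + fw],
                                        axes=([1, 2], [0, 1]))
    return out

def max_pool(x, p):
    """Non-overlapping p x p max pooling over each feature map."""
    K, H, W = x.shape
    H2, W2 = H // p, W // p
    return x[:, :H2 * p, :W2 * p].reshape(K, H2, p, W2, p).max(axis=(2, 4))

def forward(image, params):
    """Conv -> maxpool -> tanh -> 500-unit hidden layer -> softmax."""
    filters, W_h, b_h, W_o, b_o = params
    h = np.tanh(max_pool(conv_valid(image, filters), p=2))
    h = np.tanh(h.reshape(-1) @ W_h + b_h)   # 500 hidden units
    logits = h @ W_o + b_o
    e = np.exp(logits - logits.max())
    return e / e.sum()                       # softmax class probabilities

# Placeholder shapes: 28x28 input, 20 filters of size 5x5, 2x2 pooling,
# 500 hidden units, 10 classes.
rng = np.random.default_rng(0)
filters = 0.01 * rng.standard_normal((20, 5, 5))
flat = 20 * ((28 - 5 + 1) // 2) ** 2
params = (filters,
          0.01 * rng.standard_normal((flat, 500)), np.zeros(500),
          0.01 * rng.standard_normal((500, 10)), np.zeros(10))
probs = forward(rng.standard_normal((28, 28)), params)
```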
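
Finally, the Experiment Setup row specifies 50 epochs of mini-batch gradient descent with momentum and model selection by lowest validation error over a hyperparameter grid. The loop below is a minimal sketch of that selection procedure; `train_and_validate` is a hypothetical placeholder that trains one configuration and returns its validation error, and the momentum coefficient of 0.9 is an assumption not stated in the quoted text.

```python
import itertools

# Grid from the paper's footnote. In practice the pooling sizes apply only
# to the convnet and the control-point counts only to the symnet.
grid = {
    "learning_rate": [0.1, 0.2],
    "batch_size": [10, 50, 100],
    "filter_size": [5, 10, 15, 20, 25],
    "num_filters": [20, 50, 80],
    "pool_size": [2, 3, 4],          # convnet only
    "control_points": [5, 10, 20],   # symnet only
}

def momentum_step(params, grads, velocity, lr, mu=0.9):
    """One mini-batch SGD-with-momentum update (mu is assumed)."""
    for k in params:
        velocity[k] = mu * velocity[k] - lr * grads[k]
        params[k] = params[k] + velocity[k]
    return params, velocity

def select_model(train_and_validate):
    """Enumerate the grid, train each configuration (50 epochs inside the
    hypothetical train_and_validate), and keep the lowest validation error."""
    best_error, best_config = float("inf"), None
    for values in itertools.product(*grid.values()):
        config = dict(zip(grid.keys(), values))
        val_error = train_and_validate(config, epochs=50)
        if val_error < best_error:
            best_error, best_config = val_error, config
    return best_config, best_error
```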