Exploiting Cyclic Symmetry in Convolutional Neural Networks

Authors: Sander Dieleman, Jeffrey De Fauw, Koray Kavukcuoglu

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the effect of these architectural modifications on three datasets which exhibit rotational symmetry and demonstrate improved performance with smaller models. ... 6. Experiments ... 6.1. Datasets ... 6.2. Experimental setup ... Table 2. Number of model parameters and results on the plankton dataset (cross-entropy, lower is better).
Researcher Affiliation | Industry | Sander Dieleman (SEDIELEM@GOOGLE.COM), Jeffrey De Fauw (DEFAUW@GOOGLE.COM), Koray Kavukcuoglu (KORAYK@GOOGLE.COM), Google DeepMind
Pseudocode | No | The paper describes the operations and summarises them in Table 1, but it does not contain structured pseudocode or algorithm blocks (a minimal sketch of the slicing and pooling operations follows the table).
Open Source Code | Yes | A fast GPU implementation of the rolling operation for Theano (using CUDA kernels) is available at https://github.com/benanne/kaggle-ndsb.
Open Datasets | Yes | The Plankton dataset (Cowen et al., 2015) consists of 30,336 grayscale images... The Galaxies dataset consists of 61,578 colour images... The Massachusetts buildings dataset (Mnih, 2013) consists of 1500 × 1500 aerial images...
Dataset Splits | Yes | We split this set into separate validation and training sets of 3,037 and 27,299 images respectively. [Plankton] ... We split the dataset into a validation set of 6,157 images and a training set of 55,421 images. [Galaxies] ... it was split into a training set of 137 images, a validation set of 4 images and a test set of 10 images. [Massachusetts buildings]
Hardware Specification | No | The paper mentions 'a fast GPU implementation' but does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for running the experiments.
Software Dependencies | No | The paper mentions 'Theano (using CUDA kernels)' but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | We use the Adam optimisation method (Kingma & Ba, 2014) for all experiments, because it allows us to avoid retuning learning rates when cyclic layers are inserted. We use discrete learning rate schedules with tenfold decreases near the end of training, following Krizhevsky et al. (2012). For the plankton dataset we also use weight decay for additional regularisation. We use data augmentation to reduce overfitting, including random rotation between 0° and 360°. ... We reduced the batch size used for training by a factor of 4. (A sketch of this configuration follows the table.)
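
Since the paper provides no pseudocode (see the Pseudocode row above), the following is a minimal NumPy sketch of the cyclic slicing and pooling operations it describes. The function names are our own, the pooling step is shown in the simple post-flatten placement where no feature-map realignment is needed, and the more involved rolling operation is left to the linked kaggle-ndsb repository; treat this as an illustration, not the authors' implementation.

```python
import numpy as np

def cyclic_slice(x):
    """Cyclic slicing: stack the four 90-degree rotations of each example
    along the batch axis. Shape (N, C, H, W) -> (4N, C, H, W)."""
    return np.concatenate([np.rot90(x, k, axes=(2, 3)) for k in range(4)], axis=0)

def cyclic_pool(feats, op=np.mean):
    """Cyclic pooling: combine the four rotated copies of each example.
    Applied here to flattened features of shape (4N, D), where the copies
    only need to be grouped, not spatially realigned. Returns (N, D)."""
    n = feats.shape[0] // 4
    return op(feats.reshape(4, n, -1), axis=0)

# Toy usage: a batch of 2 single-channel 8x8 images.
x = np.random.randn(2, 1, 8, 8)
sliced = cyclic_slice(x)        # shape (8, 1, 8, 8)
feats = sliced.reshape(8, -1)   # stand-in for a network's flattened output
pooled = cyclic_pool(feats)     # shape (2, 64)
```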
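
Likewise, since the Experiment Setup row quotes the optimiser, learning-rate schedule, and augmentation choices without code, here is a minimal PyTorch stand-in for that configuration. The paper used Theano; the toy model, epoch counts, learning rate, weight-decay value, and milestone epochs below are illustrative assumptions, not the paper's values.

```python
import torch
from torch import nn, optim

# Tiny placeholder model; the paper's architectures are far larger.
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 10),
)

# Adam, with weight decay as the extra regulariser used for the plankton dataset.
optimizer = optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)

# Discrete learning-rate schedule with tenfold decreases near the end of training.
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[80, 95], gamma=0.1)

# Dummy loader standing in for the real pipeline; random rotation between
# 0 and 360 degrees could be applied on the image side with
# torchvision.transforms.RandomRotation((0, 360)).
train_loader = [(torch.randn(8, 1, 32, 32), torch.randint(0, 10, (8,)))]

for epoch in range(100):
    for images, targets in train_loader:
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(images), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

The choice of Adam matters here for the reason the quote gives: it let the authors insert cyclic layers without retuning learning rates.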