Learning Strides in Convolutional Neural Networks

Authors: Rachid Riad, Olivier Teboul, David Grangier, Neil Zeghidour

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on audio and image classification show the generality and effectiveness of our solution: we use DiffStride as a drop-in replacement to standard downsampling layers and outperform them."
Researcher Affiliation | Collaboration | "Rachid Riad¹, Olivier Teboul², David Grangier² & Neil Zeghidour². ¹ENS, INRIA, INSERM, UPEC, PSL Research University; ²Google Research. rachid.riad@ens.fr, {teboul, grangier, neilz}@google.com"
Pseudocode | Yes | "All these steps are summarized by Algorithm 1 and illustrated on a single-channel image in Figure 1."
Open Source Code | Yes | "We release our implementation of DiffStride: https://github.com/google-research/diffstride"
Open Datasets | Yes | "Experiments on audio and image classification show the generality and effectiveness of our solution: we use DiffStride as a drop-in replacement to standard downsampling layers and outperform them. In particular, we show that introducing our layer into a ResNet-18 architecture allows keeping consistent high performance on CIFAR10, CIFAR100 and ImageNet even when training starts from poor random stride configurations. [...] CIFAR10 consists of 32 × 32 images labeled in 10 classes with 6,000 images per class. We use the official split, with 50,000 images for training and 10,000 images for testing. [...] We also compare the ResNet-18 architectures on the ImageNet dataset (Deng et al., 2009), which contains 1,000 classes. The models are trained on the official training split of the ImageNet dataset (1.28M images) and we report our results on the validation set (50k images)."
Dataset Splits | Yes | "CIFAR10 consists of 32 × 32 images labeled in 10 classes with 6,000 images per class. We use the official split, with 50,000 images for training and 10,000 images for testing. [...] The models are trained on the official training split of the ImageNet dataset (1.28M images) and we report our results on the validation set (50k images)."
Hardware Specification | Yes | "Table A.2: Per-step time and peak memory usage of Spectral Pooling and DiffStride relative to strided convolutions, on a V100 GPU."
Software Dependencies | Yes | "Moreover, we release TensorFlow 2.0 code for training a Pre-Act ResNet-18 with strided convolutions, spectral pooling or DiffStride on CIFAR10 and CIFAR100, with DiffStride being implemented as a stand-alone, reusable Keras layer."
Experiment Setup | Yes | "We train on all datasets with stochastic gradient descent (SGD) (Bottou et al., 1998) with a learning rate of 0.1, a batch size of 256 and a momentum (Qian, 1999) of 0.9. On CIFAR, we train models for 400 epochs, dividing the learning rate by 10 at 200 epochs and again by 10 at 300 epochs, with a weight decay of 5×10⁻³. For CIFAR, we apply random cropping on the input images and left-right random flipping. On ImageNet, we train with a weight decay of 1×10⁻³ for 90 epochs, dividing the learning rate by 10 at epochs 30, 60 and 80. We apply random cropping on the input images as in (Szegedy et al., 2015) and left-right random flipping."
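The Pseudocode and Open Source Code rows reference the paper's Algorithm 1, which downsamples a feature map by cropping the low-frequency region of its 2D spectrum with a mask parameterized by learnable, possibly fractional strides. The sketch below shows only the underlying spectral cropping (the fixed-mask spectral-pooling special case), in NumPy; the differentiable stride mask that makes strides learnable is omitted, and the function name and rescaling convention are illustrative assumptions, not the released API.

```python
import numpy as np

def spectral_downsample(x, stride):
    """Downsample a single-channel image by cropping low frequencies.

    Sketch of the spectral-pooling mechanism DiffStride builds on; the
    learnable, differentiable stride mask of the paper is not included.
    """
    h, w = x.shape
    new_h, new_w = int(h / stride), int(w / stride)
    # Centre the spectrum so low frequencies sit in the middle.
    spectrum = np.fft.fftshift(np.fft.fft2(x))
    # Keep a (new_h, new_w) window of low frequencies around the centre.
    top = (h - new_h) // 2
    left = (w - new_w) // 2
    cropped = spectrum[top:top + new_h, left:left + new_w]
    # Back to the spatial domain; rescale so the output amplitude
    # stays on the same scale as the input (a convention choice).
    out = np.fft.ifft2(np.fft.ifftshift(cropped)).real
    return out * (new_h * new_w) / (h * w)

x = np.random.default_rng(0).standard_normal((32, 32))
y = spectral_downsample(x, 2.0)
print(y.shape)  # (16, 16)
```

Because the crop size is computed from a real-valued stride, non-integer strides such as 1.5 yield intermediate output sizes (here 21 × 21 from a 32 × 32 input), which is what lets DiffStride treat strides as continuous parameters.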
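The Experiment Setup row describes step learning-rate schedules (divide by 10 at fixed epochs). A minimal plain-Python sketch of that schedule, with the function name chosen here for illustration:

```python
def step_lr(base_lr, boundaries, epoch, factor=0.1):
    """Step schedule: multiply base_lr by `factor` (0.1 = divide by 10)
    at each boundary epoch reached, as in the experiment setup."""
    lr = base_lr
    for boundary in boundaries:
        if epoch >= boundary:
            lr *= factor
    return lr

# CIFAR: 400 epochs, learning rate divided by 10 at epochs 200 and 300.
assert step_lr(0.1, (200, 300), 199) == 0.1
assert abs(step_lr(0.1, (200, 300), 350) - 0.001) < 1e-12

# ImageNet: 90 epochs, divided by 10 at epochs 30, 60 and 80.
print(step_lr(0.1, (30, 60, 80), 85))
```

In the released TensorFlow code this would typically be expressed with a built-in piecewise-constant schedule passed to the SGD optimizer rather than hand-rolled as above.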