The Unreasonable Effectiveness of Patches in Deep Convolutional Kernels Methods

Authors: Louis Thiry, Michael Arbel, Eugene Belilovsky, Edouard Oyallon

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We train shallow classifiers, i.e. a linear classifier and a 1-hidden-layer CNN (1-layer), on top of our representation Φ on two major image classification datasets, CIFAR-10 and ImageNet, which consist respectively of 50k small and 1.2M large color images divided respectively into 10 and 1k classes. For training, we systematically used mini-batch SGD with momentum of 0.9, no weight decay and the cross-entropy loss.
Researcher Affiliation | Academia | Louis Thiry, Département d'Informatique de l'ENS, ENS, CNRS, PSL University, Paris, France, louis.thiry@ens.fr; Michael Arbel, Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom, michael.n.arbel@gmail.com; Eugene Belilovsky, Concordia University and Mila, Montreal, Canada, eugene.belilovsky@concordia.ca; Edouard Oyallon, CNRS, LIP6, Sorbonne University, Paris, France, edouard.oyallon@lip6.fr
Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found.
Open Source Code | Yes | Our code as well as commands to reproduce our results are available here: https://github.com/louity/patches.
Open Datasets | Yes | We train shallow classifiers, i.e. a linear classifier and a 1-hidden-layer CNN (1-layer), on top of our representation Φ on two major image classification datasets, CIFAR-10 and ImageNet, which consist respectively of 50k small and 1.2M large color images divided respectively into 10 and 1k classes.
Dataset Splits | No | While the paper mentions training and testing on CIFAR-10 and ImageNet, it does not explicitly provide specific training/validation/test splits (e.g., percentages or exact counts for each split) or refer to a standard split that includes a validation set. It only states the total dataset sizes.
Hardware Specification | No | The paper mentions general hardware support, such as a "GPU donation from NVIDIA" and "HPC resources of IDRIS", but does not specify exact GPU models (e.g., NVIDIA A100, Tesla V100) or detailed CPU/cluster specifications used for running the experiments.
Software Dependencies | No | The paper describes the methods and techniques used (e.g., mini-batch SGD, cross-entropy loss, batch normalization) but does not provide specific software package names with version numbers (e.g., "PyTorch 1.9", "Python 3.8").
Experiment Setup | Yes | For training, we systematically used mini-batch SGD with momentum of 0.9, no weight decay and the cross-entropy loss. The classifier is trained for 175 epochs with a learning rate decay of 0.1 at epochs 100 and 150. The initial learning rate is 0.003 for |D| = 2k and 0.001 for larger |D|. For the linear classification experiments, we used an average pooling of size k1 = 5 and stride s1 = 3, k2 = 1 and c2 = 128 for the first convolutional operator, and k3 = 6 for the second one. We set the patch size to P = 6 and the whitening regularization to λ = 10^-2. The parameters of the linear convolutional classifier are chosen to be: k1 = 10, s1 = 6, k2 = 1, c2 = 256, k3 = 7.
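The optimization details quoted in the Experiment Setup row map directly onto a standard training loop. Below is a minimal PyTorch-style sketch of that recipe, assuming the patch representation Φ has already been computed and cached as feature vectors; the feature dimension (2048), the number of samples, the batch size, and names such as `classifier` and `representation_loader` are illustrative placeholders, not taken from the paper or its repository.

```python
# Sketch of the reported training recipe: mini-batch SGD with momentum 0.9,
# no weight decay, cross-entropy loss, 175 epochs, learning-rate decay of 0.1
# at epochs 100 and 150, initial learning rate 0.003 (the |D| = 2k setting).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data standing in for the precomputed patch representation Phi(x):
# the 2048-dim features, 512 samples, and 10 classes are assumptions for the sketch.
features = torch.randn(512, 2048)
labels = torch.randint(0, 10, (512,))
representation_loader = DataLoader(TensorDataset(features, labels),
                                   batch_size=128, shuffle=True)

classifier = nn.Linear(2048, 10)  # stand-in for the linear classifier trained on Phi
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.003,
                            momentum=0.9, weight_decay=0.0)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[100, 150], gamma=0.1)

for epoch in range(175):
    for x, y in representation_loader:
        optimizer.zero_grad()
        loss = criterion(classifier(x), y)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

The patch-extraction and whitening pipeline itself (patch size P = 6, whitening regularization λ = 10^-2) and the convolutional classifier variants are not reproduced here; see the linked repository for the authors' implementation.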