Approximating CNNs with Bag-of-local-Features models works surprisingly well on ImageNet
Authors: Wieland Brendel, Matthias Bethge
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our model, a simple variant of the ResNet-50 architecture called BagNet, classifies an image based on the occurrences of small local image features without taking into account their spatial ordering. This strategy is closely related to the bag-of-feature (BoF) models popular before the onset of deep learning and reaches a surprisingly high accuracy on ImageNet (87.6% top-5 for 33 × 33 px features and AlexNet performance for 17 × 17 px features). The constraint on local features makes it straightforward to analyse how exactly each part of the image influences the classification. Furthermore, the BagNets behave similarly to state-of-the-art deep neural networks such as VGG-16, ResNet-152 or DenseNet-169 in terms of feature sensitivity, error distribution and interactions between image parts. (See the architecture sketch below the table.) |
| Researcher Affiliation | Academia | Wieland Brendel and Matthias Bethge; Eberhard Karls University of Tübingen, Germany; Werner Reichardt Centre for Integrative Neuroscience, Tübingen, Germany; Bernstein Center for Computational Neuroscience, Tübingen, Germany; {wieland.brendel, matthias.bethge}@bethgelab.org |
| Pseudocode | No | The paper includes a figure detailing the network architecture but no formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | We released the pretrained BagNets (BagNet-9, BagNet-17 and BagNet-33) for PyTorch and Keras at https://github.com/wielandbrendel/bag-of-local-features-models. |
| Open Datasets | Yes | We train the BagNets directly on ImageNet (see Appendix for details). |
| Dataset Splits | Yes | Training of the models was performed in PyTorch using the default ImageNet training script of Torchvision (https://github.com/pytorch/vision, commit 8a4786a) with default parameters. In brief, we used SGD with momentum (0.9), a batch size of 256 and an initial learning rate of 0.01, which we decreased by a factor of 10 every 30 epochs. Images were resized to 256 pixels (shortest side), after which we extracted a random crop of size 224 × 224 pixels. |
| Hardware Specification | No | The paper does not specify the hardware used for running the experiments (e.g., GPU models, CPU types). |
| Software Dependencies | No | The paper mentions PyTorch and Keras and refers to a specific Torchvision commit, but does not specify version numbers for these key software components. |
| Experiment Setup | Yes | In brief, we used SGD with momentum (0.9), a batch size of 256 and an initial learning rate of 0.01, which we decreased by a factor of 10 every 30 epochs. (See the training sketch below the table.) |
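
The quoted abstract describes the core idea compactly: class evidence is computed from small local patches and then averaged with no regard for where each patch sits. Below is a minimal PyTorch sketch of that idea. The class name, layer widths and stride are hypothetical stand-ins; the paper's actual BagNets are ResNet-50 variants with receptive fields limited to 9 × 9, 17 × 17 or 33 × 33 pixels, not this two-layer encoder.

```python
import torch
import torch.nn as nn

class BagOfLocalFeatures(nn.Module):
    """Toy bag-of-local-features classifier: per-patch class logits,
    averaged over all spatial positions so spatial ordering is lost."""

    def __init__(self, num_classes: int = 1000, patch_size: int = 33):
        super().__init__()
        # Stand-in local encoder: its receptive field is a single patch.
        # The paper instead derives this from ResNet-50.
        self.local_encoder = nn.Sequential(
            nn.Conv2d(3, 256, kernel_size=patch_size, stride=8),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 2048, kernel_size=1),
            nn.ReLU(inplace=True),
        )
        # A 1x1 convolution turns each local feature into class evidence.
        self.local_classifier = nn.Conv2d(2048, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits_per_patch = self.local_classifier(self.local_encoder(x))  # (B, C, H', W')
        # The "bag" step: average class evidence over space, discarding
        # where each local feature occurred.
        return logits_per_patch.mean(dim=(2, 3))  # (B, C)

model = BagOfLocalFeatures()
print(model(torch.randn(2, 3, 224, 224)).shape)  # torch.Size([2, 1000])
```

The actual pretrained BagNets are available from the repository linked under "Open Source Code"; its README documents the PyTorch and Keras entry points.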
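
The training recipe quoted under "Dataset Splits" and "Experiment Setup" maps directly onto standard PyTorch/torchvision components. Here is a sketch under those stated hyperparameters; the dataset path, the stand-in ResNet-50 model, the epoch count and the normalization constants are assumptions, not part of the quote.

```python
import torch
from torch import nn, optim
from torchvision import datasets, models, transforms

# Preprocessing as quoted: shortest side resized to 256, random 224x224 crop.
# The normalization constants are the standard ImageNet statistics used by
# torchvision's training pipeline (not stated in the quote).
train_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.RandomCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# "/path/to/imagenet/train" is a placeholder for a local ImageNet copy.
train_set = datasets.ImageFolder("/path/to/imagenet/train", transform=train_transform)
loader = torch.utils.data.DataLoader(train_set, batch_size=256, shuffle=True, num_workers=8)

# Placeholder model: a plain ResNet-50 stands in for the BagNet variants,
# which are not part of torchvision.
model = models.resnet50()
criterion = nn.CrossEntropyLoss()

# SGD with momentum 0.9 and initial learning rate 0.01, as quoted;
# the learning rate is divided by 10 every 30 epochs.
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(90):  # epoch count is an assumption; the quote does not state it
    for images, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()
```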