Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks

Authors: Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Andrea Vedaldi

NeurIPS 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiments on several datasets show that gather-excite can bring benefits comparable to increasing the depth of a CNN at a fraction of the cost. |
| Researcher Affiliation | Collaboration | Jie Hu (Momenta, hujie@momenta.ai); Li Shen (Visual Geometry Group, University of Oxford, lishen@robots.ox.ac.uk); Samuel Albanie (Visual Geometry Group, University of Oxford, albanie@robots.ox.ac.uk); Gang Sun (Momenta, sungang@momenta.ai); Andrea Vedaldi (Visual Geometry Group, University of Oxford, vedaldi@robots.ox.ac.uk) |
| Pseudocode | No | The paper includes diagrams illustrating the gather-excite operator (Fig. 1), but no formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code for all models used in this work is publicly available at https://github.com/hujie-frank/GENet. |
| Open Datasets | Yes | "To compare the utility of each design, we conduct a series of experiments on the task of image classification using the ImageNet 1K dataset [33]... We conduct additional experiments on the CIFAR-10 and CIFAR-100 image classification benchmarks [19]... For this purpose, we train an object detector on MS COCO [25]." |
| Dataset Splits | Yes | "The dataset contains 1.2 million training images and 50k validation images. In the experiments that follow, all models are trained on the training set and evaluated on the validation set. ... Each contains 50k train images and 10k test images. ... MS COCO [25], a dataset which has approximately 80k training images and 40k validation images (we use the train-val splits provided in the 2014 release)." |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, such as deep learning frameworks (e.g., TensorFlow, PyTorch) or other libraries. |
| Experiment Setup | Yes | "These models are trained from random initialisation [10] using SGD with momentum 0.9 with minibatches of 256 images, each cropped to 224 × 224 pixels. The initial learning rate is set to 0.1 and is reduced by a factor of 10 each time the loss plateaus (three times). Models typically train for approximately 300 epochs in total." |
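The quoted experiment setup pins down the optimisation schedule precisely enough to sketch it. The following is a minimal illustration of that schedule only, not code from the authors' released repository; the function and dictionary names are our own.

```python
def learning_rate(initial_lr=0.1, plateaus_reached=0):
    """Learning rate after a given number of loss plateaus.

    Mirrors the schedule quoted above: start at 0.1 and divide by 10
    each time the training loss plateaus (this happens three times
    over roughly 300 epochs of training).
    """
    return initial_lr / (10 ** plateaus_reached)


# Hyperparameters as quoted in the paper (names here are illustrative).
config = {
    "optimizer": "SGD",
    "momentum": 0.9,
    "batch_size": 256,
    "crop_size": (224, 224),
    "initial_lr": 0.1,
    "total_epochs": 300,  # "approximately 300 epochs in total"
}

# Learning rate after 0, 1, 2, and 3 plateaus: 0.1, 0.01, 0.001, 0.0001.
schedule = [learning_rate(config["initial_lr"], n) for n in range(4)]
```

In a modern framework this step-on-plateau behaviour would typically be delegated to a built-in scheduler (e.g. a reduce-on-plateau policy with factor 0.1), but the paper does not state which framework or scheduler implementation was used.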