Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks
Authors: Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Andrea Vedaldi
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on several datasets show that gather-excite can bring benefits comparable to increasing the depth of a CNN at a fraction of the cost. |
| Researcher Affiliation | Collaboration | Jie Hu Momenta hujie@momenta.ai; Li Shen Visual Geometry Group University of Oxford lishen@robots.ox.ac.uk; Samuel Albanie Visual Geometry Group University of Oxford albanie@robots.ox.ac.uk; Gang Sun Momenta sungang@momenta.ai; Andrea Vedaldi Visual Geometry Group University of Oxford vedaldi@robots.ox.ac.uk |
| Pseudocode | No | The paper includes diagrams illustrating the gather-excite operator (Fig. 1), but no formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code for all models used in this work is publicly available at https://github.com/hujie-frank/GENet. |
| Open Datasets | Yes | To compare the utility of each design, we conduct a series of experiments on the task of image classification using the ImageNet 1K dataset [33]... We conduct additional experiments on the CIFAR-10 and CIFAR-100 image classification benchmarks [19]... For this purpose, we train an object detector on MS COCO [25]. |
| Dataset Splits | Yes | The dataset contains 1.2 million training images and 50k validation images. In the experiments that follow, all models are trained on the training set and evaluated on the validation set. ... Each contains 50k train images and 10k test images. ... MS COCO [25], a dataset which has approximately 80k training images and 40k validation images (we use the train-val splits provided in the 2014 release). |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, such as deep learning frameworks (e.g., TensorFlow, PyTorch) or other libraries. |
| Experiment Setup | Yes | These models are trained from random initialisation [10] using SGD with momentum 0.9 with minibatches of 256 images, each cropped to 224×224 pixels. The initial learning rate is set to 0.1 and is reduced by a factor of 10 each time the loss plateaus (three times). Models typically train for approximately 300 epochs in total. |
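Since the paper itself provides no pseudocode for the gather-excite operator (see the Pseudocode row above), the following is a minimal, hedged sketch of the parameter-free variant (global-extent gather via average pooling, excite via a sigmoid gate) in NumPy. The function name and shapes are illustrative assumptions, not the authors' implementation; their reference code is at the GitHub link above.

```python
import numpy as np

def gather_excite(x):
    """Illustrative sketch of a parameter-free gather-excite step.

    x: feature map of shape (C, H, W).
    Gather: aggregate spatial context per channel (global average pooling).
    Excite: squash the context to (0, 1) with a sigmoid and rescale the input.
    """
    # Gather: per-channel spatial context, shape (C, 1, 1)
    context = x.mean(axis=(1, 2), keepdims=True)
    # Excite: sigmoid gate, broadcast back over the spatial dimensions
    gate = 1.0 / (1.0 + np.exp(-context))
    return x * gate

# Usage: a random 4-channel 8x8 feature map keeps its shape after gating
x = np.random.randn(4, 8, 8)
y = gather_excite(x)
print(y.shape)  # (4, 8, 8)
```

Because the gate lies in (0, 1), the operator only rescales feature magnitudes per channel; the parameterised variants in the paper replace the fixed pooling with learned aggregation.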