Recurrent Models of Visual Attention

Authors: Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu

NeurIPS 2014

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We evaluate our model on several image classification tasks, where it significantly outperforms a convolutional neural network baseline on cluttered images, and on a dynamic visual control problem, where it learns to track a simple object without an explicit training signal for doing so." |
| Researcher Affiliation | Industry | Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu; Google DeepMind; {vmnih,heess,gravesa,korayk}@google.com |
| Pseudocode | No | The paper describes the model architecture and training procedure in text and diagrams but does not include explicit pseudocode or algorithm blocks (a hedged sketch of the described training loop follows this table). |
| Open Source Code | No | The paper does not provide a link or statement about open-sourcing the code for the described methodology. |
| Open Datasets | Yes | "We first tested the ability of our training method to learn successful glimpse policies by using it to train RAM models with up to 7 glimpses on the MNIST digits dataset." |
| Dataset Splits | No | The paper mentions using a 'test set' but does not specify explicit training, validation, or test split percentages or counts for the datasets used. |
| Hardware Specification | No | The paper mentions training on 'multiple GPUs' but does not provide specific hardware details such as GPU models, CPU specifications, or memory. |
| Software Dependencies | No | The paper describes software components conceptually (e.g., 'stochastic gradient descent') but does not name software packages with version numbers. |
| Experiment Setup | Yes | "All methods were trained using stochastic gradient descent with minibatches of size 20 and momentum of 0.9. We annealed the learning rate linearly from its initial value to 0 over the course of training. Hyperparameters such as the initial learning rate and the variance of the location policy were selected using random search [3]." (A configuration sketch appears after the training-loop sketch below.) |
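Since the paper contains no algorithm block, the following is a minimal PyTorch sketch of the training loop it describes in prose: a glimpse network encodes a retina-like patch together with the location it came from, a core recurrent network integrates glimpses over time, a Gaussian location policy samples where to look next, and training combines cross-entropy with REINFORCE. All module names, dimensions, and the GRU core are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RAM(nn.Module):
    """Glimpse network -> core RNN -> stochastic location policy + classifier."""
    def __init__(self, glimpse_dim=128, hidden_dim=256, n_classes=10):
        super().__init__()
        self.hidden_dim = hidden_dim
        # Glimpse net: encodes an 8x8 patch together with its location.
        self.glimpse_fc = nn.Linear(8 * 8 + 2, glimpse_dim)
        # Core recurrent network (the paper uses a plain RNN with rectified
        # units for classification; a GRUCell is substituted here for brevity).
        self.core = nn.GRUCell(glimpse_dim, hidden_dim)
        self.loc_head = nn.Linear(hidden_dim, 2)   # mean of the Gaussian policy
        self.cls_head = nn.Linear(hidden_dim, n_classes)

    def forward(self, patches, loc_std=0.1):
        # patches: (B, T, 8, 8) pre-extracted glimpse patches; the real model
        # crops multi-resolution patches from the image at each sampled l_t.
        B, T = patches.shape[:2]
        h = patches.new_zeros(B, self.hidden_dim)
        loc = patches.new_zeros(B, 2)
        log_probs = []
        for t in range(T):
            g = torch.relu(self.glimpse_fc(
                torch.cat([patches[:, t].flatten(1), loc], dim=1)))
            h = self.core(g, h)
            mean = torch.tanh(self.loc_head(h))
            dist = torch.distributions.Normal(mean, loc_std)
            loc = dist.sample()                    # stochastic next location
            log_probs.append(dist.log_prob(loc).sum(dim=1))
        return self.cls_head(h), torch.stack(log_probs, dim=1)

def loss_fn(logits, log_probs, labels):
    # Hybrid objective: cross-entropy on the final classification plus
    # REINFORCE on the location policy, with reward 1 for a correct
    # prediction and 0 otherwise. (The paper also subtracts a learned
    # baseline from the reward to reduce variance; omitted here.)
    ce = F.cross_entropy(logits, labels)
    reward = (logits.argmax(dim=1) == labels).float().unsqueeze(1)
    reinforce = -(log_probs * reward).sum(dim=1).mean()
    return ce + reinforce
```

The key design point this sketch preserves is that the location samples are non-differentiable, so gradients for the policy come only from the REINFORCE term, while the classifier is trained with an ordinary supervised loss.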
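The quoted setup in the last row translates into a short configuration sketch, assuming PyTorch and the `RAM`/`loss_fn` names from the sketch above. The random-search ranges, total step count, and data loader are assumptions: the paper states that the initial learning rate and location-policy variance were chosen by random search but does not report the selected values.

```python
import random
import torch

# Draw hyperparameters by random search, as the paper states; these
# ranges are assumptions, since the selected values are not reported.
init_lr = 10 ** random.uniform(-4, -1)
loc_std = random.uniform(0.05, 0.5)       # std of the location policy
total_steps = 100_000                     # assumed; not reported

model = RAM()                             # model sketched above
opt = torch.optim.SGD(model.parameters(), lr=init_lr, momentum=0.9)
# Anneal the learning rate linearly from init_lr to 0 over training.
sched = torch.optim.lr_scheduler.LambdaLR(
    opt, lambda step: max(0.0, 1.0 - step / total_steps))

# Training loop over minibatches of size 20 (loader is assumed):
# for patches, labels in loader:
#     logits, log_probs = model(patches, loc_std=loc_std)
#     loss = loss_fn(logits, log_probs, labels)
#     opt.zero_grad(); loss.backward(); opt.step(); sched.step()
```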