Multiple Object Recognition with Visual Attention
Authors: Jimmy Ba, Volodymyr Mnih, and Koray Kavukcuoglu
ICLR 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the model on the challenging task of transcribing house number sequences from Google Street View images and show that it is both more accurate than the state-of-the-art convolutional networks and uses fewer parameters and less computation. |
| Researcher Affiliation | Collaboration | Jimmy Lei Ba University of Toronto jimmy@psi.utoronto.ca Volodymyr Mnih Google Deep Mind vmnih@google.com Koray Kavukcuoglu Google Deep Mind korayk@google.com |
| Pseudocode | No | The paper describes the model and learning process using mathematical equations and textual descriptions, but it does not provide formal pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or a direct link to the open-source code for the described methodology. |
| Open Datasets | Yes | The publicly available multi-digit street view house number (SVHN) dataset Netzer et al. (2011) consists of images of digits taken from pictures of house fronts. ... The models are trained using the remaining 200,000 training images. |
| Dataset Splits | Yes | Following Goodfellow et al. (2013), we formed a validation set of 5000 images by randomly sampling images from the training set and the extra set, and these were used for selecting the learning rate and sampling variance for the stochastic glimpse policy. |
| Hardware Specification | No | The paper mentions training on 'a GPU' but does not provide specific hardware details such as GPU model, CPU type, or memory specifications. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., library names like TensorFlow or PyTorch with their respective versions). |
| Experiment Setup | Yes | We optimized the model parameters using stochastic gradient descent with the Nesterov momentum technique. A mini-batch size of 128 was used to estimate the gradient direction. The momentum coefficient was set to 0.9 throughout the training. The learning rate η scheduling was applied in training to improve the convergence of the learning process. η starts at 0.01 in the first epoch and was exponentially reduced by a factor of 0.97 after each epoch. |
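The optimizer settings quoted above (Nesterov momentum 0.9, mini-batch size 128, learning rate starting at 0.01 and multiplied by 0.97 after each epoch) can be sketched in plain Python. This is an illustrative reconstruction, not the authors' code; the quadratic toy objective and the helper names are assumptions for the demo.

```python
def learning_rate(epoch, eta0=0.01, decay=0.97):
    """Exponential schedule from the paper: eta starts at `eta0`
    in the first epoch and is reduced by a factor of `decay` after each epoch."""
    return eta0 * decay ** epoch

def nesterov_sgd_step(w, v, grad_fn, lr, momentum=0.9):
    """One SGD step with Nesterov momentum on a scalar parameter.
    The gradient is evaluated at the look-ahead point w + momentum * v."""
    g = grad_fn(w + momentum * v)
    v = momentum * v - lr * g
    return w + v, v

# Toy demo (assumption: a simple quadratic stands in for the real loss):
# minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
grad = lambda w: 2.0 * (w - 3.0)
w, v = 0.0, 0.0
steps_per_epoch = 50  # stand-in for one pass over the mini-batches
for step in range(300):
    lr = learning_rate(step // steps_per_epoch)
    w, v = nesterov_sgd_step(w, v, grad, lr)
final_w = w
```

In the paper this schedule and momentum configuration are applied to mini-batch gradients (batch size 128) of the full attention model rather than to a scalar toy problem.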