Learning Wake-Sleep Recurrent Attention Models

Authors: Jimmy Ba, Russ R. Salakhutdinov, Roger B. Grosse, Brendan J. Frey

NeurIPS 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To measure the effectiveness of the proposed WS-RAM method, we first investigated a toy classification task involving a variant of the MNIST handwritten digits dataset [25] where transformations were applied to the images. We then evaluated the proposed method on a substantially more difficult image caption generation task using the Flickr8k [26] dataset."
Researcher Affiliation | Academia | Jimmy Ba (University of Toronto, jimmy@psi.toronto.edu); Roger Grosse (University of Toronto, rgrosse@cs.toronto.edu); Ruslan Salakhutdinov (University of Toronto, rsalakhu@cs.toronto.edu); Brendan Frey (University of Toronto, frey@psi.toronto.edu)
Pseudocode | No | The paper describes algorithms and derivations mathematically but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks or figures.
Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the methodology described in the paper, nor does it include links to a code repository.
Open Datasets | Yes | "We generated a dataset of randomly translated and scaled handwritten digits from the MNIST dataset [25]" and "We report results on the widely-used Flickr8k dataset. The training/valid/test split followed the same protocol as used in previous work [28]." Both MNIST and Flickr8k are well-known public datasets. (A hedged sketch of the MNIST transformation follows the table.)
Dataset Splits | Yes | "We report results on the widely-used Flickr8k dataset. The training/valid/test split followed the same protocol as used in previous work [28]."
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running experiments, such as GPU or CPU models, or cloud computing specifications.
Software Dependencies | No | The paper mentions that "All networks were trained using Adam [27]," but it does not name the software libraries, frameworks, or version numbers used for the implementation or experiments.
Experiment Setup | No | The paper describes general aspects of the experimental setup, such as using "ReLU units" and training with "Adam," and mentions that "the learning rate set to the highest value that allowed the model to successfully converge," but it does not provide specific numerical values for hyperparameters such as the exact learning rate, batch size, or number of epochs. (A hedged learning-rate selection sketch follows the table.)
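
The Open Datasets row quotes the paper's construction of "randomly translated and scaled handwritten digits from the MNIST dataset." The quoted text does not specify the canvas size, scale range, or resampling method, so the following is only an illustrative sketch: the 100x100 canvas and the 0.5x-2.0x scale range are assumptions, and the input digits can come from any standard MNIST loader.

```python
# Illustrative sketch, NOT the authors' code: random translation and scaling
# of MNIST digits. Canvas size and scale range are assumptions; the paper's
# quoted text does not specify them.
import numpy as np
from scipy.ndimage import zoom


def translate_and_scale(digit, canvas_size=100, scale_range=(0.5, 2.0), rng=None):
    """Place one 28x28 MNIST digit at a random scale and position on a
    larger blank canvas."""
    if rng is None:
        rng = np.random.default_rng()
    scale = rng.uniform(*scale_range)
    scaled = zoom(digit, scale, order=1)       # bilinear resampling
    h = min(scaled.shape[0], canvas_size)      # crop if the digit outgrows
    w = min(scaled.shape[1], canvas_size)      # the canvas
    scaled = scaled[:h, :w]
    canvas = np.zeros((canvas_size, canvas_size), dtype=scaled.dtype)
    top = int(rng.integers(0, canvas_size - h + 1))   # random translation
    left = int(rng.integers(0, canvas_size - w + 1))
    canvas[top:top + h, left:left + w] = scaled
    return canvas


# Usage: `digits` is any (N, 28, 28) array from an MNIST loader.
# dataset = np.stack([translate_and_scale(d) for d in digits])
```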
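
The Experiment Setup row notes that the learning rate was "set to the highest value that allowed the model to successfully converge," without reporting that value. A minimal sketch of such a selection procedure, assuming PyTorch's Adam optimizer, hypothetical `build_model` and `train_epochs` helpers, and an assumed candidate grid, might look like:

```python
# Hedged sketch of the selection rule described in the paper's quoted text.
# `build_model`, `train_epochs`, and the candidate grid are hypothetical
# placeholders; the paper reports no concrete values.
import math
import torch


def highest_converging_lr(build_model, train_epochs,
                          candidate_lrs=(1e-2, 3e-3, 1e-3, 3e-4, 1e-4)):
    """Try candidate learning rates from largest to smallest and return the
    first one whose training loss stays finite and decreases overall."""
    for lr in sorted(candidate_lrs, reverse=True):
        model = build_model()                  # fresh weights for each trial
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        losses = train_epochs(model, opt)      # list of per-epoch losses
        if math.isfinite(losses[-1]) and losses[-1] < losses[0]:
            return lr                          # highest lr that converged
    raise RuntimeError("no candidate learning rate converged")
```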