Learning Wake-Sleep Recurrent Attention Models
Authors: Jimmy Ba, Russ R. Salakhutdinov, Roger B. Grosse, Brendan J. Frey
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To measure the effectiveness of the proposed WS-RAM method, we first investigated a toy classification task involving a variant of the MNIST handwritten digits dataset [25] where transformations were applied to the images. We then evaluated the proposed method on a substantially more difficult image caption generation task using the Flickr8k [26] dataset. |
| Researcher Affiliation | Academia | Jimmy Ba, University of Toronto, jimmy@psi.toronto.edu; Roger Grosse, University of Toronto, rgrosse@cs.toronto.edu; Ruslan Salakhutdinov, University of Toronto, rsalakhu@cs.toronto.edu; Brendan Frey, University of Toronto, frey@psi.toronto.edu |
| Pseudocode | No | The paper describes algorithms and derivations mathematically but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks or figures. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the methodology described in the paper, nor does it include links to a code repository. |
| Open Datasets | Yes | "We generated a dataset of randomly translated and scaled handwritten digits from the MNIST dataset [25]" and "We report results on the widely-used Flickr8k dataset. The training/valid/test split followed the same protocol as used in previous work [28]." Both MNIST and Flickr8k are well-known public datasets (a hedged sketch of the MNIST transformation appears after this table). |
| Dataset Splits | Yes | We report results on the widely-used Flickr8k dataset. The training/valid/test split followed the same protocol as used in previous work [28]. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running experiments, such as GPU or CPU models, or cloud computing specifications. |
| Software Dependencies | No | The paper mentions that "All networks were trained using Adam [27]," but it does not name the software libraries or frameworks used for the implementation, nor any version numbers (an illustrative Adam update appears after this table). |
| Experiment Setup | No | The paper describes general aspects of the experimental setup, such as using "ReLU units," training with Adam, and setting the learning rate "to the highest value that allowed the model to successfully converge," but it does not report specific numerical values for hyperparameters such as the learning rate, batch size, or number of epochs. |
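
The toy task quoted above applies random translation and scaling to MNIST digits. Since the paper does not specify the canvas size or scale range, the following is a minimal sketch of one plausible way to generate such a variant; the 60x60 canvas and the 0.7-1.3 scale range are illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy.ndimage import zoom

def translate_and_scale(digit, out_size=60, rng=np.random):
    """Place a randomly rescaled 28x28 MNIST digit at a random location
    on a larger blank canvas. Canvas size and scale range are assumed,
    not taken from the paper."""
    scale = rng.uniform(0.7, 1.3)             # hypothetical scale range
    scaled = zoom(digit, scale, order=1)      # bilinear resampling
    h, w = scaled.shape
    canvas = np.zeros((out_size, out_size), dtype=scaled.dtype)
    top = rng.randint(0, out_size - h + 1)    # random vertical offset
    left = rng.randint(0, out_size - w + 1)   # random horizontal offset
    canvas[top:top + h, left:left + w] = scaled
    return canvas
```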
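
The paper identifies Adam [27] as the optimizer but gives no library or hyperparameter details. For reference, this is the standard Adam update rule from Kingma and Ba; the hyperparameter defaults shown are those of the original Adam paper, not values reported by Ba et al.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam parameter update (Kingma & Ba, 2015). The defaults here
    come from the Adam paper, not from the WS-RAM experiments."""
    m = beta1 * m + (1 - beta1) * grad        # biased first-moment estimate
    v = beta2 * v + (1 - beta2) * grad**2     # biased second-moment estimate
    m_hat = m / (1 - beta1**t)                # bias-corrected first moment
    v_hat = v / (1 - beta2**t)                # bias-corrected second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v
```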