Learning Generative Models with Visual Attention

Authors: Charlie Tang, Nitish Srivastava, Ruslan Salakhutdinov

NeurIPS 2014

Reproducibility assessment: each variable below lists the assessed result, followed by the supporting excerpt or explanation from the reviewed paper.

Research Type: Experimental
"We used two face datasets in our experiments. The first dataset is a frontal face dataset, called the Caltech Faces from 1999... We also used the CMU Multi-PIE dataset [30]... Fig. 5 shows the quantitative results of Intersection over Union (IOU) of the ground truth face box and the inferred face box... Table 1: Face localization accuracy... Table 2 shows the estimates of the variational lower-bounds on the average log-density..."
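
The localization results above are scored by Intersection over Union between the ground-truth and inferred face boxes. As a reference for reproduction, here is a minimal sketch of how such a score is typically computed; the (x1, y1, x2, y2) box format and the `iou` helper name are assumptions, since the paper provides no code.

```python
# Minimal IoU sketch for two axis-aligned boxes given as (x1, y1, x2, y2).
# Box format and function name are illustrative assumptions, not from the paper.

def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Overlap rectangle; width/height clamp to zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Example: ground-truth box vs. an inferred box shifted by 20 pixels.
print(iou((10, 10, 110, 110), (30, 30, 130, 130)))  # ~0.47
```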
Researcher Affiliation: Academia
"Department of Computer Science, University of Toronto, Toronto, Ontario, Canada. {tang,nitish,rsalakhu}@cs.toronto.edu"
Pseudocode: No
The paper presents no explicitly labeled pseudocode or algorithm block.
Open Source Code: No
The paper provides no statement or link releasing source code for the described method.
Open Datasets: Yes
"We used two face datasets in our experiments. The first dataset is a frontal face dataset, called the Caltech Faces from 1999, collected by Markus Weber. ... We also used the CMU Multi-PIE dataset [30], which contains 337 subjects, captured under 15 viewpoints and 19 illumination conditions in four recording sessions for a total of more than 750,000 images."
Dataset Splits: Yes
"We split the Caltech dataset into a training and a validation set. For the CMU faces, we first took 10% of the images as training cases for the Conv Net for approximate inference. The remaining 90% of the CMU faces are split into a training and validation set."
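
The quoted split can be mirrored in a few lines. The sketch below assumes image paths as input; the 80/20 ratio used to divide the remaining 90% into training and validation sets, and the fixed shuffling seed, are assumptions the paper does not specify.

```python
# Hedged sketch of the CMU Multi-PIE split described above: 10% of the
# images train the ConvNet used for approximate inference, and the
# remaining 90% is divided into training and validation sets. The 80/20
# ratio inside the remainder and the seed are assumptions.
import random

def split_cmu(image_paths, seed=0):
    rng = random.Random(seed)
    paths = list(image_paths)
    rng.shuffle(paths)

    n_convnet = int(0.10 * len(paths))    # 10% for the inference ConvNet
    convnet_train = paths[:n_convnet]
    remainder = paths[n_convnet:]         # the remaining 90%

    n_train = int(0.80 * len(remainder))  # assumed train/validation ratio
    return convnet_train, remainder[:n_train], remainder[n_train:]
```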
Hardware Specification: No
The paper does not specify the hardware used for the experiments, such as GPU/CPU models or memory.

Software Dependencies: No
The paper does not list software dependencies or their version numbers.

Experiment Setup: No
The main text does not report specific hyperparameter values or detailed training configurations.