Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Data Generation as Sequential Decision Making

Authors: Philip Bachman, Doina Precup

NeurIPS 2015 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We tested the performance of our sequential imputation models on three datasets: MNIST (28x28), SVHN (cropped, 32x32) [13], and TFD (48x48) [17]. We measured the imputation log-likelihood log q(x^u | c^u_T) using the true missing values x^u and the model's guesses given by σ(c^u_T). We report negative log-likelihoods, so lower scores are better in all of our tests. Fig. 2 and Tab. 1 present quantitative results from these tests.
Researcher Affiliation | Academia | Philip Bachman, McGill University, School of Computer Science, EMAIL; Doina Precup, McGill University, School of Computer Science, EMAIL
Pseudocode | Yes | The supplementary material provides pseudo-code and an illustration for this model.
Open Source Code | Yes | Full implementations and test code are available at http://github.com/Philip-Bachman/Sequential-Generation.
Open Datasets | Yes | We tested the performance of our sequential imputation models on three datasets: MNIST (28x28), SVHN (cropped, 32x32) [13], and TFD (48x48) [17].
Dataset Splits | No | The paper describes data masking strategies and mentions using a 'held-out test set' but does not specify train/validation/test splits with percentages or sample counts for the datasets.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory) used for running the experiments.
Software Dependencies | No | The paper mentions no specific software dependencies or versions (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | Except where noted, the GPSI models used 6 refinement steps and the LSTM models used 16. We tested imputation under two types of data masking: missing completely at random (MCAR) and missing at random (MAR). In MCAR, we masked pixels uniformly at random from the source images, and indicate removal of d% of the pixels by MCAR-d. In MAR, we masked square regions, with the occlusions located uniformly at random within the borders of the source image.
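The two masking schemes quoted in the Experiment Setup row (MCAR-d and MAR square occlusions) could be sketched as follows. This is an illustrative sketch, not the paper's implementation; the function names, the seed parameter, and the list-of-lists mask representation (1 = visible, 0 = hidden) are all assumptions.

```python
import random


def mcar_mask(height, width, d, seed=0):
    """Missing-completely-at-random: hide d% of pixels, chosen uniformly.

    Illustrative sketch (not from the paper). Returns a height x width
    grid where 0 marks a masked (hidden) pixel and 1 a visible one.
    """
    rng = random.Random(seed)
    n_pixels = height * width
    n_masked = int(n_pixels * d / 100)
    hidden = set(rng.sample(range(n_pixels), n_masked))
    return [[0 if i * width + j in hidden else 1 for j in range(width)]
            for i in range(height)]


def mar_mask(height, width, size, seed=0):
    """Missing-at-random: hide one size x size square placed uniformly
    at random within the image borders (illustrative sketch)."""
    rng = random.Random(seed)
    top = rng.randrange(height - size + 1)
    left = rng.randrange(width - size + 1)
    return [[0 if top <= i < top + size and left <= j < left + size else 1
             for j in range(width)]
            for i in range(height)]
```

For example, `mcar_mask(28, 28, 60)` would hide 60% of a 28x28 MNIST-sized image (MCAR-60), while `mar_mask(28, 28, 14)` would occlude a single 14x14 square.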
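The imputation score quoted in the Research Type row, log q(x^u | c^u_T), evaluates the true missing pixels x^u under the model's guesses σ(c^u_T), with lower negative log-likelihood being better. A minimal sketch under the assumption that the sigmoid outputs are treated as per-pixel Bernoulli means (the function name and clipping constant are illustrative, not from the paper):

```python
import math


def bernoulli_nll(x_true, x_guess, eps=1e-7):
    """Negative log-likelihood -log q(x | guess) for binary pixels,
    treating each guess as a Bernoulli mean (illustrative sketch)."""
    nll = 0.0
    for x, p in zip(x_true, x_guess):
        p = min(max(p, eps), 1.0 - eps)  # clip away from 0/1 for stability
        nll -= x * math.log(p) + (1.0 - x) * math.log(1.0 - p)
    return nll
```

An uninformative guess of 0.5 for every pixel costs log 2 nats per pixel, so `bernoulli_nll([1, 0], [0.5, 0.5])` evaluates to 2·log 2 ≈ 1.386.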