Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Data Generation as Sequential Decision Making
Authors: Philip Bachman, Doina Precup
NeurIPS 2015 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We tested the performance of our sequential imputation models on three datasets: MNIST (28x28), SVHN (cropped, 32x32) [13], and TFD (48x48) [17]. We measured the imputation log-likelihood $\log q(x^u \mid c_T^u)$ using the true missing values $x^u$ and the models' guesses given by $\sigma(c_T^u)$. We report negative log-likelihoods, so lower scores are better in all of our tests. Fig. 2 and Tab. 1 present quantitative results from these tests. |
| Researcher Affiliation | Academia | Philip Bachman, McGill University, School of Computer Science, EMAIL; Doina Precup, McGill University, School of Computer Science, EMAIL |
| Pseudocode | Yes | The supplementary material provides pseudo-code and an illustration for this model. |
| Open Source Code | Yes | Model/test code is available at http://github.com/Philip-Bachman/Sequential-Generation. Full implementations and test code are available from http://github.com/Philip-Bachman/Sequential-Generation. |
| Open Datasets | Yes | We tested the performance of our sequential imputation models on three datasets: MNIST (28x28), SVHN (cropped, 32x32) [13], and TFD (48x48) [17]. |
| Dataset Splits | No | The paper describes data masking strategies and mentions using a 'held-out test set' but does not specify train/validation/test splits with percentages or sample counts for the datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions no specific software dependencies or versions (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | Except where noted, the GPSI models used 6 refinement steps and the LSTM models used 16. We tested imputation under two types of data masking: missing completely at random (MCAR) and missing at random (MAR). In MCAR, we masked pixels uniformly at random from the source images, and indicate removal of d% of the pixels by MCAR-d. In MAR, we masked square regions, with the occlusions located uniformly at random within the borders of the source image. |
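
The two masking schemes quoted in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code: the function names `mcar_mask` and `mar_mask`, the choice to drop an exact pixel count rather than sample each pixel independently, and the 14x14 occlusion size are all assumptions made for illustration. Only the Python standard library is used.

```python
import random


def mcar_mask(height, width, drop_percent, rng):
    """Return a 0/1 mask (1 = observed) with drop_percent% of pixels removed.

    Assumption: an exact number of pixels is dropped, chosen uniformly at random.
    """
    n_pixels = height * width
    n_drop = round(n_pixels * drop_percent / 100)
    dropped = set(rng.sample(range(n_pixels), n_drop))
    return [[0 if (r * width + c) in dropped else 1 for c in range(width)]
            for r in range(height)]


def mar_mask(height, width, side, rng):
    """Return a 0/1 mask (1 = observed) with one side x side square occluded.

    The square's top-left corner is drawn uniformly so that the whole
    occlusion stays within the image borders.
    """
    top = rng.randint(0, height - side)    # randint is inclusive on both ends
    left = rng.randint(0, width - side)
    return [[0 if (top <= r < top + side and left <= c < left + side) else 1
             for c in range(width)]
            for r in range(height)]


rng = random.Random(0)                 # fixed seed so the masks are repeatable
mcar = mcar_mask(28, 28, 80, rng)      # "MCAR-80" on an MNIST-sized image
mar = mar_mask(28, 28, 14, rng)        # 14x14 occlusion (hypothetical size)

masked_mcar = sum(row.count(0) for row in mcar)
masked_mar = sum(row.count(0) for row in mar)
```

Under these assumptions the MCAR mask hides a fixed fraction of scattered pixels, while the MAR mask hides one contiguous block, which is the distinction the row describes.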