Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Shaping Belief States with Generative Environment Models for RL

Authors: Karol Gregor, Danilo Jimenez Rezende, Frederic Besse, Yan Wu, Hamza Merzic, Aaron van den Oord

NeurIPS 2019 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that the model substantially improves data-efficiency on a number of reinforcement learning (RL) tasks compared with strong model-free baseline agents.
Researcher Affiliation | Industry | Google DeepMind London, UK EMAIL
Pseudocode | Yes | A concrete example of the computation of the model's loss is provided as pseudo-code in Appendix K.
Open Source Code | No | The paper mentions 'supplementary video' but does not provide a specific repository link or explicit statement about the release of source code for the described methodology.
Open Datasets | Yes | Our experiments are performed using four families of procedural environments: (a) DeepMind-Lab levels [64] and three new environments that we created using the Unity Engine: (b) Random City; (c) Block building environment; (d) Random Terrain.
Dataset Splits | No | The paper does not explicitly provide specific training/validation/test dataset splits (e.g., exact percentages, sample counts, or citations to predefined splits) needed for reproduction.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions software like 'IMPALA framework [19]' and 'Adam for optimization [63]' but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | The detailed choice of various hyperparameters is provided in Appendix F.