Shaping Belief States with Generative Environment Models for RL
Authors: Karol Gregor, Danilo Jimenez Rezende, Frederic Besse, Yan Wu, Hamza Merzic, Aaron van den Oord
NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that the model substantially improves data-efficiency on a number of reinforcement learning (RL) tasks compared with strong model-free baseline agents. |
| Researcher Affiliation | Industry | Google DeepMind London, UK {karolg, danilor, fbesse, yanwu, hamzamerzic, avdnoord}@google.com |
| Pseudocode | Yes | A concrete example of the computation of the model's loss is provided as pseudo-code in Appendix K. |
| Open Source Code | No | The paper mentions a 'supplementary video' but provides no repository link or explicit statement about releasing source code for the described methodology. |
| Open Datasets | Yes | Our experiments are performed using four families of procedural environments: (a) DeepMind Lab levels [64] and three new environments that we created using the Unity Engine: (b) Random City; (c) Block building environment; (d) Random Terrain. |
| Dataset Splits | No | The paper does not explicitly provide specific training/validation/test dataset splits (e.g., exact percentages, sample counts, or citations to predefined splits) needed for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper names software such as the 'IMPALA framework [19]' and 'Adam for optimization [63]' but does not provide version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The detailed choice of various hyperparameters is provided in Appendix F. |