Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design
Authors: Michael Dennis, Natasha Jaques, Eugene Vinitsky, Alexandre Bayen, Stuart Russell, Andrew Critch, Sergey Levine
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that PAIRED produces a natural curriculum of increasingly complex environments, and PAIRED agents achieve higher zero-shot transfer performance when tested in highly novel environments. (Section 5: Experiments) |
| Researcher Affiliation | Collaboration | University of California Berkeley AI Research (BAIR), Berkeley, CA, 94704; Google Research, Brain team, Mountain View, CA, 94043 |
| Pseudocode | Yes | Algorithm 1: PAIRED. |
| Open Source Code | Yes | The code for PAIRED and our experiments is available in open source at https://github.com/google-research/google-research/tree/master/social_rl/. |
| Open Datasets | Yes | Here we investigate navigation tasks (based on [9]), in which an agent must explore to find a goal (green square in Figure 1) while navigating around obstacles. [9] Maxime Chevalier-Boisvert, Lucas Willems, and Suman Pal. Minimalistic gridworld environment for openai gym. https://github.com/maximecb/gym-minigrid, 2018. To compare more closely with prior work on minimax adversarial RL [28, 43], we construct an additional experiment in a modified version of the MuJoCo hopper domain [42]. [42] Emanuel Todorov, Tom Erez, and Yuval Tassa. Mujoco: A physics engine for model-based control. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 5026-5033. IEEE, 2012. |
| Dataset Splits | No | Parameters for the emergent complexity task are selected to maximize the solved path length, and parameters for the transfer task are selected using a set of validation environments. While validation environments are mentioned, no specific dataset split information (percentages, counts, or explicit methodology for fixed datasets) is provided. The environments are *generated* rather than split from a pre-existing dataset. |
| Hardware Specification | No | The paper mentions 'funding computation expenses associated with this work' but does not specify any hardware details such as GPU/CPU models or specific computing resources used for experiments. |
| Software Dependencies | No | All agents are trained with PPO [35]. The paper refers to algorithms and environments by name (PPO, OpenAI Gym, MuJoCo) and cites papers for them, but does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | Further details about network architecture and hyperparameters are given in Appendix F. All agents are trained with PPO [35]. |