Replay-Guided Adversarial Environment Design
Authors: Minqi Jiang, Michael Dennis, Jack Parker-Holder, Jakob Foerster, Edward Grefenstette, Tim Rocktäschel
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments in Section 6 investigate the learning dynamics of PLR⊥, REPAIRED, and their replay-free counterparts on a challenging maze domain and a novel continuous-control UED setting based on the popular Car Racing environment [5]. In both of these highly distinct settings, our methods provide significant improvements over PLR and PAIRED, producing agents that can perform out-of-distribution (OOD) generalization to a variety of human-designed mazes and Formula 1 tracks. |
| Researcher Affiliation | Collaboration | Minqi Jiang (UCL, FAIR); Michael Dennis (UC Berkeley); Jack Parker-Holder (University of Oxford); Jakob Foerster (FAIR); Edward Grefenstette (UCL, FAIR); Tim Rocktäschel (UCL, FAIR) |
| Pseudocode | Yes | Algorithm 1: Robust PLR (PLR⊥) (see the sketch after the table) |
| Open Source Code | Yes | We open source our methods at https://github.com/facebookresearch/dcd. |
| Open Datasets | No | The paper refers to environments such as the 'maze domain' and the 'Car Racing environment' and states that these are 'based on' or 'extended versions' of existing environments such as OpenAI Gym, but it does not provide concrete access information (link, DOI, or formal citation with authors/year) for specific datasets used for training. |
| Dataset Splits | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Appendix D.1 and D.2. |
| Hardware Specification | Yes | All experiments were run on a single NVIDIA GeForce RTX 2080 Ti GPU. |
| Software Dependencies | Yes | All experiments were implemented in Python 3.7.6. |
| Experiment Setup | Yes | We provide environment descriptions alongside model and hyperparameter choices in Appendix D. |
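
For context on the pseudocode row above, the following is a minimal Python sketch of the Robust PLR (PLR⊥) outer loop described by Algorithm 1. It is an illustrative simplification, not code from the facebookresearch/dcd repository: the `LevelReplayBuffer`, `robust_plr_step`, `generate_level`, `score_fn`, and the `agent.collect` / `agent.update` interface are hypothetical placeholders, and the real implementation additionally uses rank-based prioritization and staleness-aware sampling.

```python
import random


class LevelReplayBuffer:
    """Minimal prioritized level buffer: stores [level, score] entries and
    samples entries with probability proportional to their score."""

    def __init__(self, capacity=128):
        self.capacity = capacity
        self.entries = []  # list of [level, score]

    def add(self, level, score):
        self.entries.append([level, score])
        if len(self.entries) > self.capacity:
            # Simplified eviction: drop the lowest-scoring level.
            self.entries.sort(key=lambda e: e[1])
            self.entries.pop(0)

    def sample(self):
        total = sum(score for _, score in self.entries)
        if total <= 0:
            return random.choice(self.entries)
        threshold = random.uniform(0, total)
        acc = 0.0
        for entry in self.entries:
            acc += entry[1]
            if acc >= threshold:
                return entry
        return self.entries[-1]


def robust_plr_step(agent, buffer, generate_level, score_fn, p_replay=0.5):
    """One outer-loop iteration of a Robust-PLR-style update (sketch).

    With probability 1 - p_replay (or when the buffer is empty), a new level
    is generated and only evaluated: its regret score is estimated and the
    level is stored, but no gradient update is performed. Otherwise a
    high-scoring level is replayed and the agent is trained on it.
    """
    if not buffer.entries or random.random() > p_replay:
        level = generate_level()
        trajectory = agent.collect(level)   # rollout only, no parameter update
        buffer.add(level, score_fn(trajectory))
    else:
        entry = buffer.sample()
        trajectory = agent.collect(entry[0])
        agent.update(trajectory)            # gradient update on the replayed level
        entry[1] = score_fn(trajectory)     # refresh the level's replay score
```

The property this sketch preserves is that gradient updates happen only on replayed levels; newly generated levels are merely scored and stored, which is what distinguishes PLR⊥ from vanilla PLR.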