Decoupling Value and Policy for Generalization in Reinforcement Learning

Authors: Roberta Raileanu, Rob Fergus

ICML 2021

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | IDAAC shows good generalization to unseen environments, achieving a new state-of-the-art on the Procgen benchmark and outperforming popular methods on DeepMind Control tasks with distractors.
Researcher Affiliation | Academia | Roberta Raileanu, Rob Fergus. Department of Computer Science, New York University, New York, USA. Correspondence to: Roberta Raileanu <raileanu@cs.nyu.edu>.
Pseudocode | Yes | See Algorithm 1 from Appendix B for a more detailed description of DAAC. See Algorithm 2 from Appendix B for a more detailed description of IDAAC.
Open Source Code | Yes | Our implementation is available at https://github.com/rraileanu/idaac.
Open Datasets | Yes | In practice, we use the Procgen benchmark which contains 16 procedurally generated games. ... We use three tasks, namely Cartpole Balance, Cartpole Swingup, and Ball In Cup.
Dataset Splits | No | Following the setup from Cobbe et al. (2019), agents are trained on a fixed set of n = 200 levels (generated using seeds from 1 to 200) and tested on the full distribution of levels (generated using any integer seed). No explicit validation split is mentioned.
Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments.
Software Dependencies | No | The paper mentions software such as Adam and refers to "PyTorch implementations of reinforcement learning algorithms" but does not specify version numbers for any software dependencies.
Experiment Setup | Yes | More details about our experimental setup and hyperparameters can be found in Appendix C.
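As context for the pseudocode row above: DAAC builds on PPO and trains its policy network to predict advantage estimates, which in the standard PPO setup are computed with Generalized Advantage Estimation (GAE). The sketch below is a minimal NumPy illustration of GAE only, not the authors' implementation; the function name and the `gamma`/`lam` defaults are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def gae_advantages(rewards, values, gamma=0.999, lam=0.95):
    """Generalized Advantage Estimation (Schulman et al., 2016).

    rewards: length-T sequence of rewards for one trajectory segment.
    values:  length-(T+1) sequence of value estimates, with the
             bootstrap value for the final state appended.
    Note: gamma/lam defaults here are hypothetical, not from the paper.
    """
    T = len(rewards)
    adv = np.zeros(T)
    running = 0.0
    for t in reversed(range(T)):
        # One-step TD error at time t.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        adv[t] = running
    return adv
```

In a DAAC-style setup, targets like these would supervise the policy network's advantage head, while a separate value network is trained on returns; the decoupling is the paper's contribution, and this snippet only shows the advantage target itself.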