Reinforcement Learning with Augmented Data
Authors: Misha Laskin, Kimin Lee, Adam Stooke, Lerrel Pinto, Pieter Abbeel, Aravind Srinivas
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform the first extensive study of general data augmentations for RL on both pixel-based and state-based inputs, and introduce two new data augmentations: random translate and random amplitude scale. We show that augmentations such as random translate, crop, color jitter, patch cutout, random convolutions, and amplitude scale can enable simple RL algorithms to outperform complex state-of-the-art methods across common benchmarks. RAD sets a new state-of-the-art in terms of data-efficiency and final performance on the DeepMind Control Suite benchmark for pixel-based control as well as the OpenAI Gym benchmark for state-based control. (See the augmentation sketch after this table.) |
| Researcher Affiliation | Academia | Michael Laskin (UC Berkeley), Kimin Lee (UC Berkeley), Adam Stooke (UC Berkeley), Lerrel Pinto (New York University), Pieter Abbeel (UC Berkeley), Aravind Srinivas (UC Berkeley) |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Our RAD module and training code are available at https://www.github.com/MishaLaskin/rad. |
| Open Datasets | Yes | To this end, we utilize the DeepMind Control Suite (DMControl) [22]... For DMControl experiments, we evaluate the data-efficiency by measuring the performance of our method at 100k ... and 500k ... simulator or environment steps. ... For this reason, we focus on the OpenAI ProcGen benchmarks [24] to investigate the generalization capabilities of RAD. ... For OpenAI Gym experiments with proprioceptive inputs..., we compare to PETS [41]... |
| Dataset Splits | No | The paper specifies training budgets (e.g., 100k and 500k environment steps) and test environments, but does not describe distinct training/validation/test dataset splits in the conventional supervised-learning sense; this is typical of reinforcement learning, where agents interact with an environment rather than with a fixed held-out split. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models or CPU specifications. |
| Software Dependencies | No | The paper mentions algorithms and frameworks like SAC and PPO, but does not provide specific version numbers for software dependencies or libraries. |
| Experiment Setup | Yes | A full list of hyperparameters is provided in Table 4 of Appendix E. |
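
For context on the two augmentations the paper introduces, below is a minimal NumPy sketch of random translate (for pixel observations) and random amplitude scale (for state vectors). It is illustrative only: the canvas size (108) and the scaling range ([0.6, 1.2]) are assumptions for the example, not necessarily the paper's exact settings, and the released RAD code implements these operations differently (batched, on GPU).

```python
# Illustrative sketch of two RAD-style augmentations (not the authors' implementation).
import numpy as np


def random_translate(imgs: np.ndarray, out_size: int = 108) -> np.ndarray:
    """Place each image at a random offset inside a larger zero-padded canvas.

    imgs: (B, C, H, W) array with H <= out_size and W <= out_size.
    Returns a (B, C, out_size, out_size) array.
    """
    b, c, h, w = imgs.shape
    assert h <= out_size and w <= out_size
    out = np.zeros((b, c, out_size, out_size), dtype=imgs.dtype)
    for i in range(b):
        top = np.random.randint(0, out_size - h + 1)
        left = np.random.randint(0, out_size - w + 1)
        out[i, :, top:top + h, left:left + w] = imgs[i]
    return out


def random_amplitude_scale(states: np.ndarray, low: float = 0.6, high: float = 1.2) -> np.ndarray:
    """Multiply each state vector by a scalar drawn uniformly from [low, high].

    The [0.6, 1.2] range is an assumed example, not the paper's reported setting.
    """
    b = states.shape[0]
    scales = np.random.uniform(low, high, size=(b, 1)).astype(states.dtype)
    return states * scales


if __name__ == "__main__":
    # 84x84 pixel observations translated within a 108x108 canvas.
    pixel_batch = np.random.rand(4, 3, 84, 84).astype(np.float32)
    print(random_translate(pixel_batch).shape)  # (4, 3, 108, 108)

    # Proprioceptive state vectors rescaled in amplitude.
    state_batch = np.random.rand(4, 17).astype(np.float32)
    print(random_amplitude_scale(state_batch).shape)  # (4, 17)
```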