Robust Deep Reinforcement Learning through Adversarial Loss

Authors: Tuomas Oikarinen, Wang Zhang, Alexandre Megretski, Luca Daniel, Tsui-Wei Weng

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experiment on three deep RL benchmarks (Atari, MuJoCo and ProcGen) to show the effectiveness of our robust training algorithm. Our RADIAL-RL agents consistently outperform prior methods when tested against attacks of varying strength and are more computationally efficient to train.
Researcher Affiliation | Academia | Tuomas Oikarinen (UC San Diego CSE), Wang Zhang (MIT MechE), Alexandre Megretski (MIT EECS), Luca Daniel (MIT EECS), Tsui-Wei Weng (UC San Diego HDSI)
Pseudocode | Yes | Algorithm 1: Greedy Worst-Case Reward (an illustrative sketch follows this table)
Open Source Code | Yes | All code used for our experiments is available at https://github.com/tuomaso/radial_rl_v2.
Open Datasets | Yes | We experiment on the same Atari-2600 environment [35] and same 4 games used in [17, 18]. Different from [17, 18], we further evaluate our algorithm on a more challenging ProcGen [36] benchmark... we use the MuJoCo environment [37].
Dataset Splits | No | The paper mentions a 'train/evaluation split in levels' and reports results on an 'evaluation set', but it does not explicitly specify a distinct validation split with percentages or counts.
Hardware Specification | No | The paper mentions 'on our hardware' but does not specify the GPU, CPU, or other hardware models used for the experiments.
Software Dependencies | No | The paper does not provide version numbers for its software dependencies, such as Python, PyTorch, or TensorFlow.
Experiment Setup | Yes | Full training details and hyperparameter settings are available in Appendix H.
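For readers skimming this summary, here is a minimal sketch of what a greedy worst-case evaluation rollout of this kind might look like. It is not the paper's exact Algorithm 1: the Gym-style env, the q_net callable, and the bound_fn helper (e.g., certified output bounds under an l_inf perturbation of size eps) are assumptions made purely for illustration.

```python
import numpy as np

def greedy_worst_case_reward(env, q_net, bound_fn, eps, max_steps=10_000):
    """Sketch: approximate the worst-case episode reward under bounded perturbations.

    Assumptions (not taken from the paper's code):
      - q_net(state) returns a NumPy array of per-action Q-values.
      - bound_fn(q_net, state, eps) returns (lower, upper) arrays of certified
        per-action Q-value bounds for l_inf perturbations of size eps.
      - env follows the classic Gym reset/step API.
    """
    state = env.reset()
    total_reward, done, steps = 0.0, False, 0
    while not done and steps < max_steps:
        lower, upper = bound_fn(q_net, state, eps)
        # Actions a perturbed input could plausibly make the agent pick:
        # those whose upper bound reaches the best guaranteed lower bound.
        plausible = np.flatnonzero(upper >= lower.max())
        # Greedy worst case: among the plausible actions, take the one with
        # the lowest nominal Q-value and roll the environment forward.
        q_nominal = q_net(state)
        action = int(plausible[np.argmin(q_nominal[plausible])])
        state, reward, done, _ = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward
```

A loop of this shape needs only a single rollout per episode, which is consistent with the paper's point that its evaluation metric is far cheaper than mounting a full adversarial attack at every step.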