Robust Deep Reinforcement Learning through Adversarial Loss

Authors: Tuomas Oikarinen, Wang Zhang, Alexandre Megretski, Luca Daniel, Tsui-Wei Weng

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experiment on three deep RL benchmarks (Atari, MuJoCo and ProcGen) to show the effectiveness of our robust training algorithm. Our RADIAL-RL agents consistently outperform prior methods when tested against attacks of varying strength and are more computationally efficient to train.
Researcher Affiliation | Academia | Tuomas Oikarinen (UC San Diego CSE), Wang Zhang (MIT MechE), Alexandre Megretski (MIT EECS), Luca Daniel (MIT EECS), Tsui-Wei Weng (UC San Diego HDSI)
Pseudocode | Yes | Algorithm 1: Greedy Worst-Case Reward (an illustrative sketch follows this table)
Open Source Code | Yes | All code used for our experiments is available at https://github.com/tuomaso/radial_rl_v2.
Open Datasets | Yes | We experiment on the same Atari-2600 environment [35] and same 4 games used in [17, 18]. Different from [17, 18], we further evaluate our algorithm on a more challenging ProcGen [36] benchmark... we use the MuJoCo environment [37].
Dataset Splits | No | The paper mentions a 'train/evaluation split in levels' and reports results on an 'evaluation set', but it does not explicitly specify a distinct validation split with percentages or counts.
Hardware Specification | No | The paper mentions 'on our hardware' but does not specify the GPU, CPU, or other hardware models used for the experiments.
Software Dependencies | No | The paper does not provide version numbers for its software dependencies, such as Python, PyTorch, or TensorFlow.
Experiment Setup | Yes | Full training details and hyperparameter settings are available in Appendix H.
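For readers skimming this summary, here is a minimal sketch of what a greedy worst-case evaluation rollout of this kind might look like. It is not the paper's exact Algorithm 1: the Gym-style env, the q_net callable, and the bound_fn helper (e.g., certified output bounds under an l_inf perturbation of size eps) are assumptions made purely for illustration.

```python
import numpy as np

def greedy_worst_case_reward(env, q_net, bound_fn, eps, max_steps=10_000):
    """Sketch: approximate the worst-case episode reward under bounded perturbations.

    Assumptions (not taken from the paper's code):
      - q_net(state) returns a NumPy array of per-action Q-values.
      - bound_fn(q_net, state, eps) returns (lower, upper) arrays of certified
        per-action Q-value bounds for l_inf perturbations of size eps.
      - env follows the classic Gym reset/step API.
    """
    state = env.reset()
    total_reward, done, steps = 0.0, False, 0
    while not done and steps < max_steps:
        lower, upper = bound_fn(q_net, state, eps)
        # Actions a perturbed input could plausibly make the agent pick:
        # those whose upper bound reaches the best guaranteed lower bound.
        plausible = np.flatnonzero(upper >= lower.max())
        # Greedy worst case: among the plausible actions, take the one with
        # the lowest nominal Q-value and roll the environment forward.
        q_nominal = q_net(state)
        action = int(plausible[np.argmin(q_nominal[plausible])])
        state, reward, done, _ = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward
```

A loop of this shape needs only a single rollout per episode, which is consistent with the paper's point that its evaluation metric is far cheaper than mounting a full adversarial attack at every step.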