Robust Deep Reinforcement Learning through Adversarial Loss
Authors: Tuomas Oikarinen, Wang Zhang, Alexandre Megretski, Luca Daniel, Tsui-Wei Weng
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experiment on three deep RL benchmarks (Atari, MuJoCo and ProcGen) to show the effectiveness of our robust training algorithm. Our RADIAL-RL agents consistently outperform prior methods when tested against attacks of varying strength and are more computationally efficient to train. |
| Researcher Affiliation | Academia | Tuomas Oikarinen, UC San Diego CSE; Wang Zhang, MIT MechE; Alexandre Megretski, MIT EECS; Luca Daniel, MIT EECS; Tsui-Wei Weng, UC San Diego HDSI |
| Pseudocode | Yes | Algorithm 1: Greedy Worst-Case Reward (a hedged sketch follows the table) |
| Open Source Code | Yes | All code used for our experiments is available at https://github.com/tuomaso/radial_rl_v2. |
| Open Datasets | Yes | We experiment on the same Atari-2600 environment [35] and same 4 games used in [17, 18]. Different from [17, 18], we further evaluate our algorithm on a more challenging ProcGen [36] benchmark... we use the MuJoCo environment [37]. |
| Dataset Splits | No | The paper mentions 'train/evaluation split in levels' and uses 'evaluation set' for reporting results, but does not explicitly specify a distinct 'validation' dataset split with percentages or counts. |
| Hardware Specification | No | The paper mentions 'on our hardware' but does not specify any particular GPU, CPU, or other hardware component models or detailed specifications used for the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the implementation, such as Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | Full training details and hyperparameter settings are available in Appendix H. |
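
The pseudocode row above refers to the paper's Algorithm 1, Greedy Worst-Case Reward (GWC). The snippet below is a minimal sketch of one plausible reading of that evaluation loop, not the authors' implementation: it assumes a gym-style environment and hypothetical helpers `q_nominal`, `q_lower`, and `q_upper` that return per-action Q-value estimates and their lower/upper bounds under an ε-bounded observation perturbation (e.g. from a bound-propagation method). The authoritative definition is Algorithm 1 in the paper and the released code at https://github.com/tuomaso/radial_rl_v2.

```python
import numpy as np

def greedy_worst_case_reward(env, q_nominal, q_lower, q_upper, max_steps=10_000):
    """Hedged sketch of a Greedy Worst-Case Reward style rollout.

    q_nominal, q_lower, q_upper are hypothetical callables mapping an
    observation to per-action Q-values and their lower/upper bounds under a
    bounded observation perturbation. `env` is assumed to follow the classic
    gym API (reset() -> obs, step(a) -> (obs, reward, done, info)).
    """
    obs = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        q = q_nominal(obs)
        lb, ub = q_lower(obs), q_upper(obs)
        # Actions the adversary could plausibly force the agent into:
        # those whose upper bound reaches the best action's lower bound.
        possible = np.flatnonzero(ub >= lb.max())
        # Greedily assume the worst of those actions is taken at this step.
        action = possible[np.argmin(q[possible])]
        obs, reward, done, _ = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward
```

Because it commits to the locally worst admissible action at every step instead of searching over all admissible action sequences, a rollout like this gives an efficient approximation of worst-case performance rather than an exact bound.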