Policy Smoothing for Provably Robust Reinforcement Learning
Authors: Aounon Kumar, Alexander Levine, Soheil Feizi
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on various environments like Cartpole, Pong, Freeway and Mountain Car show that our method can yield meaningful robustness guarantees in practice. |
| Researcher Affiliation | Academia | Aounon Kumar, Alexander Levine, Soheil Feizi Department of Computer Science University of Maryland College Park, USA aounon@umd.edu,{alevine0,sfeizi}@cs.umd.edu |
| Pseudocode | Yes | Algorithm 1: Empirical Attack on DQN Agents |
| Open Source Code | Yes | We supplement our work with accompanying code for reproducing the experimental results, as well as pre-trained models for a selection of the experiments. Details about setting hyper-parameters and the environments we test are included in the appendix. |
| Open Datasets | Yes | We tested on four standard environments: the classical control problems Cartpole and Mountain Car and the Atari games Pong and Freeway. |
| Dataset Splits | Yes | Table 1: Training Hyperparameters for DQN models. Validation interval (steps) 100000 Validation episodes 100 |
| Hardware Specification | Yes | Each experiment is run on an NVIDIA 2080 Ti GPU. |
| Software Dependencies | No | We use DQN and DDPG implementations from the popular stable-baselines3 package (Raffin et al., 2019) |
| Experiment Setup | Yes | Table 1: Training Hyperparameters for DQN models. |
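The paper's core idea, policy smoothing, evaluates the agent's policy on Gaussian-perturbed observations so that the resulting smoothed policy admits provable robustness guarantees. A minimal Monte Carlo sketch of that evaluation step is below; the function names and the toy threshold policy are our own illustration, not code from the paper's repository:

```python
import numpy as np

def smoothed_action(policy, obs, sigma, n_samples=100, n_actions=2, rng=None):
    """Return the action the base policy selects most often under Gaussian
    observation noise: a Monte Carlo estimate of the smoothed policy."""
    rng = np.random.default_rng(0) if rng is None else rng
    votes = np.zeros(n_actions, dtype=int)
    for _ in range(n_samples):
        # Perturb the observation with i.i.d. Gaussian noise of scale sigma.
        noisy_obs = obs + rng.normal(0.0, sigma, size=np.shape(obs))
        votes[policy(noisy_obs)] += 1
    return int(np.argmax(votes))

# Toy deterministic base policy: threshold on the first observation dimension.
policy = lambda o: int(o[0] > 0)
action = smoothed_action(policy, np.array([1.0, 0.0]), sigma=0.1)
```

In the paper the noise is injected during both training and evaluation, and the guarantee bounds how much an adversarial perturbation of bounded norm can change the smoothed agent's total reward; this sketch only shows the per-step smoothed action selection.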