Neural Network Control Policy Verification With Persistent Adversarial Perturbation

Authors: Yuh-Shyang Wang, Lily Weng, Luca Daniel

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we demonstrate our algorithm on a cart-pole example and show the following results: Our proposed framework (Theorem 1 and Algorithm 1) outperforms methods based on traditional robust control theory (Lemma 1). Specifically, Algorithm 1 can certify the boundedness of the closed loop system with attack level that is 5 times larger than that of a robust control approach. See the result in Fig. 2a.
Researcher Affiliation | Collaboration | (1) Argo AI, Pittsburgh, Pennsylvania, USA; (2) Department of EECS, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.
Pseudocode | Yes | Algorithm 1: Certification of state and control constraints under persistent adversarial perturbation
Open Source Code | No | The paper mentions using 'stable baselines' and 'Open-AI gym' but provides no link to, or statement about open-sourcing, its own implementation of the described methodology.
Open Datasets | Yes | We use proximal policy optimization (Schulman et al., 2017) in stable baselines (Hill et al., 2018) to train a 3-layer neural network policy for the cartpole problem in Open-AI gym (Brockman et al., 2016).
Dataset Splits | No | The paper trains a policy in a reinforcement learning setup but does not specify explicit dataset splits (e.g., percentages or counts) for training, validation, or testing.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments (e.g., GPU/CPU models, memory, or cloud instance types).
Software Dependencies | No | The paper names proximal policy optimization (Schulman et al., 2017), stable baselines (Hill et al., 2018), and Open-AI gym (Brockman et al., 2016), but does not provide version numbers for these software components.
Experiment Setup | Yes | Experiment setup. We use proximal policy optimization (Schulman et al., 2017) in stable baselines (Hill et al., 2018) to train a 3-layer neural network policy for the cartpole problem in Open-AI gym (Brockman et al., 2016). Our neural network has 16 neurons per hidden layer, ReLU activations, and continuous control output. The policy is trained with 2M steps. We use the neural network certification framework developed in (Weng et al., 2018; Zhang et al., 2018) for Algorithm 1.
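The experiment setup quoted above pins down enough of the training configuration (PPO from stable baselines, a two-hidden-layer MLP with 16 ReLU units per layer, 2M training steps, a cart-pole environment from Open-AI gym) to sketch an approximate training script. The snippet below is a minimal sketch under those assumptions, not the authors' released code: the environment id CartPole-v1, the default PPO hyperparameters, and the save path are placeholders, and since the paper reports a continuous control output while the standard Gym CartPole-v1 uses discrete actions, the authors' exact environment likely differs.

```python
# Minimal sketch of the reported training setup using the cited
# stable-baselines (Hill et al., 2018) PPO implementation. Only the
# architecture (two hidden layers of 16 ReLU units) and the 2M training
# steps come from the excerpt; everything else is an assumption.
import tensorflow as tf
from stable_baselines import PPO2
from stable_baselines.common.policies import MlpPolicy

# Two shared hidden layers of 16 units with ReLU activations
# (a 3-layer network once the linear output layer is counted).
policy_kwargs = dict(net_arch=[16, 16], act_fun=tf.nn.relu)

# NOTE: CartPole-v1 is a placeholder; the paper's policy has a continuous
# control output, whereas Gym's CartPole-v1 action space is discrete.
model = PPO2(MlpPolicy, "CartPole-v1", policy_kwargs=policy_kwargs, verbose=1)
model.learn(total_timesteps=2_000_000)   # "The policy is trained with 2M steps."
model.save("cartpole_ppo_policy")        # hypothetical output path
```

The certification step cites the bound-propagation frameworks of Weng et al. (2018) and Zhang et al. (2018), which are not reproduced here. As a rough illustration of the kind of layer-wise bounding such frameworks perform over a perturbation set, the following interval-arithmetic sketch bounds a ReLU MLP's output; it is a much coarser bound than those frameworks compute and is not the authors' method.

```python
# Illustration only (not the cited certification framework): propagate
# elementwise input bounds through a ReLU MLP with interval arithmetic.
import numpy as np

def interval_bounds(weights, biases, x_lo, x_hi):
    """Return lower/upper output bounds for inputs in [x_lo, x_hi]."""
    lo, hi = np.asarray(x_lo, dtype=float), np.asarray(x_hi, dtype=float)
    for i, (W, b) in enumerate(zip(weights, biases)):
        W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
        new_lo = W_pos @ lo + W_neg @ hi + b   # worst-case low per neuron
        new_hi = W_pos @ hi + W_neg @ lo + b   # worst-case high per neuron
        lo, hi = new_lo, new_hi
        if i < len(weights) - 1:               # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi
```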