Inverse Abstraction of Neural Networks Using Symbolic Interpolation
Authors: Sumanth Dathathri, Sicun Gao, Richard M. Murray
AAAI 2019, pp. 3437–3444 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, we focus on knowledge/policy analysis and extraction for two control environments: cart-pole and swimmer. We show that for the multilayer perceptron (MLP) network policies trained through standard reinforcement learning algorithms, we can extract knowledge in the form of compact abstractions. For cart-pole, the extracted policy achieves a perfect score. Using the extracted policy we are able to formally verify/falsify certain complex safety properties. For swimmer, we show how high torque outputs are mapped to a compact representation in the input space. We believe these techniques will be important for analyzing learning-enabled components in control applications. |
| Researcher Affiliation | Academia | (1) Computing and Mathematical Sciences, California Institute of Technology; (2) Computer Science and Engineering, University of California, San Diego |
| Pseudocode | Yes | Algorithm 1 (Computing compact abstractions): 1: procedure ABSTRACTION ROUTINE 2: ⊳ Returns a simple overapproximator of Pre_f(S) 3: Compute, ∀x ∈ X_S, y_{f_1}(x) = g_1(x) 4: Compute l_1, u_1 and B_1 5: for r = 2 … n do 6: Compute, ∀x ∈ X_S, y_{f_r}(x) = g_r(x) 7: Compute l_r, u_r and construct B_r 8: for r = n … 1 do 9: Construct φ_{r-1} (see equation (2)) 10: Construct ξ_r ∧ B_r (see equation (3)) 11: Compute I_r satisfying equations (4) and (5) 12: Set O^{(f,S)}_r = I_r; return O^{(f,S)}_1. (A minimal illustration of the interpolation primitive this algorithm builds on appears after the table.) |
| Open Source Code | No | The paper does not provide any links or explicit statements indicating that the source code for their proposed methods is publicly available. |
| Open Datasets | Yes | We train a neural network with 2 hidden layers for the problem with Deep-Q learning using the environment in (OpenAI CartPole-v0 2018). The controller we consider is a neural network with 2 hidden layers trained with proximal policy optimization (Schulman et al. 2017) on the Swimmer environment (OpenAI Swimmer-v2 2018). |
| Dataset Splits | No | The paper mentions training on “holdout data” and reports that the cart-pole network achieves a “perfect score of 200.0, averaged over 100 episodes”. However, it does not give percentages or counts for training/validation/test splits, nor does it explicitly describe a validation set. |
| Hardware Specification | Yes | The computations were performed on a 2.40GHz Quadcore machine with 16 GB of RAM. |
| Software Dependencies | No | The paper mentions using Z3 (De Moura and Bjørner 2008), the PLNN-v framework (Bunel et al. 2018), and dReal (Gao, Kong, and Clarke 2013). However, it does not give version numbers for these dependencies, which are necessary for a reproducible description. |
| Experiment Setup | No | The paper states that for the cart-pole problem a “neural network with 2-hidden layers for the problem with Deep-Q learning” was trained, and that for the swimmer task a “neural network with 2 hidden layers trained with proximal policy optimization (Schulman et al. 2017)” was used. It describes the general architectures and learning algorithms but omits hyperparameters such as learning rate, batch size, and number of epochs, which are crucial for reproducing the experimental setup (a hedged sketch of such a setup, with placeholder hyperparameters, follows the table). |
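
The core primitive behind Algorithm 1 is Craig interpolation: given formulas A and B with A ∧ B unsatisfiable, an interpolant I uses only their shared symbols, is implied by A, and is inconsistent with B, which is what makes it usable as a compact overapproximation. Since the paper releases no code, the snippet below is only a minimal illustration of that primitive using pysmt with an interpolating backend such as MathSAT (installed via `pysmt-install --msat`); the formulas and variable names are ours, not the paper's.

```python
# Minimal illustration of Craig interpolation, the primitive Algorithm 1
# builds on. Not the authors' code: formulas and names are illustrative only.
from pysmt.shortcuts import Symbol, And, GE, LE, Real, Equals, binary_interpolant
from pysmt.typing import REAL

x = Symbol("x", REAL)  # stand-in for a layer input
y = Symbol("y", REAL)  # stand-in for a layer output (the shared symbol)

# A: a toy "layer encoding" restricted to a region: y = x with x in [0, 1].
a = And(Equals(y, x), GE(x, Real(0)), LE(x, Real(1)))
# B: a toy "bad region": y >= 2. Note that A /\ B is unsatisfiable.
b = GE(y, Real(2))

# I mentions only the shared symbol y, is implied by A, and contradicts B,
# so it overapproximates A's outputs while still excluding the bad region.
i = binary_interpolant(a, b)
print(i)  # e.g. (y <= 1.0); the exact form depends on the backend
```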
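
For the experiment-setup gap flagged above, the sketch below shows the shape of a reproduction attempt for the cart-pole controller: a 2-hidden-layer MLP policy trained with Deep-Q learning, which is all the paper specifies. It assumes stable-baselines3 and gymnasium (tools the paper does not mention), and every hyperparameter is a placeholder rather than a reported value; CartPole-v1 stands in for the now-deprecated CartPole-v0 the authors used (v0 caps episodes at 200, matching their reported perfect score of 200.0).

```python
# Hedged reconstruction of the cart-pole training setup. The paper states only
# "a neural network with 2 hidden layers ... with Deep-Q learning"; all
# hyperparameters below are assumptions, not values from the paper.
import gymnasium as gym
from stable_baselines3 import DQN
from stable_baselines3.common.evaluation import evaluate_policy

env = gym.make("CartPole-v1")  # paper used CartPole-v0 (now deprecated)

model = DQN(
    "MlpPolicy",
    env,
    policy_kwargs={"net_arch": [64, 64]},  # two hidden layers; widths assumed
    learning_rate=1e-3,                    # assumed
    buffer_size=50_000,                    # assumed
    verbose=0,
)
model.learn(total_timesteps=100_000)       # training budget assumed

# The paper's evaluation protocol: score averaged over 100 episodes.
mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=100)
print(f"mean reward over 100 episodes: {mean_reward:.1f} +/- {std_reward:.1f}")
```

An analogous sketch for the swimmer controller would swap in PPO and the MuJoCo Swimmer environment, again with all hyperparameters assumed rather than taken from the paper.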