Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Stability Verification in Stochastic Control Systems via Neural Network Supermartingales
Authors: Mathias Lechner, Đorđe Žikelić, Krishnendu Chatterjee, Thomas A. Henzinger
AAAI 2022, pp. 7326–7336 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we validate our approach experimentally on a set of nonlinear stochastic reinforcement learning environments with neural network policies. |
| Researcher Affiliation | Academia | IST Austria, Klosterneuburg, Austria EMAIL |
| Pseudocode | Yes | Algorithm 1: Verification of a.s. asymptotic stability |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for their methodology is publicly available. |
| Open Datasets | No | The paper uses two benchmark environments: a two-dimensional dynamical system and a stochastic variant of the inverted pendulum problem. However, it does not explicitly provide access information (link, DOI, formal citation) to a pre-collected, publicly available dataset used for training. |
| Dataset Splits | No | The paper describes training policies within RL environments and then verifying them. It does not explicitly provide specific training/validation/test dataset splits (e.g., percentages or sample counts) for a pre-collected dataset, as data is generated through interaction with the environment during policy training. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'proximal policy optimization' and 'OpenAI Gym' but does not provide specific version numbers for any software dependencies, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes | Our RSM neural networks consist of one hidden layer with 128 ReLU units. For each RL task, we consider the state space X = {x \| \|\|x\|\|₁ ≤ 0.5} and train a control policy comprised of two hidden layers with 128 ReLU units each by using proximal policy optimization (Schulman et al. 2017), while applying our Lipschitz regularization to keep the Lipschitz constant of the policy within a reasonable bound. We then run our algorithm to verify that the region Xs = {x \| \|\|x\|\|₁ ≤ 0.2} is a.s. asymptotically stable. Input: dynamics function f, policy π, disturbance distribution d, region Xs ⊆ X, Lipschitz constants Lf, Lπ; parameters τ > 0, N ∈ ℕ, λ > 0 |
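The Experiment Setup row above fully specifies the network shapes: an RSM (ranking supermartingale) certificate network with one hidden layer of 128 ReLU units, a policy with two hidden layers of 128 ReLU units each, and L1-ball state regions. A minimal NumPy sketch of those shapes is below; all function and variable names (`rsm_forward`, `policy_forward`, `in_l1_ball`) and the 2-D state dimension are illustrative assumptions, not the authors' released code.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM = 2  # e.g. the paper's two-dimensional benchmark system

def relu(z):
    return np.maximum(z, 0.0)

# RSM candidate V: state -> scalar, one hidden layer of 128 ReLU units
W1 = rng.normal(size=(128, STATE_DIM)); b1 = np.zeros(128)
w2 = rng.normal(size=128); b2 = 0.0

def rsm_forward(x):
    return float(w2 @ relu(W1 @ x + b1) + b2)

# Control policy pi: state -> action, two hidden layers of 128 ReLU units each
P1 = rng.normal(size=(128, STATE_DIM)); c1 = np.zeros(128)
P2 = rng.normal(size=(128, 128));       c2 = np.zeros(128)
p3 = rng.normal(size=128);              c3 = 0.0

def policy_forward(x):
    h = relu(P2 @ relu(P1 @ x + c1) + c2)
    return float(p3 @ h + c3)

def in_l1_ball(x, bound):
    # X  = {x : ||x||_1 <= 0.5} is the state space;
    # Xs = {x : ||x||_1 <= 0.2} is the region verified a.s. asymptotically stable
    return float(np.abs(x).sum()) <= bound

x = np.array([0.1, -0.05])          # ||x||_1 = 0.15
print(in_l1_ball(x, 0.5), in_l1_ball(x, 0.2))
```

The verification algorithm itself (Algorithm 1 in the paper) additionally consumes the dynamics f, disturbance distribution d, and the Lipschitz constants Lf, Lπ; only the architecture and the regions are reproduced here.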