Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Stability Verification in Stochastic Control Systems via Neural Network Supermartingales

Authors: Mathias Lechner, Đorđe Žikelić, Krishnendu Chatterjee, Thomas A. Henzinger (pp. 7326-7336)

AAAI 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we validate our approach experimentally on a set of nonlinear stochastic reinforcement learning environments with neural network policies."
Researcher Affiliation | Academia | IST Austria, Klosterneuburg, Austria
Pseudocode | Yes | Algorithm 1: Verification of a.s. asymptotic stability
Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for their methodology is publicly available.
Open Datasets | No | The paper uses two benchmark environments: a two-dimensional dynamical system and a stochastic variant of the inverted pendulum problem. However, it does not explicitly provide access information (link, DOI, or formal citation) for a pre-collected, publicly available dataset used for training.
Dataset Splits | No | The paper describes training policies within RL environments and then verifying them. It does not explicitly provide training/validation/test dataset splits (e.g., percentages or sample counts) for a pre-collected dataset, as data is generated through interaction with the environment during policy training.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or cloud instance types) used for running the experiments.
Software Dependencies | No | The paper mentions using 'proximal policy optimization' and 'OpenAI Gym' but does not provide specific version numbers for any software dependencies, libraries, or frameworks used in the experiments.
Experiment Setup | Yes | "Our RSM neural networks consist of one hidden layer with 128 ReLU units. For each RL task, we consider the state space X = {x | ||x||₁ ≤ 0.5} and train a control policy comprised of two hidden layers with 128 ReLU units each by using proximal policy optimization (Schulman et al. 2017), while applying our Lipschitz regularization to keep the Lipschitz constant of the policy within a reasonable bound. We then run our algorithm to verify that the region Xs = {x | ||x||₁ ≤ 0.2} is a.s. asymptotically stable." Input: dynamics function f, policy π, disturbance distribution d, region Xs ⊆ X, Lipschitz constants Lf, Lπ; parameters τ > 0, N ∈ ℕ, λ > 0
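The quoted network sizes (a one-hidden-layer RSM network and a two-hidden-layer policy, each with 128 ReLU units) can be sketched as plain forward passes. This is an illustrative NumPy sketch with hypothetical random, untrained weights; it is not the authors' implementation, and the sampled states only loosely stand in for the state-space region described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Elementwise ReLU activation
    return np.maximum(x, 0.0)

def mlp_forward(x, layers):
    # Forward pass through a ReLU MLP; `layers` is a list of (W, b)
    # pairs, with ReLU applied after every layer except the last.
    for W, b in layers[:-1]:
        x = relu(x @ W + b)
    W, b = layers[-1]
    return x @ W + b

def init_layer(n_in, n_out):
    # Hypothetical small random initialization (zero biases)
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)

state_dim = 2   # e.g. the two-dimensional dynamical-system benchmark
action_dim = 1  # assumed scalar control input

# RSM network: one hidden layer of 128 ReLU units, scalar output
rsm = [init_layer(state_dim, 128), init_layer(128, 1)]

# Policy network: two hidden layers of 128 ReLU units each
policy = [init_layer(state_dim, 128), init_layer(128, 128),
          init_layer(128, action_dim)]

# Sample a few example states and run both networks
x = rng.uniform(-0.25, 0.25, size=(4, state_dim))
rsm_values = mlp_forward(x, rsm)        # shape (4, 1)
actions = mlp_forward(x, policy)        # shape (4, 1)
```

In the paper's method, the RSM network's scalar output is the candidate supermartingale value for a state, and the policy network produces the control action; here both are just randomly initialized MLPs of the stated shapes.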