Neural Lyapunov Control for Discrete-Time Systems

Authors: Junlin Wu, Andrew Clark, Yiannis Kantaros, Yevgeniy Vorobeychik

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on four standard benchmarks demonstrate that our approach significantly outperforms state-of-the-art baselines. For example, on the path tracking benchmark, we outperform recent neural Lyapunov control baselines by an order of magnitude in both running time and the size of the region of attraction, and on two of the four benchmarks (cartpole and PVTOL), ours is the first automated approach to return a provably stable controller.
Researcher Affiliation | Academia | Junlin Wu, Computer Science & Engineering, Washington University in St. Louis, St. Louis, MO 63130, junlin.wu@wustl.edu; Andrew Clark, Electrical & Systems Engineering, Washington University in St. Louis, St. Louis, MO 63130, andrewclark@wustl.edu; Yiannis Kantaros, Electrical & Systems Engineering, Washington University in St. Louis, St. Louis, MO 63130, ioannisk@wustl.edu; Yevgeniy Vorobeychik, Computer Science & Engineering, Washington University in St. Louis, St. Louis, MO 63130, yvorobeychik@wustl.edu
Pseudocode | Yes | Algorithm 1 DITL Lyapunov learning algorithm. 1: Input: Dynamical system model f(x, u) and target valid region R(γ) 2: Output: Lyapunov function Vθ and control policy πβ
Open Source Code | Yes | Our code is available at: https://github.com/jlwu002/nlc_discrete.
Open Datasets | Yes | Our evaluation of the proposed DITL approach uses four benchmark control domains: inverted pendulum, path tracking, cartpole, and drone planar vertical takeoff and landing (PVTOL). Details about these domains are provided in the Supplement.
Dataset Splits | No | The paper describes the 'valid region' for stability analysis but does not specify dataset splits (e.g., percentages or counts) for training, validation, and test sets in the typical machine learning context.
Hardware Specification | Yes | For inverted pendulum and path tracking, all comparisons were performed on a machine with AMD Ryzen 9 5900X 12-Core Processor and Linux Ubuntu 20.04.5 LTS OS. All cartpole and PVTOL experiments were run on a machine with a Xeon Gold 6150 CPU (64-bit 18-core x86), Rocky Linux 8.6. UNL and RL training for Path Tracking are the only two cases that make use of GPUs, and were run on NVIDIA GeForce RTX 3090.
Software Dependencies | Yes | LQR solutions (when the system can be stabilized) are obtained through matrix multiplication, while SOS entails solving a semidefinite program (we solve it using YALMIP with MOSEK solver in MATLAB 2022b). ... For DITL verification we used CPLEX version 22.1.0. ... We use the implementation in Stable-Baselines [Raffin, 2020] for PPO training.
Experiment Setup | Yes | For the inverted pendulum domain, we initialize the control policy using the LQR solution (see the Supplement for details). We train Vθ with non-zero bias terms. We set ϵ = 0.1 (< 0.007% of the valid region), and approximate ROA using a grid of 2000 cells along each coordinate using the level set certified with MILP (4). ... The final hyperparameters are in Table 3.
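
The grid-based ROA approximation mentioned in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code: the quadratic function below is a hypothetical stand-in for the learned Vθ, the box bounds and the level value are made-up placeholders, and the level is assumed to have been certified separately (the paper does this with a MILP). Only the idea of counting grid cells inside a sublevel set, with 2000 cells along each coordinate, comes from the text above.

```python
import numpy as np

def roa_area_on_grid(V, level, lower, upper, cells_per_axis=2000):
    """Estimate the area of {x in box : V(x) <= level} for a 2-D state.

    V maps an (n, 2) array of points to an (n,) array of values.
    The estimate counts cell centers that fall inside the sublevel set.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    widths = (upper - lower) / cells_per_axis
    # Cell centers along each coordinate.
    axes = [lower[i] + widths[i] * (np.arange(cells_per_axis) + 0.5)
            for i in range(2)]
    X1, X2 = np.meshgrid(axes[0], axes[1], indexing="ij")
    points = np.stack([X1.ravel(), X2.ravel()], axis=1)
    inside = V(points) <= level
    return inside.sum() * widths[0] * widths[1]

def quadratic_V(points):
    # Hypothetical stand-in for the learned Lyapunov function: V(x) = x^T P x.
    P = np.array([[1.0, 0.0], [0.0, 1.0]])
    return np.einsum("ni,ij,nj->n", points, P, points)

# Hypothetical valid region [-2, 2]^2 and an assumed, already-certified level.
area = roa_area_on_grid(quadratic_V, level=1.0, lower=[-2, -2], upper=[2, 2])
print(f"estimated ROA area: {area:.4f}")
```

With the stand-in function, the sublevel set is the unit disk, so the estimate should be close to π; this gives a quick sanity check before swapping in a trained network for `quadratic_V`.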