Learning convex bounds for linear quadratic control policy synthesis

Authors: Jack Umenberger, Thomas B. Schön

NeurIPS 2018

Reproducibility Variable Result LLM Response
Research Type Experimental Numerical simulations and stabilization of a real-world inverted pendulum are used to demonstrate the approach, with strong performance and robustness properties observed in both.
Researcher Affiliation Academia Jack Umenberger, Department of Information Technology, Uppsala University, Sweden (jack.umenberger@it.uu.se); Thomas B. Schön, Department of Information Technology, Uppsala University, Sweden (thomas.schon@it.uu.se)
Pseudocode Yes Algorithm 1: Optimization of $J^c_M(K)$ via semidefinite programming
Open Source Code No The paper does not provide an explicit statement or link for open-source code.
Open Datasets No To obtain problem data $\mathcal{D}$, each rollout involves simulating (1), with the true parameters, for $T = 6$ time steps, excited by $u_t \sim \mathcal{N}(0, I)$ with $x_0 = 0$. ... To generate training data, the superposition of a non-stabilizing control signal and a sinusoid of random frequency is applied to the rotary arm motor while the pendulum is inverted.
Dataset Splits No The paper describes how data was generated for training and mentions evaluation, but does not specify explicit train/validation/test splits (e.g., percentages or exact sample counts) for reproducibility.
Hardware Specification No The paper mentions 'real-world experiments on a rotary inverted pendulum, on real (i.e. physical) hardware (Quanser QUBE 2)', which describes the physical system being controlled, not the computing hardware (CPU, GPU, memory, etc.) used to run the experiments or train the models.
Software Dependencies No The paper does not provide specific software names with version numbers that would be needed for reproducibility.
Experiment Setup Yes To obtain problem data $\mathcal{D}$, each rollout involves simulating (1), with the true parameters, for $T = 6$ time steps, excited by $u_t \sim \mathcal{N}(0, I)$ with $x_0 = 0$. Note: to facilitate comparison with [17], we too shall assume that $\Pi_{\mathrm{tr}}$ is known. Furthermore, for all experiments $\Theta^c$ will denote a 95% confidence region, as in [17]. ... We applied the worst-case, $\mathcal{H}_2/\mathcal{H}_\infty$, and proposed methods to optimize the LQ cost with $Q = I$ and $R = 1$. To generate bounds $\epsilon_A \geq \|A_{\mathrm{ls}} - A_{\mathrm{tr}}\|_2$ and $\epsilon_B \geq \|B_{\mathrm{ls}} - B_{\mathrm{tr}}\|_2$ for worst-case and $\mathcal{H}_2/\mathcal{H}_\infty$, we sample $\{A_i, B_i\}_{i=1}^{5000}$ from a 95% confidence region of the posterior, using Gibbs sampling, and take $\epsilon_A = \max_i \|A_{\mathrm{ls}} - A_i\|_2$ and $\epsilon_B = \max_i \|B_{\mathrm{ls}} - B_i\|_2$. The proposed method used 100 such samples for synthesis.
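
The rollout and bound-generation steps quoted in the Experiment Setup row can be sketched in Python with NumPy. This is a minimal illustration, not the authors' code (none is released). It assumes that (1) denotes linear dynamics of the form x_{t+1} = A x_t + B u_t + w_t with Gaussian noise w_t of covariance Pi. The function names, the example system matrices, and the random perturbations standing in for Gibbs samples from the posterior are all hypothetical.

```python
import numpy as np


def simulate_rollout(A, B, Pi, T=6, rng=None):
    """Simulate one rollout from x_0 = 0 with inputs u_t ~ N(0, I).

    Assumed dynamics: x_{t+1} = A x_t + B u_t + w_t, with w_t ~ N(0, Pi).
    """
    rng = np.random.default_rng() if rng is None else rng
    n, m = B.shape
    x = np.zeros((T + 1, n))                      # x[0] stays at zero
    u = rng.standard_normal((T, m))               # white-noise excitation
    w = rng.multivariate_normal(np.zeros(n), Pi, size=T)
    for t in range(T):
        x[t + 1] = A @ x[t] + B @ u[t] + w[t]
    return x, u


def least_squares_estimate(rollouts):
    """Ordinary least-squares estimate of (A, B) from a list of (x, u) rollouts."""
    Z = np.vstack([np.hstack([x[:-1], u]) for x, u in rollouts])  # rows [x_t, u_t]
    Y = np.vstack([x[1:] for x, _ in rollouts])                   # rows x_{t+1}
    Theta, *_ = np.linalg.lstsq(Z, Y, rcond=None)                 # Y ~ Z @ Theta
    n = Y.shape[1]
    return Theta[:n].T, Theta[n:].T                               # A_ls, B_ls


def spectral_bounds(A_ls, B_ls, samples):
    """Largest spectral-norm distance from the estimate to any sampled model."""
    eps_A = max(np.linalg.norm(A_ls - A_i, 2) for A_i, _ in samples)
    eps_B = max(np.linalg.norm(B_ls - B_i, 2) for _, B_i in samples)
    return eps_A, eps_B


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical 2-state, 1-input system; NOT the system used in the paper.
    A_tr = np.array([[1.01, 0.01], [0.0, 1.01]])
    B_tr = np.array([[0.0], [1.0]])
    Pi_tr = 0.01 * np.eye(2)
    rollouts = [simulate_rollout(A_tr, B_tr, Pi_tr, T=6, rng=rng) for _ in range(50)]
    A_ls, B_ls = least_squares_estimate(rollouts)
    # Stand-in for posterior samples: the source draws these by Gibbs sampling
    # from a 95% confidence region, which is not reproduced here.
    samples = [
        (A_ls + 0.05 * rng.standard_normal(A_ls.shape),
         B_ls + 0.05 * rng.standard_normal(B_ls.shape))
        for _ in range(5000)
    ]
    eps_A, eps_B = spectral_bounds(A_ls, B_ls, samples)
    print(eps_A, eps_B)
```

Taking the maximum over samples makes the bounds only as tight as the sampled set allows, so the values depend on the number of samples drawn (5000 in the quoted setup).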