Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Exact Asymptotics for Linear Quadratic Adaptive Control
Authors: Feicheng Wang, Lucas Janson
JMLR 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In simulations on both stable and unstable systems, we find that our asymptotic theory also describes the algorithm's finite-sample behavior remarkably well. Numerical validation of our theory: We apply our algorithm to both a stable and an unstable simulated system to compare our asymptotic expressions to the performance metrics they characterize, and we find quite good agreement, even at very early time steps. 4. Experiments: We verify our algorithm's performance in one stable and one unstable dynamical system. |
| Researcher Affiliation | Academia | Feicheng Wang (EMAIL), Department of Statistics, Harvard University, Cambridge, MA 02138-2901, USA; Lucas Janson (EMAIL), Department of Statistics, Harvard University, Cambridge, MA 02138-2901, USA |
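| Pseudocode | Yes | Algorithm 1: Stepwise Noisy Certainty Equivalent Control. Require: initial state $x_0$, stabilizing control matrix $K_0$, scalars $C_x > 0$, $C_K > \lVert K \rVert$, $\tau^2 > 0$, $\beta \in [1/2, 1)$, and $\alpha > 3/2$ when $\beta = 1/2$. 1: Let $u_0 = K_0 x_0 + \tau w_0$ and $u_1 = K_0 x_1 + \tau w_1$, with $w_0, w_1 \overset{\text{iid}}{\sim} N(0, I_d)$. 2: for $t = 2, 3, \ldots$ do 3: $(\hat{A}_{t-1}, \hat{B}_{t-1}) \in \arg\min_{(A', B')} \sum_{k} \lVert x_{k+1} - A' x_k - B' u_k \rVert_2^2$ (5) and, if stabilizable, plug them into the DARE (Eqs. 3 and 4) to compute $\hat{K}_t$; otherwise set $\hat{K}_t = K_0$. 4: If $\lVert x_t \rVert > C_x \log(t)$ or $\lVert \hat{K}_t \rVert > C_K$, reset $\hat{K}_t = K_0$. $u_t = \hat{K}_t x_t + \eta_t$, $\eta_t = \tau \sqrt{t^{-(1-\beta)} \log^{\alpha}(t)}\, w_t$, $w_t \overset{\text{iid}}{\sim} N(0, I_d)$ (6) |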
| Open Source Code | Yes | Source code for reproducing our results can be found at https://github.com/Feicheng-Wang/LQAC_code. |
| Open Datasets | No | We verify our algorithm's performance in one stable and one unstable dynamical system. We set $A = \begin{bmatrix} 0.8 & 0.1 \\ 0 & 0.8 \end{bmatrix}$ and $B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$, with system noise $\sigma = 1$, injected noise baseline $\tau = 1$, $Q = I_2$, $R = 1$ and initial state $x_0 = [0, 0]^\top$. (This describes a simulated system, not an external public dataset.) |
| Dataset Splits | No | All stable system results are based on 1,000 independent runs of Algorithm 1 for T = 10,000 time steps. (The paper describes simulation runs, not data splits for training, validation, or testing from an external dataset.) |
| Hardware Specification | No | The paper does not provide any specific hardware details for running the experiments. |
| Software Dependencies | No | The paper does not specify any particular software libraries, packages, or solvers with version numbers. |
| Experiment Setup | Yes | I.1 Experiment Setting. I.1.1 Experiment Setting on Stable System. We set $A = \begin{bmatrix} 0.8 & 0.1 \\ 0 & 0.8 \end{bmatrix}$ and $B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$, with system noise $\sigma = 1$, injected noise baseline $\tau = 1$, $Q = I_2$, $R = 1$ and initial state $x_0 = [0, 0]^\top$. As for the algorithmic hyper-parameters, we set the warning threshold for states $x_t$ at $C_x = 1$ (so that $C_{x,t} = \log(t)$), the known stable controller $K_0 = [0, 0]$, and the upper bound of the L2-norm for our controller $\hat{K}_t$ at $C_K = 5$. Note that this is conservative by about a factor of 10, since the true optimal controller in this system is $K \approx [-0.10, -0.48]$. |
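
The stable-system setting and the steps of Algorithm 1 quoted in the table can be put together as a rough simulation. The following is a minimal sketch under stated assumptions, not the authors' implementation (their code is at the repository linked above). The function names are hypothetical. The values β = 1/2 and α = 2 are illustrative choices that satisfy the stated constraints but are not given in the quoted text. The least-squares fit uses all transitions observed so far, and a plain Riccati recursion followed by a closed-loop stability check stands in for the stabilizability test and DARE solver. The horizon is also much shorter than the paper's T = 10,000.

```python
import numpy as np

# Stable-system settings quoted in the table above.
A = np.array([[0.8, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)
K0 = np.zeros((1, 2))  # known stabilizing controller


def solve_dare(A, B, Q, R, iters=500, tol=1e-10):
    """Riccati recursion for the DARE; returns (P, K) with the convention u = K x."""
    P = Q.copy()
    for _ in range(iters):
        G = B.T @ P @ A
        P_next = A.T @ P @ A - G.T @ np.linalg.solve(R + B.T @ P @ B, G) + Q
        if not np.all(np.isfinite(P_next)):
            raise np.linalg.LinAlgError("Riccati recursion diverged")
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K


def certainty_equivalent_gain(A_hat, B_hat):
    """Gain from the estimated model, or K0 if no stabilizing gain is found.

    Assumption: closed-loop stability of the computed gain is used as a
    proxy for the 'if stabilizable' test in step 3.
    """
    try:
        with np.errstate(all="ignore"):
            _, K = solve_dare(A_hat, B_hat, Q, R)
        rho = np.max(np.abs(np.linalg.eigvals(A_hat + B_hat @ K)))
    except np.linalg.LinAlgError:
        return K0
    return K if rho < 1.0 else K0


def run_algorithm1(T=200, tau=1.0, sigma=1.0, beta=0.5, alpha=2.0,
                   Cx=1.0, CK=5.0, seed=0):
    """Simulate one run; beta and alpha are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    n, d = B.shape
    xs = np.zeros((T + 1, n))  # x_0 = [0, 0]
    us = np.zeros((T, d))
    K_t = K0
    for t in range(T):
        if t < 2:
            # Step 1: the first two inputs use K0 with full-size noise.
            K_t, eta = K0, tau * rng.standard_normal(d)
        else:
            # Step 3: least-squares fit of x_{k+1} on (x_k, u_k), k = 0..t-1.
            Z = np.hstack([xs[:t], us[:t]])
            Theta, *_ = np.linalg.lstsq(Z, xs[1:t + 1], rcond=None)
            A_hat, B_hat = Theta[:n].T, Theta[n:].T
            K_t = certainty_equivalent_gain(A_hat, B_hat)
            # Step 4: safety reset when the state or the gain is too large.
            if np.linalg.norm(xs[t]) > Cx * np.log(t) or np.linalg.norm(K_t) > CK:
                K_t = K0
            # Eq. (6): exploration noise with decaying scale.
            scale = tau * np.sqrt(t ** (-(1 - beta)) * np.log(t) ** alpha)
            eta = scale * rng.standard_normal(d)
        us[t] = K_t @ xs[t] + eta
        xs[t + 1] = A @ xs[t] + B @ us[t] + sigma * rng.standard_normal(n)
    return xs, us, K_t
```

Calling `solve_dare(A, B, Q, R)` on the true system gives the benchmark gain against which the learned `K_t` from `run_algorithm1` can be compared. Under the `u = K x` convention used here, both entries of that gain are negative, with magnitudes close to the 0.10 and 0.48 quoted in the table.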