Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Regret Analysis of Multi-task Representation Learning for Linear-Quadratic Adaptive Control
Authors: Bruce D. Lee, Leonardo F. Toso, Thomas T. Zhang, James Anderson, Nikolai Matni
AAAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 Numerical Validation: We present numerical results to validate our bounds. In particular, we compare multi-task representation learning approach for the adaptive LQR design (Algorithm 1) over the setting where a single system attempts to learn its dynamics by using its local simulation data and computes a CE controller on top of the estimated model. |
| Researcher Affiliation | Academia | (1) Department of Electrical and Systems Engineering, University of Pennsylvania; (2) Department of Electrical Engineering, Columbia University |
| Pseudocode | Yes | Algorithm 1: Shared-Representation Certainty-Equivalent Control with Continual Exploration; Algorithm 2: Least squares: $\mathrm{LS}(\hat{\Phi}, x_{1:t+1}, u_{1:t})$; Algorithm 3: De-bias & Feature Whiten: $\mathrm{DFW}(\hat{\Phi}, x_{1:t}^{(1:H)}, u_{1:t}^{(1:H)}, N)$ |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. |
| Open Datasets | No | We generate $H$ $(A_\star^{(h)}, B_\star^{(h)})$, by first considering a set of nominal cartpole parameters: $c_p^{(1)} = (0.4, 1.0, 1.0)$, $c_p^{(2)} = (1.6, 1.3, 0.3)$, $c_p^{(3)} = (1.3, 0.7, 0.65)$, $c_p^{(4)} = (0.2, 0.055, 1.36)$, and $c_p^{(5)} = (0.2, 0.47, 1.825)$. We then perturb such parameters with a random scalar within the interval $(0, 0.1)$ to generate different cartpole parameters $c_p^{(h)}$. With the system matrices $(A_\star^{(h)}, B_\star^{(h)})$ in hands, for all $h \in [H]$, we generate the disturbance signal as $w_t^{(h)} \sim \mathcal{N}(0, 0.01 I_{d_X})$. |
| Dataset Splits | No | The paper describes generating data for numerical validation and running experiments for a certain number of timesteps and tasks, but it does not specify explicit training/test/validation dataset splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | Figure 1: Regret of Algorithm 1 with varying number of tasks $H$. We consider $k_{\mathrm{fin}} = 10$ epochs with initial epoch length $\tau_1 = 30$, an exploratory sequence scaling as $\sigma_k^2 \propto 1/\sqrt{2^k}$, state and controller bounds $x_b = 25$, and $K_b = 15$, and random $\hat{\Phi}_0$ with $d(\hat{\Phi}_0, \Phi_\star)$ 0.99. ... We set the gravity $g = 1$ and perform the discretization of (9) with step-size 0.25. ... we generate the disturbance signal as $w_t^{(h)} \sim \mathcal{N}(0, 0.01 I_{d_X})$ and set the step-size and number of iterations of Algorithm 3 as $\eta = 0.25$, and $N = 1000$. |
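
The data-generation recipe quoted in the Open Datasets row (nominal cartpole parameters, a random perturbation in (0, 0.1), and Gaussian disturbances with covariance 0.01 I) can be sketched roughly as follows. This is only an illustration, not the authors' code, which the table reports as unavailable. Several details are assumptions not stated in the quoted text: the perturbation is drawn uniformly, a single scalar is added to all three components of a triple, tasks cycle through the five nominal triples, and the state dimension is 4. The names `perturbed_params` and `sample_disturbance` are hypothetical.

```python
import random

# Nominal cartpole parameter triples quoted in the Open Datasets row.
NOMINAL_PARAMS = [
    (0.4, 1.0, 1.0),
    (1.6, 1.3, 0.3),
    (1.3, 0.7, 0.65),
    (0.2, 0.055, 1.36),
    (0.2, 0.47, 1.825),
]


def perturbed_params(num_tasks, rng):
    """Return one perturbed parameter triple per task.

    Assumptions (not stated in the quoted text): task h cycles through the
    nominal triples, the scalar is drawn uniformly, and the same scalar is
    added to every component of the triple.
    """
    tasks = []
    for h in range(num_tasks):
        nominal = NOMINAL_PARAMS[h % len(NOMINAL_PARAMS)]
        delta = rng.uniform(0.0, 0.1)
        tasks.append(tuple(c + delta for c in nominal))
    return tasks


def sample_disturbance(state_dim, rng, variance=0.01):
    """Draw w ~ N(0, variance * I) as a list of independent Gaussians."""
    std = variance ** 0.5
    return [rng.gauss(0.0, std) for _ in range(state_dim)]


rng = random.Random(0)  # fixed seed, chosen here only for repeatability
params = perturbed_params(num_tasks=10, rng=rng)
w = sample_disturbance(state_dim=4, rng=rng)  # 4 states is an assumption
```

Turning each perturbed triple into system matrices would additionally require the cartpole model and the discretization of equation (9), neither of which is reproduced in this summary, so the sketch stops at the parameter level.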