Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Regret Analysis of Multi-task Representation Learning for Linear-Quadratic Adaptive Control
Authors: Bruce D. Lee, Leonardo F. Toso, Thomas T. Zhang, James Anderson, Nikolai Matni
AAAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5, Numerical Validation: We present numerical results to validate our bounds. In particular, we compare the multi-task representation learning approach to adaptive LQR design (Algorithm 1) against the setting where a single system attempts to learn its dynamics by using its local simulation data and computes a CE controller on top of the estimated model. |
| Researcher Affiliation | Academia | (1) Department of Electrical and Systems Engineering, University of Pennsylvania; (2) Department of Electrical Engineering, Columbia University |
| Pseudocode | Yes | Algorithm 1: Shared-Representation Certainty-Equivalent Control with Continual Exploration. Algorithm 2: Least squares, $\mathrm{LS}(\hat{\Phi}, x_{1:t+1}, u_{1:t})$. Algorithm 3: De-bias & Feature Whiten, $\mathrm{DFW}(\hat{\Phi}, x^{(1:H)}_{1:t}, u^{(1:H)}_{1:t}, N)$. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. |
| Open Datasets | No | We generate $H$ system-matrix pairs $(A_\star^{(h)}, B_\star^{(h)})$ by first considering a set of nominal cartpole parameters: $c_p^{(1)} = (0.4, 1.0, 1.0)$, $c_p^{(2)} = (1.6, 1.3, 0.3)$, $c_p^{(3)} = (1.3, 0.7, 0.65)$, $c_p^{(4)} = (0.2, 0.055, 1.36)$, and $c_p^{(5)} = (0.2, 0.47, 1.825)$. We then perturb such parameters with a random scalar within the interval $(0, 0.1)$ to generate different cartpole parameters $c_p^{(h)}$. With the system matrices $(A_\star^{(h)}, B_\star^{(h)})$ in hand, for all $h \in [H]$, we generate the disturbance signal as $w_t^{(h)} \sim \mathcal{N}(0, 0.01 I_{d_X})$. |
| Dataset Splits | No | The paper describes generating data for numerical validation and running experiments for a certain number of timesteps and tasks, but it does not specify explicit training/test/validation dataset splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | Figure 1: Regret of Algorithm 1 with varying number of tasks $H$. We consider $k_{\mathrm{fin}} = 10$ epochs with initial epoch length $\tau_1 = 30$, an exploratory sequence scaling as $\varepsilon_k^2 \propto 1/2^k$, state and controller bounds $x_b = 25$ and $K_b = 15$, and random $\Phi_0$ with $d(\Phi_0, \Phi_\star) \approx 0.99$. ... We set the gravity $g = 1$ and perform the discretization of (9) with step-size 0.25. ... we generate the disturbance signal as $w_t^{(h)} \sim \mathcal{N}(0, 0.01 I_{d_X})$ and set the step-size and number of iterations of Algorithm 3 to 0.25 and $N = 1000$, respectively. |
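
The data-generation recipe quoted in the Open Datasets and Experiment Setup rows can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code (none is released). The quoted text does not say how tasks are assigned to nominal parameter triples, how the perturbation is distributed, or how it is applied, so the sketch assumes that tasks cycle through the five triples and that one Uniform(0, 0.1) scalar is added to every component. The function names and the state dimension of 4 are hypothetical.

```python
import random

# Nominal cartpole parameter triples quoted in the table above.
NOMINAL_PARAMS = [
    (0.4, 1.0, 1.0),
    (1.6, 1.3, 0.3),
    (1.3, 0.7, 0.65),
    (0.2, 0.055, 1.36),
    (0.2, 0.47, 1.825),
]


def sample_task_params(num_tasks, rng):
    """Return one perturbed parameter triple per task.

    Assumptions (not stated in the source): tasks cycle through the
    nominal triples, and a single Uniform(0, 0.1) scalar is added to
    every component of the chosen triple.
    """
    params = []
    for h in range(num_tasks):
        nominal = NOMINAL_PARAMS[h % len(NOMINAL_PARAMS)]
        shift = rng.uniform(0.0, 0.1)
        params.append(tuple(c + shift for c in nominal))
    return params


def sample_disturbance(state_dim, rng, variance=0.01):
    """Draw one disturbance vector with i.i.d. N(0, variance) entries."""
    std = variance ** 0.5  # covariance 0.01 * I means standard deviation 0.1
    return [rng.gauss(0.0, std) for _ in range(state_dim)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
task_params = sample_task_params(num_tasks=10, rng=rng)
w = sample_disturbance(state_dim=4, rng=rng)  # state_dim=4 is an assumption
```

Fixing the seed is a convenience for repeatability here; the quoted text does not report any seed.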