reproducibilityindex.ai

Online Control of Unknown Time-Varying Dynamical Systems

Authors: Edgar Minasyan, Paula Gradu, Max Simchowitz, Elad Hazan

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We study online control of time-varying linear systems with unknown dynamics in the nonstochastic control model. At a high level, we demonstrate that this setting is qualitatively harder than that of either unknown time-invariant or known timevarying dynamics, and complement our negative results with algorithmic upper bounds in regimes where sublinear regret is possible. More speciﬁcally, we study regret bounds with respect to common classes of policies: Disturbance Action (SLS), Disturbance Response (Youla), and linear feedback policies. While these three classes are essentially equivalent for LTI systems, we demonstrate that these equivalences break down for time-varying systems. We prove a lower bound that no algorithm can obtain sublinear regret with respect to the ﬁrst two classes unless a certain measure of system variability also scales sublinearly in the horizon. Furthermore, we show that ofﬂine planning over the state linear feedback policies is NP-hard, suggesting hardness of the online learning problem. On the positive side, we give an efﬁcient algorithm that attains a sublinear regret bound against the class of Disturbance Response policies up to the aforementioned system variability term.
Researcher Affiliation	Collaboration	Edgar Minasyan Princeton University, Google AI Princeton minasyan@princeton.edu Paula Gradu UC Berkeley pgradu@berkeley.edu Max Simchowitz MIT msimchow@mit.edu Elad Hazan Princeton University, Google AI Princeton ehazan@princeton.edu
Pseudocode	Yes	Algorithm 1 Adaptive Estimation Algorithm (ADA-PRED)
Open Source Code	No	No statement or link indicating that source code for the methodology described in the paper is publicly available.
Open Datasets	No	The paper is theoretical and does not mention the use of specific public datasets for training. There is no information about dataset availability.
Dataset Splits	No	The paper is theoretical and does not describe experiments with dataset splits. No specific dataset split information for validation is provided.
Hardware Specification	No	The paper is theoretical and does not report on computational experiments. No specific hardware details are provided.
Software Dependencies	No	The paper is theoretical and does not describe specific software implementations or dependencies with version numbers.
Experiment Setup	No	The paper is theoretical and does not report on computational experiments. No specific experimental setup details (like hyperparameters or training configurations) are provided.