reproducibilityindex.ai

Logarithmic Regret Bound in Partially Observable Linear Dynamical Systems

Authors: Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, Anima Anandkumar

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	In this paper, we present the first model estimation method with finite-time guarantees in both open and closed-loop system identification. Deploying this estimation method, we propose adaptive control online learning (ADAPTON), an efficient reinforcement learning algorithm that adaptively learns the system dynamics and continuously updates its controller through online learning steps. [...] We show that ADAPTON achieves a regret upper bound of polylog(T), after T time steps of agent-environment interaction. To the best of our knowledge, ADAPTON is the first algorithm that achieves polylog(T) regret in adaptive control of unknown partially observable linear dynamical systems which includes linear quadratic Gaussian (LQG) control.
Researcher Affiliation	Academia	Sahin Lale, Caltech, alale@caltech.edu; Kamyar Azizzadenesheli, Purdue University, kamyar@purdue.edu; Babak Hassibi, Caltech, hassibi@caltech.edu; Anima Anandkumar, Caltech, anima@caltech.edu
Pseudocode	Yes	ADAPTON is illustrated in Figure 1 and the detailed pseudo-code is provided in Appendix C.
Open Source Code	No	The paper does not provide any statement or link regarding the public availability of its source code.
Open Datasets	No	The paper is theoretical and focuses on algorithm design and regret analysis in partially observable linear dynamical systems, rather than empirical evaluation on a specific dataset. Therefore, it does not mention a publicly available dataset for training.
Dataset Splits	No	The paper is theoretical and focuses on algorithm design and regret analysis, not empirical validation on a dataset with splits.
Hardware Specification	No	The paper does not provide any specific hardware details used for running experiments. It focuses on theoretical analysis and algorithm development.
Software Dependencies	No	The paper does not specify any software dependencies with version numbers, focusing on theoretical contributions.
Experiment Setup	No	The paper describes an algorithm and its theoretical properties but does not provide specific experimental setup details like hyperparameter values or training configurations, as it is a theoretical work.