Logarithmic Regret for Online Control

Authors: Naman Agarwal, Elad Hazan, Karan Singh

NeurIPS 2019

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We presented two algorithms for controlling linear dynamical systems with strongly convex costs with regret that scales poly-logarithmically with time. This improves state-of-the-art known regret bounds that scale as O(√T). It remains open to extend the poly-log regret guarantees to more general systems and loss functions, such as exp-concave losses, or alternatively, show that this is impossible. |
| Researcher Affiliation | Collaboration | Naman Agarwal (1), Elad Hazan (1, 2), Karan Singh (1, 2). (1) Google AI Princeton; (2) Computer Science, Princeton University. namanagarwal@google.com, {ehazan,karans}@princeton.edu |
| Pseudocode | Yes | Algorithm 1: Online Control Algorithm |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the methodology is openly available. |
| Open Datasets | No | The paper is theoretical and does not use or describe a publicly available dataset for experiments. It mentions 'noise w_t is a random variable generated independently at every time step', which is a theoretical assumption. |
| Dataset Splits | No | The paper is theoretical and does not provide specific details regarding dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not specify any software dependencies with version numbers for experimental reproducibility. |
| Experiment Setup | No | The paper is theoretical and does not provide specific experimental setup details, hyperparameters, or training configurations. |