Logarithmic Regret for Adversarial Online Control
Authors: Dylan Foster, Max Simchowitz
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our main result is to achieve logarithmic regret for fully adversarial disturbances, provided that costs are known and quadratic. Our algorithm and analysis use a characterization for the optimal offline control law to reduce the online control problem to (delayed) online learning with approximate advantage functions. |
| Researcher Affiliation | Academia | ¹Massachusetts Institute of Technology, ²UC Berkeley. Correspondence to: Dylan Foster <dylanf@mit.edu>. |
| Pseudocode | Yes | Algorithm 1 Riccatitron; Algorithm 2 Online Newton Step (ONS(ε,η,C,Σ)); Algorithm 3 |
| Open Source Code | No | The paper provides neither a link to open-source code for the described methodology nor an explicit statement that code will be released. |
| Open Datasets | No | The paper is theoretical and does not mention or use any datasets. |
| Dataset Splits | No | The paper is theoretical and does not mention any training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | No | The paper does not contain specific experimental setup details, hyperparameters, or training configurations. |