Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently
Authors: Asaf Cassel, Alon Cohen, Tomer Koren
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We present new efficient algorithms that achieve, perhaps surprisingly, regret that scales only (poly)logarithmically with the number of steps in two scenarios: when only the state transition matrix A is unknown, and when only the stateaction transition matrix B is unknown and the optimal policy satisfies a certain non-degeneracy condition. On the other hand, we give a lower bound that shows that when the latter condition is violated, square root regret is unavoidable. |
| Researcher Affiliation | Collaboration | 1School of Computer Science, Tel Aviv University 2Google Research, Tel Aviv. |
| Pseudocode | Yes | Algorithm 1 |
| Open Source Code | No | The paper does not provide a statement about releasing source code or include a link to a code repository. |
| Open Datasets | No | This is a theoretical paper. No datasets, public or otherwise, are mentioned or used for empirical evaluation. |
| Dataset Splits | No | This is a theoretical paper. No dataset split information (training, validation, or testing) is provided as no empirical experiments are conducted. |
| Hardware Specification | No | This is a theoretical paper. No specific hardware specifications are mentioned as no empirical experiments are reported. |
| Software Dependencies | No | This is a theoretical paper. No specific software dependencies with version numbers are mentioned as no empirical experiments are reported. |
| Experiment Setup | No | This is a theoretical paper. No specific experimental setup details, hyperparameters, or training configurations are provided as no empirical experiments are conducted. |