Learning the Linear Quadratic Regulator from Nonlinear Observations
Authors: Zakaria Mhammedi, Dylan J. Foster, Max Simchowitz, Dipendra Misra, Wen Sun, Akshay Krishnamurthy, Alexander Rakhlin, John Langford
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our results constitute the first provable sample complexity guarantee for continuous control with an unknown nonlinearity in the system model. To our knowledge, this is the first polynomial-in-dimension sample complexity guarantee for continuous control with an unknown system nonlinearity and general function classes. |
| Researcher Affiliation | Collaboration | Zakaria Mhammedi (ANU and Data61, zak.mhammedi@anu.edu.au); Dylan J. Foster (MIT, dylanf@mit.edu); Max Simchowitz (UC Berkeley, msimchow@berkeley.edu); Dipendra Misra (Microsoft Research NYC, dimisra@microsoft.com); Wen Sun (Microsoft Research NYC, sun.wen@microsoft.com); Akshay Krishnamurthy (Microsoft Research NYC, akshaykr@microsoft.com); Alexander Rakhlin (MIT, rakhlin@mit.edu); John Langford (Microsoft Research NYC, jcl@microsoft.com) |
| Pseudocode | Yes | Algorithm 1: RichID-CE |
| Open Source Code | No | The paper does not contain any explicit statements about releasing code or direct links to a code repository. |
| Open Datasets | No | The paper is theoretical and focuses on sample complexity guarantees for an algorithm. It does not mention the use of specific datasets for training or evaluation. |
| Dataset Splits | No | The paper is theoretical and does not describe experiments involving data splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not discuss any specific hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not discuss specific software dependencies with version numbers required for implementation or experiments. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with specific hyperparameters or system-level training settings. |