Online Adaptive Policy Selection in Time-Varying Systems: No-Regret via Contractive Perturbations
Authors: Yiheng Lin, James A. Preiss, Emile Anand, Yingying Li, Yisong Yue, Adam Wierman
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our numerical experiments show that GAPS can adapt to changing environments more quickly than existing benchmarks. In numerical experiments, we demonstrate that GAPS can adapt faster than an existing follow-the-leader-type baseline in MPC with imperfect disturbance predictions, and outperforms a strong optimal control baseline in a nonlinear system with non-i.i.d. disturbances. |
| Researcher Affiliation | Academia | Yiheng Lin, California Institute of Technology, Pasadena, CA, USA, yihengl@caltech.edu; James A. Preiss, California Institute of Technology, Pasadena, CA, USA, japreiss@caltech.edu; Emile Anand, California Institute of Technology, Pasadena, CA, USA, eanand@caltech.edu; Yingying Li, University of Illinois Urbana-Champaign, Urbana, IL, USA, yl101@illinois.edu; Yisong Yue, California Institute of Technology, Pasadena, CA, USA, yyue@caltech.edu; Adam Wierman, California Institute of Technology, Pasadena, CA, USA, adamw@caltech.edu |
| Pseudocode | Yes | Algorithm 1 Gradient-based Adaptive Policy Selection (GAPS) |
| Open Source Code | Yes | The source code for all experiments is published at https://www.github.com/jpreiss/adaptive_policy_selection. |
| Open Datasets | No | The paper describes simulated systems and their parameters (e.g., 'scalar system $x_{t+1} = 2x_t + u_t + w_t$ with the cost $f_t(x_t, u_t) = x_t^2$', 'nonlinear inverted pendulum system') but does not specify or provide access information for a publicly available or open dataset. |
| Dataset Splits | No | The paper conducts numerical simulations over a time horizon but does not specify any training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory, or specific computer specifications) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers. |
| Experiment Setup | Yes | Details are deferred to Appendix I due to space limitations. Appendix I also includes a third experiment comparing GAPS to a bandit-based algorithm for selecting the planning horizon in MPC, and a computation time comparison between GAPS and the alternative gradient approximation of [1]. [...] We set T = 400, η = 0.05, B = 20. |
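
To make the quoted scalar system concrete, the following is a minimal simulation sketch, not the authors' code and not an implementation of GAPS. It rolls out $x_{t+1} = 2x_t + u_t + w_t$ with stage cost $x_t^2$ under a fixed linear feedback $u_t = -k x_t$. The function name `simulate_scalar_system`, the gain `k`, the uniform disturbance model, and the zero initial state are all assumptions introduced for illustration. The default horizon of 400 is borrowed from the quoted setup, which may refer to a different experiment.

```python
import random


def simulate_scalar_system(gain, horizon=400, noise_scale=1.0, seed=0):
    """Roll out x_{t+1} = 2*x_t + u_t + w_t and return the total cost sum_t x_t^2.

    Assumptions (not from the paper): u_t = -gain * x_t, w_t drawn uniformly
    from [-noise_scale, noise_scale], and initial state x_0 = 0.
    """
    rng = random.Random(seed)  # local generator keeps runs reproducible
    x = 0.0
    total_cost = 0.0
    for _ in range(horizon):
        total_cost += x * x                          # stage cost f_t(x_t, u_t) = x_t^2
        u = -gain * x                                # hypothetical linear feedback
        w = rng.uniform(-noise_scale, noise_scale)   # assumed disturbance model
        x = 2.0 * x + u + w                          # system dynamics from the quoted text
    return total_cost


if __name__ == "__main__":
    # gain = 2 cancels the open-loop dynamics, so x_{t+1} = w_t.
    for k in (0.0, 1.5, 2.0):
        print(f"gain={k}: total cost = {simulate_scalar_system(k):.3e}")
```

With `gain = 0` the open-loop system is unstable (the state roughly doubles each step), which is why some form of feedback is needed before any cost comparison between controllers is meaningful.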