Online Adaptive Policy Selection in Time-Varying Systems: No-Regret via Contractive Perturbations

Authors: Yiheng Lin, James A. Preiss, Emile Anand, Yingying Li, Yisong Yue, Adam Wierman

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our numerical experiments show that GAPS can adapt to changing environments more quickly than existing benchmarks. In numerical experiments, we demonstrate that GAPS can adapt faster than an existing follow-the-leader-type baseline in MPC with imperfect disturbance predictions, and outperforms a strong optimal control baseline in a nonlinear system with non-i.i.d. disturbances.
Researcher Affiliation | Academia | Yiheng Lin, California Institute of Technology, Pasadena, CA, USA, yihengl@caltech.edu; James A. Preiss, California Institute of Technology, Pasadena, CA, USA, japreiss@caltech.edu; Emile Anand, California Institute of Technology, Pasadena, CA, USA, eanand@caltech.edu; Yingying Li, University of Illinois Urbana-Champaign, Urbana, IL, USA, yl101@illinois.edu; Yisong Yue, California Institute of Technology, Pasadena, CA, USA, yyue@caltech.edu; Adam Wierman, California Institute of Technology, Pasadena, CA, USA, adamw@caltech.edu
Pseudocode | Yes | Algorithm 1 Gradient-based Adaptive Policy Selection (GAPS)
Open Source Code | Yes | The source code for all experiments is published at https://www.github.com/jpreiss/adaptive_policy_selection.
Open Datasets | No | The paper describes simulated systems and their parameters (e.g., 'scalar system x_{t+1} = 2x_t + u_t + w_t with the cost f_t(x_t, u_t) = x_t^2', 'nonlinear inverted pendulum system') but does not specify or provide access information for a publicly available or open dataset.
Dataset Splits | No | The paper conducts numerical simulations over a time horizon but does not specify any training, validation, or test dataset splits.
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory, or specific computer specifications) used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers.
Experiment Setup | Yes | Details are deferred to Appendix I due to space limitations. Appendix I also includes a third experiment comparing GAPS to a bandit-based algorithm for selecting the planning horizon in MPC, and a computation time comparison between GAPS and the alternative gradient approximation of [1]. [...] We set T = 400, η = 0.05, B = 20.
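
To make the quoted setup concrete, the following is a minimal sketch of online gradient-based policy tuning on the scalar system x_{t+1} = 2x_t + u_t + w_t with cost x_t^2, using the quoted values T = 400, η = 0.05, B = 20. It is not the authors' implementation (that is in the repository linked above). The policy class u_t = -θ x_t, the projection interval [1.5, 2.5], the uniform disturbances, the function name `simulate`, and the reading of B as the length of a gradient buffer are all assumptions made for illustration.

```python
import random


def simulate(T=400, eta=0.05, B=20, theta0=1.5, lo=1.5, hi=2.5, seed=0):
    """Tune theta online on x_{t+1} = 2 x_t + u_t + w_t with cost x_t^2.

    Hypothetical policy class: u_t = -theta_t * x_t (an assumption).
    The gradient is a chain rule truncated to the last B steps of the
    realized trajectory, followed by projection onto [lo, hi].
    """
    rng = random.Random(seed)
    theta = theta0
    x = 0.0
    dx_dtheta = []  # buffer of d x_{s+1} / d theta_s = -x_s
    dx_dx = []      # buffer of d x_{s+1} / d x_s = 2 - theta_s
    total_cost = 0.0
    thetas = []

    for _ in range(T):
        total_cost += x * x  # stage cost f_t = x_t^2

        # Truncated chain rule: sum the effect of each of the last B
        # parameter choices on the current state, newest first.
        grad = 0.0
        prod = 1.0
        for b in range(len(dx_dtheta) - 1, -1, -1):
            grad += 2.0 * x * prod * dx_dtheta[b]
            prod *= dx_dx[b]

        # Projected gradient step keeps |2 - theta| <= 0.5 (stable loop).
        theta = min(hi, max(lo, theta - eta * grad))
        thetas.append(theta)

        # Apply the control and record this step's partial derivatives.
        u = -theta * x
        w = rng.uniform(-1.0, 1.0)  # assumed disturbance model
        dx_dtheta.append(-x)
        dx_dx.append(2.0 - theta)
        if len(dx_dtheta) > B:
            dx_dtheta.pop(0)
            dx_dx.pop(0)

        x = 2.0 * x + u + w

    return total_cost, thetas


if __name__ == "__main__":
    cost, thetas = simulate()
    print(f"total cost: {cost:.3f}, final theta: {thetas[-1]:.3f}")
```

In this toy setting the closed loop is x_{t+1} = (2 - θ_t) x_t + w_t, so θ = 2 cancels the open-loop dynamics. The projection interval is chosen so that the closed loop stays contractive for every admissible θ, which loosely mirrors the contractive-perturbation idea in the paper's title but is not a reproduction of its experiments.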