Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Online Adaptive Policy Selection in Time-Varying Systems: No-Regret via Contractive Perturbations
Authors: Yiheng Lin, James A. Preiss, Emile Anand, Yingying Li, Yisong Yue, Adam Wierman
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our numerical experiments show that GAPS can adapt to changing environments more quickly than existing benchmarks. In numerical experiments, we demonstrate that GAPS can adapt faster than an existing follow-the-leader-type baseline in MPC with imperfect disturbance predictions, and outperforms a strong optimal control baseline in a nonlinear system with non-i.i.d. disturbances. |
| Researcher Affiliation | Academia | Yiheng Lin, California Institute of Technology, Pasadena, CA, USA; James A. Preiss, California Institute of Technology, Pasadena, CA, USA; Emile Anand, California Institute of Technology, Pasadena, CA, USA; Yingying Li, University of Illinois Urbana-Champaign, Urbana, IL, USA; Yisong Yue, California Institute of Technology, Pasadena, CA, USA; Adam Wierman, California Institute of Technology, Pasadena, CA, USA |
| Pseudocode | Yes | Algorithm 1 Gradient-based Adaptive Policy Selection (GAPS) |
| Open Source Code | Yes | The source code for all experiments is published at https://www.github.com/jpreiss/adaptive_policy_selection. |
| Open Datasets | No | The paper describes simulated systems and their parameters (e.g., 'scalar system $x_{t+1} = 2x_t + u_t + w_t$ with the cost $f_t(x_t, u_t) = x_t^2$', 'nonlinear inverted pendulum system') but does not specify or provide access information for a publicly available or open dataset. |
| Dataset Splits | No | The paper conducts numerical simulations over a time horizon but does not specify any training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory, or specific computer specifications) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers. |
| Experiment Setup | Yes | Details are deferred to Appendix I due to space limitations. Appendix I also includes a third experiment comparing GAPS to a bandit-based algorithm for selecting the planning horizon in MPC, and a computation time comparison between GAPS and the alternative gradient approximation of [1]. [...] We set T = 400, η = 0.05, B = 20. |
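
The scalar system quoted in the Open Datasets row can be rolled out in a few lines, which shows why no external dataset is needed: the data are generated by simulation. The sketch below is a minimal illustration, not the paper's GAPS algorithm or its experiment code. The linear feedback `u_t = -gain * x_t`, the uniform disturbance model, the zero initial state, and the function name `simulate_scalar_system` are all assumptions made here. The default horizon of 400 borrows the value of T quoted in the Experiment Setup row, although that row does not say which experiment it applies to.

```python
import random


def simulate_scalar_system(gain, horizon=400, noise_scale=1.0, seed=0):
    """Roll out x_{t+1} = 2*x_t + u_t + w_t and return the sum of x_t^2.

    Assumptions (not from the paper): u_t = -gain * x_t, w_t drawn
    uniformly from [-noise_scale, noise_scale], and x_0 = 0.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same rollout
    x = 0.0
    total_cost = 0.0
    for _ in range(horizon):
        total_cost += x * x  # stage cost f_t(x_t, u_t) = x_t^2
        u = -gain * x  # hypothetical fixed linear feedback
        w = rng.uniform(-noise_scale, noise_scale)  # assumed disturbance
        x = 2.0 * x + u + w  # dynamics quoted in the table
    return total_cost


if __name__ == "__main__":
    # gain = 2 cancels the open-loop term 2*x_t, so x_{t+1} = w_t.
    print("gain 2.0:", simulate_scalar_system(2.0))
    # gain = 0 leaves the unstable open-loop system, so the cost blows up.
    print("gain 0.0:", simulate_scalar_system(0.0))
```

With `gain = 2.0` the state equals the previous disturbance, so the total cost is bounded by `horizon * noise_scale**2`. With `gain = 0.0` the state roughly doubles at every step. Fixing the seed makes each rollout repeatable, which matters given that the table reports no hardware or software dependency details.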