Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Dynamic Planning and Learning under Recovering Rewards
Authors: David Simchi-Levi, Zeyu Zheng, Feng Zhu
ICML 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | we propose, construct and prove performance guarantees for a class of Purely Periodic Policies. For the online problem when the model parameters are unknown and need to be learned, we design an Upper Confidence Bound (UCB) based policy that approximately has Õ(N√T) regret against the offline benchmark. Our framework and policy design may have the potential to be adapted into other offline planning and online learning applications with non-stationary and recovering rewards. Also, we would also like to conduct experiments to see the practical performance of our policies for various application needs. |
| Researcher Affiliation | Academia | (1) Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Massachusetts, USA; (2) Department of Industrial Engineering and Operations Research, University of California, Berkeley, USA. |
| Pseudocode | Yes | Algorithm 1 Offline Purely Periodic Planning |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments with specific datasets, thus it does not mention whether any datasets are publicly available. |
| Dataset Splits | No | The paper is theoretical and does not conduct empirical experiments with datasets, so it does not discuss dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not report on empirical experiments, therefore no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe implementation details or report on empirical experiments, therefore no specific software dependencies with version numbers are mentioned. |
| Experiment Setup | No | The paper is theoretical and does not describe any empirical experiments or their setup, including hyperparameters or training configurations. |