reproducibilityindex.ai

Experiment Planning with Function Approximation

Authors: Aldo Pacchiano, Jonathan Lee, Emma Brunskill

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	In this work we propose two experiment planning strategies compatible with function approximation. The first is an eluder planning and sampling procedure that can recover optimality guarantees depending on the eluder dimension [42] of the reward function class. For the second, we show that a uniform sampler achieves competitive optimality rates in the setting where the number of actions is small. We finalize our results introducing a statistical gap fleshing out the fundamental differences between planning and adaptive learning and provide results for planning with model selection.
Researcher Affiliation	Academia	Aldo Pacchiano, Broad Institute & Boston University, apacchia@broadinstitute.org; Jonathan N. Lee, Stanford University, jnl@stanford.edu; Emma Brunskill, Stanford University, ebrun@cs.stanford.edu
Pseudocode	Yes	Algorithm 1 Eluder Planner and Algorithm 2 Sampler
Open Source Code	No	The paper does not contain any explicit statements or links indicating that source code for the described methodology is publicly available.
Open Datasets	No	The paper is theoretical and does not conduct empirical experiments using specific datasets. While it refers to "m T i.i.d. offline context samples" as part of its theoretical problem definition, it does not provide concrete access information (link, DOI, citation) for any publicly available or open dataset used for empirical training.
Dataset Splits	No	The paper is theoretical and does not describe any empirical experiments. Therefore, it does not provide specific dataset split information (e.g., percentages or sample counts for training, validation, and testing) needed to reproduce data partitioning for empirical evaluation.
Hardware Specification	No	The paper is theoretical and does not conduct empirical experiments. Therefore, it does not provide specific hardware details (like GPU/CPU models or memory amounts) used for running experiments.
Software Dependencies	No	The paper is theoretical and does not conduct empirical experiments. Therefore, it does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate experiments.
Experiment Setup	No	The paper is theoretical and does not conduct empirical experiments. Therefore, it does not contain specific experimental setup details such as hyperparameter values, training configurations, or system-level settings.