Model Selection in Batch Policy Optimization
Authors: Jonathan Lee, George Tucker, Ofir Nachum, Bo Dai
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conclude with experiments demonstrating the efficacy of these algorithms. |
| Researcher Affiliation | Collaboration | ¹Department of Computer Science, Stanford University, USA; ²Google Research, Mountain View, USA. |
| Pseudocode | Yes | Algorithm 1 Pessimistic Linear Learner; Algorithm 2 Complexity-Coverage Selection; Algorithm 3 SLOPE Method (hedged sketches in the spirit of Algorithms 1 and 3 appear below the table) |
| Open Source Code | No | The paper does not contain any explicit statement about providing open-source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | No | To complement our primarily theoretical results, we study the utility of the above model selection algorithms in synthetic experiments and empirically compare them... For both the batch dataset and the test set, noise was artificially generated on rewards by sampling from a standard normal distribution N(0, 1). (A sketch of this synthetic generation appears below the table.) |
| Dataset Splits | No | The paper mentions generating a 'batch dataset' and a 'test set' but does not specify a separate validation split or the percentages of data used for training, validation, and testing. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions that random quantities were generated by sampling multivariate normal distributions, but it does not specify any software names with version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific libraries). |
| Experiment Setup | Yes | For the algorithms, penalization terms (i.e., the estimation error) typically depend on constants being chosen sufficiently large to ensure a confidence interval is valid. However, choosing large values in practice can lead to unnecessarily poor convergence. We found that multiplying by C = 0.1 yielded good performance in most settings. (See the penalty sketch below the table.) |
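
The synthetic setup quoted above draws random quantities from multivariate normal distributions and corrupts rewards with N(0, 1) noise. Below is a minimal sketch of such a generator, assuming a linear reward model; `make_synthetic_batch`, `theta_star`, the identity covariance, and the sizes are illustrative choices, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

def make_synthetic_batch(n=1000, d=10):
    # Hypothetical linear-reward setup: theta_star and the identity
    # covariance are illustrative, not the paper's exact values.
    theta_star = rng.normal(size=d)
    Phi = rng.multivariate_normal(np.zeros(d), np.eye(d), size=n)
    r = Phi @ theta_star + rng.standard_normal(n)  # additive N(0, 1) reward noise
    return Phi, r, theta_star
```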
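Algorithm 1 (Pessimistic Linear Learner) appears only as pseudocode in the paper. The sketch below shows one common way such a learner can be realized: a ridge estimate penalized by an elliptical confidence width, scaled by the C = 0.1 constant the paper reports working well. The penalty form sqrt(phi^T Lambda^{-1} phi), the function name, and the ridge parameter `reg` are assumptions rather than the paper's exact construction.

```python
import numpy as np

def pessimistic_linear_value(Phi, r, phi_query, C=0.1, reg=1.0):
    # Ridge-regression value estimate, penalized pessimistically.
    # The elliptical penalty and `reg` are assumed details; C = 0.1 is
    # the scaling the paper reports working well in most settings.
    d = Phi.shape[1]
    Lambda = Phi.T @ Phi + reg * np.eye(d)           # regularized feature covariance
    theta_hat = np.linalg.solve(Lambda, Phi.T @ r)   # ridge estimate
    bonus = np.sqrt(phi_query @ np.linalg.solve(Lambda, phi_query))
    return phi_query @ theta_hat - C * bonus         # pessimistic (lower-bound) value
```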
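Algorithm 3 (SLOPE Method) performs estimator selection via confidence-interval intersection. The following is a hedged sketch of interval-intersection selection in that spirit; the paper's exact widening constants and ordering conventions may differ, and `slope_style_select` is an illustrative name.

```python
import numpy as np

def slope_style_select(estimates, widths):
    # Scan estimators in order of decreasing confidence width, keep a
    # running intersection of their intervals, and return the last index
    # whose interval still intersects all earlier ones.
    lo, hi = -np.inf, np.inf
    selected = 0
    for i, (e, w) in enumerate(zip(estimates, widths)):
        lo, hi = max(lo, e - w), min(hi, e + w)
        if lo > hi:      # intersection became empty; stop
            break
        selected = i
    return selected
```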