Worst-Case Regret Bounds for Exploration via Randomized Value Functions

Authors: Daniel Russo

NeurIPS 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Theoretical | This paper develops a very different proof strategy and provides a worst-case regret bound for RLSVI applied to tabular finite-horizon MDPs. |
| Researcher Affiliation | Academia | Daniel Russo, Columbia University, djr2174@gsb.columbia.edu |
| Pseudocode | Yes | Algorithm 1: RLSVI for Tabular, Finite-Horizon MDPs |
| Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository. |
| Open Datasets | No | The paper is theoretical and treats tabular finite-horizon MDPs as a problem formulation, rather than using specific publicly available datasets for empirical training. |
| Dataset Splits | No | The paper is theoretical and does not describe experiments with dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware, so no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe software implementations with specific versioned dependencies. |
| Experiment Setup | No | The paper is theoretical and focuses on algorithmic analysis and proofs, so it does not describe an experimental setup with hyperparameters or training configurations. |
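Since the paper provides pseudocode but no released implementation, the flavor of Algorithm 1 (RLSVI for tabular, finite-horizon MDPs) can be conveyed with a short sketch: each episode, the agent perturbs its empirical Bellman backups with Gaussian noise whose scale shrinks as visit counts grow, then acts greedily on the randomized Q-values. This is an illustrative sketch only, not the paper's exact algorithm — the noise scale `beta / sqrt(n + 1)`, the function names, and the toy random MDP are assumptions chosen for demonstration.

```python
import numpy as np

def rlsvi_plan(counts, r_sum, p_counts, H, beta=1.0, rng=None):
    """Randomized value iteration sketch: backward induction over the
    empirical MDP, with Gaussian reward perturbations whose scale
    shrinks with visit counts (assumed noise schedule, not the paper's)."""
    rng = np.random.default_rng() if rng is None else rng
    S, A = counts.shape
    n = np.maximum(counts, 1)
    r_hat = r_sum / n                       # empirical mean rewards, (S, A)
    p_hat = p_counts / n[..., None]         # empirical transition kernel, (S, A, S)
    Q = np.zeros((H + 1, S, A))
    for h in range(H - 1, -1, -1):          # backward induction over the horizon
        V_next = Q[h + 1].max(axis=1)       # greedy value at step h+1, (S,)
        noise = rng.normal(0.0, beta / np.sqrt(counts + 1.0))
        Q[h] = r_hat + noise + p_hat @ V_next
    return Q[:H]

# Toy usage on a random 5-state, 2-action MDP with horizon 4 (illustrative).
S, A, H = 5, 2, 4
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(S), size=(S, A))  # true transitions (unknown to agent)
R = rng.uniform(size=(S, A))                # true mean rewards
counts = np.zeros((S, A))
r_sum = np.zeros((S, A))
p_counts = np.zeros((S, A, S))
for episode in range(20):
    Q = rlsvi_plan(counts, r_sum, p_counts, H, rng=rng)
    s = 0
    for h in range(H):
        a = int(Q[h, s].argmax())           # act greedily w.r.t. randomized Q
        s_next = rng.choice(S, p=P[s, a])
        counts[s, a] += 1
        r_sum[s, a] += R[s, a]
        p_counts[s, a, s_next] += 1
        s = s_next
```

The per-state noise plays the exploration role that optimism bonuses play in UCB-style algorithms: rarely visited state-action pairs receive larger perturbations, so some sampled Q-functions make them look attractive and the agent tries them.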