Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs

Authors: Max Simchowitz, Kevin G. Jamieson

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | "This paper establishes that optimistic algorithms attain gap-dependent and non-asymptotic logarithmic regret for episodic MDPs. [...] The key technique in our analysis is a novel clipped regret decomposition which applies to a broad family of recent optimistic algorithms for episodic MDPs."
Researcher Affiliation | Academia | "Max Simchowitz, UC Berkeley, msimchow@berkeley.edu; Kevin Jamieson, University of Washington, jamieson@cs.washington.edu"
Pseudocode | Yes | "StrongEuler is based on carefully selected bonuses from [14], and formally instantiated in Algorithm 1 in Appendix E."
Open Source Code | No | The paper does not mention open-source code for the described methodology, nor does it link to a code repository.
Open Datasets | No | The paper is theoretical, focusing on regret bounds for MDPs; it reports no empirical studies and therefore names no publicly available training datasets.
Dataset Splits | No | The paper is theoretical and describes no experiments that would require training, validation, or test splits.
Hardware Specification | No | The paper is theoretical and reports no empirical experiments, so no hardware specifications are given.
Software Dependencies | No | The paper is theoretical and describes no implementation or empirical experiments that would require versioned software dependencies.
Experiment Setup | No | The paper focuses on mathematical analysis of algorithms and describes no experimental setup, hyperparameters, or system-level training settings.
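The "clipped regret decomposition" quoted in the Research Type row rests on a simple clipping operator, which (up to notation) zeroes out any per-step surplus that falls below a gap-dependent threshold. A minimal sketch, with the threshold values chosen purely for illustration:

```python
def clip(x: float, eps: float) -> float:
    """Clipping operator from gap-dependent analyses:
    pass x through if it meets the threshold eps, else clip it to 0."""
    return x if x >= eps else 0.0

# Surpluses below the threshold contribute nothing to the clipped
# decomposition, which is how small-gap terms are controlled:
print(clip(0.3, 0.5))  # clipped away
print(clip(0.8, 0.5))  # passes through unchanged
```

Note that `clip(x, eps) <= x` always holds, so replacing surpluses by their clipped versions can only tighten an upper bound on regret.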
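The Pseudocode row notes that StrongEuler is built from carefully selected exploration bonuses (those of [14] are Bernstein-style, using empirical variance). As a rough illustration of the generic shape such bonuses take, not the paper's actual construction, here is a Hoeffding-style bonus that shrinks with the visit count of a state-action pair:

```python
import math

def hoeffding_bonus(n_visits: int, horizon: int, delta: float = 0.05) -> float:
    """Illustrative Hoeffding-style bonus b(s,a) ~ H * sqrt(log(1/delta) / n).
    StrongEuler uses sharper, variance-aware bonuses; this only shows the
    generic optimism-inducing shape."""
    if n_visits == 0:
        return float(horizon)  # no data yet: be maximally optimistic
    return min(
        float(horizon),
        horizon * math.sqrt(2.0 * math.log(1.0 / delta) / n_visits),
    )

# The bonus decays as a state-action pair is visited more often:
for n in (0, 10, 1000):
    print(n, round(hoeffding_bonus(n, horizon=5), 3))
```

Adding such a bonus to estimated rewards keeps the computed value function an upper bound on the true optimal value with high probability, which is the optimism property the paper's analysis applies to.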