Provably adaptive reinforcement learning in metric spaces

Authors: Tongyi Cao, Akshay Krishnamurthy

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We provide a refined analysis of the algorithm of Sinclair, Banerjee, and Yu (2019) and show that its regret scales with the zooming dimension of the instance. This parameter, which originates in the bandit literature, captures the size of the subsets of near-optimal actions and is always smaller than the covering dimension used in previous analyses. As such, our results are the first provably adaptive guarantees for reinforcement learning in metric spaces.
Researcher Affiliation | Collaboration | Tongyi Cao, University of Massachusetts Amherst, tcao@cs.umass.edu; Akshay Krishnamurthy, Microsoft Research NYC, akshaykr@microsoft.com
Pseudocode | Yes | Algorithm 1: Adaptive Q-learning (an illustrative sketch of this adaptive-discretization scheme appears after this table).
Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described.
Open Datasets | No | The paper presents a theoretical analysis of an algorithm in a reinforcement learning setting and does not mention the use of any specific, publicly available datasets for training or experimentation.
Dataset Splits | No | The paper presents a theoretical analysis and does not involve experimental validation with dataset splits.
Hardware Specification | No | The paper is theoretical and does not describe any experimental setup, so no hardware specifications are provided.
Software Dependencies | No | The paper is theoretical and does not mention specific software dependencies with version numbers for reproducibility.
Experiment Setup | No | The paper presents a theoretical analysis and does not include details of an experimental setup, such as hyperparameters or system-level training settings.
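For reference, the Pseudocode row refers to Algorithm 1 of Sinclair, Banerjee, and Yu (2019), which this paper re-analyzes rather than modifies. The following is a minimal, hypothetical Python sketch of the adaptive-discretization idea behind that algorithm, on a one-dimensional state and action space in [0, 1]; the class names, learning rate, bonus term, splitting threshold, and toy environment are illustrative assumptions and do not reproduce the authors' exact specification or constants.

```python
import math
import random

# Minimal, hypothetical sketch of adaptive discretization in the spirit of
# Adaptive Q-learning (Sinclair, Banerjee, and Yu, 2019), the algorithm this
# paper re-analyzes. State and action spaces are both [0, 1] under the
# absolute-value metric; the constants, bonus term, and splitting rule below
# are illustrative assumptions, not the authors' exact specification.


class Ball:
    """A dyadic cell of the (state, action) space carrying its own Q estimate."""

    def __init__(self, center, radius, H):
        self.center = center    # (state, action) center of the cell
        self.radius = radius    # half-width of the cell in each coordinate
        self.q = float(H)       # optimistic initialization
        self.n = 0              # visit count
        self.children = []      # filled in once the cell is split


class AdaptiveQAgent:
    def __init__(self, H):
        self.H = H
        # one root cell covering [0, 1] x [0, 1] for every step h
        self.roots = [Ball((0.5, 0.5), 0.5, H) for _ in range(H)]

    def _relevant_leaves(self, ball, s):
        """Leaves of the partition tree whose state slice contains s."""
        if not ball.children:
            return [ball]
        leaves = []
        for child in ball.children:
            if abs(s - child.center[0]) <= child.radius:
                leaves.extend(self._relevant_leaves(child, s))
        return leaves

    def select(self, h, s):
        """Choose the relevant cell with the largest (optimistic) Q value."""
        return max(self._relevant_leaves(self.roots[h], s), key=lambda b: b.q)

    def value(self, h, s):
        """Optimistic value estimate used as the backup target."""
        if h >= self.H:
            return 0.0
        leaves = self._relevant_leaves(self.roots[h], s)
        return min(float(self.H), max(b.q for b in leaves))

    def update(self, ball, reward, next_value):
        """One optimistic Q-learning update on the chosen cell, then maybe split."""
        ball.n += 1
        t, H = ball.n, self.H
        alpha = (H + 1) / (H + t)                        # standard Q-learning rate
        bonus = math.sqrt(H ** 3 / t) + 2 * ball.radius  # confidence + metric slack
        ball.q = (1 - alpha) * ball.q + alpha * (reward + next_value + bonus)
        # refine the partition once a cell has been visited often for its size
        if not ball.children and ball.n >= (1.0 / ball.radius) ** 2:
            self._split(ball)

    def _split(self, ball):
        """Replace a cell by four half-radius children that inherit its estimate."""
        cs, ca = ball.center
        r = ball.radius / 2
        for ds in (-r, r):
            for da in (-r, r):
                child = Ball((cs + ds, ca + da), r, self.H)
                child.q = ball.q
                child.n = ball.n
                ball.children.append(child)


if __name__ == "__main__":
    # Toy episodic run: a 1-D random walk whose reward favors actions near 0.7.
    H, K = 5, 200
    agent = AdaptiveQAgent(H)
    for _ in range(K):
        s = random.random()
        for h in range(H):
            ball = agent.select(h, s)
            a = ball.center[1]                     # play the cell's center action
            reward = max(0.0, 1.0 - abs(a - 0.7))  # hypothetical reward signal
            s_next = min(1.0, max(0.0, s + random.uniform(-0.1, 0.1)))
            agent.update(ball, reward, agent.value(h + 1, s_next))
            s = s_next
```

The design point the sketch tries to convey is that the partition is refined only where the learner actually spends time, which is why, as the paper argues, the regret can be controlled by the zooming dimension of the near-optimal region rather than the covering dimension of the whole space.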