A Unifying View of Optimism in Episodic Reinforcement Learning
Authors: Gergely Neu, Ciara Pike-Burke
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper we provide a general framework for designing, analyzing and implementing such algorithms in the episodic reinforcement learning problem. This framework is built upon Lagrangian duality, and demonstrates that every model-optimistic algorithm that constructs an optimistic MDP has an equivalent representation as a value-optimistic dynamic programming algorithm. ... we show that it is possible to get the best of both worlds by providing a class of algorithms which have a computationally efficient dynamic-programming implementation and also a simple probabilistic analysis. (An illustrative sketch of the value-optimistic dynamic programming this equivalence yields is given after the table.) |
| Researcher Affiliation | Academia | Gergely Neu, Universitat Pompeu Fabra, Barcelona, Spain (gergely.neu@gmail.com); Ciara Pike-Burke, Imperial College London, London, UK (c.pikeburke@gmail.com) |
| Pseudocode | No | The paper describes mathematical equations and algorithms conceptually but does not provide a formal pseudocode block or algorithm listing. |
| Open Source Code | No | The paper does not contain any statement about releasing open-source code or a link to a code repository. |
| Open Datasets | No | The paper is theoretical and does not describe experiments that use datasets; thus, there is no mention of dataset availability for training. |
| Dataset Splits | No | The paper is theoretical and does not describe experiments that use datasets; thus, there is no information about training/validation/test splits. |
| Hardware Specification | No | The paper is theoretical and does not describe any experimental setup or the hardware used. |
| Software Dependencies | No | The paper is theoretical and does not describe any experimental setup or software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe any experimental setup, hyperparameters, or training configurations. |
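
To make the framework's central claim concrete, the following is a minimal sketch of value-optimistic dynamic programming for a finite episodic MDP. This is not the paper's algorithm: the function name `optimistic_value_iteration`, the Hoeffding-style bonus, and all array shapes are illustrative assumptions. The paper's contribution is showing, via Lagrangian duality, that planning in an optimistic model (the model-optimistic view) is equivalent to this kind of bonus-augmented backward induction (the value-optimistic view).

```python
import numpy as np

def optimistic_value_iteration(P_hat, r_hat, bonus, H):
    """Value-optimistic backward induction for an episodic MDP.

    A minimal sketch, assuming tabular arrays (not the paper's exact method):
    optimism enters as a per-(s, a) exploration bonus added to the Bellman
    backup over the empirical model.

    P_hat : (S, A, S) empirical transition probabilities
    r_hat : (S, A)    empirical mean rewards in [0, 1]
    bonus : (S, A)    confidence-width bonuses (e.g. Hoeffding-style)
    H     : horizon (number of steps per episode)
    """
    S, A, _ = P_hat.shape
    V = np.zeros(S)                       # terminal value V_H = 0
    policy = np.zeros((H, S), dtype=int)
    for h in reversed(range(H)):
        # Optimistic Q-values: empirical backup plus exploration bonus.
        Q = r_hat + bonus + P_hat @ V     # shape (S, A)
        Q = np.minimum(Q, H - h)          # clip to the maximum achievable return
        policy[h] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return V, policy

# Tiny usage example on a random 3-state, 2-action MDP.
rng = np.random.default_rng(0)
P_hat = rng.dirichlet(np.ones(3), size=(3, 2))   # transition rows sum to 1
r_hat = rng.uniform(size=(3, 2))
n = rng.integers(1, 50, size=(3, 2))             # hypothetical visit counts
bonus = np.sqrt(1.0 / n)                         # illustrative Hoeffding-style bonus
V0, pi = optimistic_value_iteration(P_hat, r_hat, bonus, H=5)
```

Whenever the bonuses dominate the estimation error of the empirical model, the resulting `V0` upper-bounds the true optimal value, which is the optimism property the paper's regret analyses rely on.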