reproducibilityindex.ai

The Gambler's Problem and Beyond

Authors: Baoxiang Wang, Shuai Li, Jiajin Li, Siu On Chan

ICLR 2020

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We analytically investigate a deceptively simple problem, the Gambler's problem, introduced in the reinforcement learning textbook by Sutton & Barto (2018), on Example 4.3, Chapter 4, page 84. The problem setting is natural and simple enough that little discussion was given in the book apart from an algorithmic solution by value iteration. A close inspection will however show that the problem, as a representative of the entire family of Markov decision processes (MDP), involves a level of complexity and curiosity uncharted in years of reinforcement learning research.
Researcher Affiliation	Academia	Baoxiang Wang, Department of Computer Science and Engineering, The Chinese University of Hong Kong, bxwang@cse.cuhk.edu.hk; Shuai Li, John Hopcroft Center for Computer Science, Shanghai Jiao Tong University, shuaili8@sjtu.edu.cn; Jiajin Li, Department of SEEM, The Chinese University of Hong Kong, jjli@se.cuhk.edu.hk; Siu On Chan, Department of Computer Science and Engineering, The Chinese University of Hong Kong, siuon@cse.cuhk.edu.hk
Pseudocode	No	The paper does not contain pseudocode or a clearly labeled algorithm block.
Open Source Code	No	The paper does not provide concrete access to source code for its own methodology. It cites a third-party open-source implementation used for illustrative plots, but that implementation is not the authors' own work.
Open Datasets	No	The paper is theoretical and focuses on mathematical analysis of a problem, not on training models with datasets. Thus, there is no dataset explicitly mentioned as publicly available for training.
Dataset Splits	No	The paper is theoretical and does not involve empirical experiments with datasets. Therefore, there are no training/validation/test splits to specify.
Hardware Specification	No	The paper is theoretical and does not conduct experiments that would require specific hardware. Therefore, no hardware specifications are provided.
Software Dependencies	No	The paper is theoretical and does not conduct experiments that would require specific software dependencies with version numbers. It refers to an 'open source implementation (Zhang, 2019)' for plots, but this is not their own software dependency for conducting their research methodology.
Experiment Setup	No	The paper is theoretical and does not involve empirical experiments. Therefore, no experimental setup details like hyperparameters or system-level training settings are provided.
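
For readers unfamiliar with the setting, the value-iteration solution mentioned in the Research Type row can be sketched as follows. This is a minimal illustration of the textbook formulation (undiscounted, reward 1 on reaching the goal), not code from the paper or from the cited third-party implementation. The function name `gambler_value_iteration` and the parameters `goal`, `p_heads`, `tol`, and `max_sweeps` are hypothetical choices made for this sketch.

```python
def gambler_value_iteration(goal=100, p_heads=0.4, tol=1e-9, max_sweeps=10_000):
    """Approximate the optimal value function of the Gambler's problem.

    Assumed setting (textbook version): capital s in 1..goal-1, a stake a in
    1..min(s, goal - s), the coin lands heads with probability p_heads,
    no discounting, and the only reward is 1 for reaching `goal`.
    """
    # value[s] estimates the probability of reaching `goal` from capital s.
    value = [0.0] * (goal + 1)
    value[goal] = 1.0  # terminal: goal reached; value[0] stays 0 (ruin)

    for _ in range(max_sweeps):
        delta = 0.0
        for s in range(1, goal):
            # Bellman optimality backup over all admissible stakes.
            best = max(
                p_heads * value[s + a] + (1.0 - p_heads) * value[s - a]
                for a in range(1, min(s, goal - s) + 1)
            )
            delta = max(delta, abs(best - value[s]))
            value[s] = best  # in-place update
        if delta < tol:
            break
    return value


if __name__ == "__main__":
    v = gambler_value_iteration()
    print(v[25], v[50], v[75])
```

With `p_heads` below 0.5, staking everything at capital 50 wins with probability `p_heads`, so the value at 50 should come out close to `p_heads` itself, which gives a quick sanity check on the sketch.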