Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Reinforcement Learning in Continuous Time and Space: A Stochastic Control Approach

Authors: Haoran Wang, Thaleia Zariphopoulou, Xun Yu Zhou

JMLR 2020 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | The objective of this paper is not to develop any new, efficient RL algorithm (as most existing works do) but rather to propose a theoretical framework, that of stochastic control, for studying RL problems in continuous time and space. The authors carry out a complete analysis of the problem in the linear-quadratic (LQ) setting and deduce that the optimal feedback control distribution for balancing exploitation and exploration is Gaussian.
Researcher Affiliation | Collaboration | Haoran Wang, CAI Data Science and Machine Learning, The Vanguard Group, Inc., Malvern, PA 19355, USA; Thaleia Zariphopoulou, Department of Mathematics and IROM, The University of Texas at Austin, Austin, TX 78712, USA, and Oxford-Man Institute, University of Oxford, Oxford, UK; Xun Yu Zhou, Department of Industrial Engineering and Operations Research and The Data Science Institute, Columbia University, New York, NY 10027, USA
Pseudocode | No | The paper focuses on mathematical derivations and a theoretical framework for continuous-time reinforcement learning. No explicit pseudocode blocks or algorithms are presented in a structured format.
Open Source Code | No | The paper does not provide any statement or link regarding the availability of source code for the methodology described. It refers to follow-up work and other papers by the authors, but not to code for this specific research.
Open Datasets | No | The paper presents a theoretical framework and analytical solutions for reinforcement learning in continuous time and space. It does not involve experimental evaluation on datasets, nor does it mention any datasets or their availability.
Dataset Splits | No | The paper is theoretical and does not involve experimental evaluation on datasets, so no dataset-split information is provided.
Hardware Specification | No | The paper describes a theoretical framework and analytical results for reinforcement learning. It does not include any experimental results that would require hardware specifications.
Software Dependencies | No | The paper is theoretical and focuses on mathematical derivations. It does not mention any specific software dependencies or version numbers.
Experiment Setup | No | The paper is entirely theoretical, developing a stochastic control approach to reinforcement learning. As such, it does not detail any experimental setup, hyperparameters, or training configurations.
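The Gaussian exploration result cited in the Research Type entry can be sketched as follows. This is a hedged reconstruction from general entropy-regularized stochastic control, not an equation quoted from the paper; the symbols $\lambda$ (temperature) and $H(x,u)$ (a Hamiltonian-type value term) are assumed notation. The entropy-regularized optimum takes a Boltzmann/Gibbs form:

```latex
\pi^{*}(u \mid x)
  = \frac{\exp\!\left(\tfrac{1}{\lambda} H(x,u)\right)}
         {\int_{\mathbb{R}} \exp\!\left(\tfrac{1}{\lambda} H(x,u')\right) du'}.
```

When $H(x,\cdot)$ is concave and quadratic in the control $u$, as in the LQ setting, this normalized exponential of a quadratic is exactly a Gaussian density, which is consistent with the paper's stated conclusion that the optimal exploratory feedback control distribution is Gaussian.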