Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Reinforcement Learning in Continuous Time and Space: A Stochastic Control Approach

Authors: Haoran Wang, Thaleia Zariphopoulou, Xun Yu Zhou

JMLR 2020 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | The objective of this paper is not to develop any new, efficient RL algorithm (as most existing works do) but rather to propose a theoretical framework, that of stochastic control, for studying RL problems in continuous time and space. The authors carry out a complete analysis of the problem in the linear-quadratic (LQ) setting and deduce that the optimal feedback control distribution for balancing exploitation and exploration is Gaussian.
Researcher Affiliation | Collaboration | Haoran Wang, CAI Data Science and Machine Learning, The Vanguard Group, Inc., Malvern, PA 19355, USA; Thaleia Zariphopoulou, Department of Mathematics and IROM, The University of Texas at Austin, Austin, TX 78712, USA, and Oxford-Man Institute, University of Oxford, Oxford, UK; Xun Yu Zhou, Department of Industrial Engineering and Operations Research and The Data Science Institute, Columbia University, New York, NY 10027, USA
Pseudocode | No | The paper focuses on mathematical derivations and a theoretical framework for continuous-time reinforcement learning. No explicit pseudocode blocks or algorithms are presented in a structured format.
Open Source Code | No | The paper does not provide any statement or link regarding the availability of source code for the methodology described. It refers to follow-up work and other papers by the authors, but not to code for this specific research.
Open Datasets | No | The paper presents a theoretical framework and analytical solutions for reinforcement learning in continuous time and space. It does not involve experimental evaluation on datasets, nor does it mention any datasets or their availability.
Dataset Splits | No | The paper is theoretical and does not involve experimental evaluation on datasets, so no dataset-split information is provided.
Hardware Specification | No | The paper describes a theoretical framework and analytical results for reinforcement learning. It does not include any experimental results that would require hardware specifications.
Software Dependencies | No | The paper is theoretical and focuses on mathematical derivations. It does not mention any specific software dependencies or version numbers.
Experiment Setup | No | The paper is entirely theoretical, developing a stochastic control approach to reinforcement learning. As such, it does not detail any experimental setup, hyperparameters, or training configurations.
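The Gaussian exploration result cited in the Research Type entry can be sketched as follows. This is a hedged reconstruction from general entropy-regularized stochastic control, not an equation quoted from the paper; the symbols $\lambda$ (temperature) and $H(x,u)$ (a Hamiltonian-type value term) are assumed notation. The entropy-regularized optimum takes a Boltzmann/Gibbs form:

```latex
\pi^{*}(u \mid x)
  = \frac{\exp\!\left(\tfrac{1}{\lambda} H(x,u)\right)}
         {\int_{\mathbb{R}} \exp\!\left(\tfrac{1}{\lambda} H(x,u')\right) du'}.
```

When $H(x,\cdot)$ is concave and quadratic in the control $u$, as in the LQ setting, this normalized exponential of a quadratic is exactly a Gaussian density, which is consistent with the paper's stated conclusion that the optimal exploratory feedback control distribution is Gaussian.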