Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Reinforcement Learning in Continuous Time and Space: A Stochastic Control Approach
Authors: Haoran Wang, Thaleia Zariphopoulou, Xun Yu Zhou
JMLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | The objective of this paper is not to develop any new, efficient RL algorithm (like most existing works do) but, rather, to propose and provide a theoretical framework, that of stochastic control, for studying RL problems in continuous time and space. We carry out a complete analysis of the problem in the linear quadratic (LQ) setting and deduce that the optimal feedback control distribution for balancing exploitation and exploration is Gaussian. (A minimal sketch of this Gaussian result appears after the table.) |
| Researcher Affiliation | Collaboration | Haoran Wang: CAI Data Science and Machine Learning, The Vanguard Group, Inc., Malvern, PA 19355, USA. Thaleia Zariphopoulou: Department of Mathematics and IROM, The University of Texas at Austin, Austin, TX 78712, USA, and Oxford-Man Institute, University of Oxford, Oxford, UK. Xun Yu Zhou: Department of Industrial Engineering and Operations Research and The Data Science Institute, Columbia University, New York, NY 10027, USA. |
| Pseudocode | No | The paper focuses on mathematical derivations and a theoretical framework for continuous-time reinforcement learning. It presents no pseudocode blocks or structured algorithm listings. |
| Open Source Code | No | The paper provides no explicit statement or link regarding source-code availability for the methodology described. It refers to follow-up work and other papers by the authors, but not to code for this specific research. |
| Open Datasets | No | The paper presents a theoretical framework and analytical solutions for reinforcement learning in continuous time and space. It does not involve experimental evaluation using datasets, nor does it mention any datasets or their availability. |
| Dataset Splits | No | The paper is theoretical and does not involve experimental evaluation on datasets, so no dataset split information is provided. |
| Hardware Specification | No | The paper describes a theoretical framework and analytical results for reinforcement learning. It does not include any experimental results that would require hardware specifications. |
| Software Dependencies | No | The paper is theoretical and focuses on mathematical derivations. It does not mention any specific software dependencies or version numbers. |
| Experiment Setup | No | The paper is entirely theoretical, developing a stochastic control approach to reinforcement learning. As such, it does not detail any experimental setup, hyperparameters, or training configurations. |
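The Research Type row above quotes the paper's central linear-quadratic (LQ) result: under entropy-regularized ("exploratory") stochastic control, the optimal feedback control distribution is Gaussian, with variance governed by the exploration weight. The sketch below is a minimal numerical illustration of the mechanism, not the paper's derivation: it assumes a stand-in concave quadratic `Q(u) = -a*(u - m)**2` for the quadratic-in-control term (the coefficients `a`, `m` and the weight `lam` are illustrative choices, not values from the paper) and checks that the entropy-regularized optimizer, the Gibbs density proportional to `exp(Q(u)/lam)`, is the Gaussian N(m, lam/(2a)).

```python
import numpy as np

# Illustrative (assumed) coefficients: Q(u) = -a*(u - m)**2 stands in for the
# concave quadratic-in-control term of an LQ Hamiltonian; these numbers are
# not taken from the paper.
a, m, lam = 2.0, 0.5, 0.1   # curvature, maximizer, exploration weight (lambda)

# Entropy-regularized problem over densities pi:
#     max_pi  E_pi[Q(u)] + lam * H(pi)
# Its solution is the Gibbs density pi*(u) ~ exp(Q(u)/lam); with Q a concave
# quadratic, this is exactly Gaussian: N(m, lam / (2a)).

u = np.linspace(m - 5.0, m + 5.0, 200_001)   # fine grid over the control space
du = u[1] - u[0]
w = np.exp(-a * (u - m) ** 2 / lam)          # unnormalized Gibbs weights
pi_star = w / (w.sum() * du)                 # normalize to a probability density

mean = (u * pi_star).sum() * du              # grid estimate of the mean
var = ((u - mean) ** 2 * pi_star).sum() * du # grid estimate of the variance

print(f"numerical mean {mean:.6f}  vs  m        = {m}")
print(f"numerical var  {var:.6f}  vs  lam/(2a) = {lam / (2 * a):.6f}")
```

On this grid the estimated mean and variance agree with `m` and `lam/(2a)` to several decimal places. In the paper's LQ analysis the same Gibbs mechanism applies pointwise in time and state, which is why the optimal exploratory policy is Gaussian with a state-feedback mean and a variance set by the exploration weight.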