Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Dual Control for Approximate Bayesian Reinforcement Learning

Authors: Edgar D. Klenske, Philipp Hennig

JMLR 2016 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on simulated systems show that this framework offers a useful approximation to the intractable aspects of Bayesian RL, producing structured exploration strategies that differ from standard RL approaches. We provide simple examples for the use of this framework in (approximate) Gaussian process regression and feedforward neural networks for the control of exploration.
Researcher Affiliation	Academia	Edgar D. Klenske EMAIL Max-Planck-Institute for Intelligent Systems Spemannstraße 38 72076 Tübingen, Germany Philipp Hennig EMAIL Max-Planck-Institute for Intelligent Systems Spemannstraße 38 72076 Tübingen, Germany
Pseudocode	Yes	Figure 2: Flow-chart of the approximate dual control algorithm to show the overall structure. Adapted from Tse and Bar-Shalom (1973). The left cycle is the inner loop, performing the nonlinear optimization.
Open Source Code	No	The paper does not contain any explicit statements or links indicating the availability of open-source code for the described methodology.
Open Datasets	No	The experiments were conducted on simulated systems, with the dynamics and parameters explicitly defined within the paper (e.g., in Sections 6.2 and 6.3), rather than utilizing external, publicly available datasets.
Dataset Splits	No	The paper describes experiments on simulated dynamical systems where the system dynamics and parameters are defined within the text. It does not utilize predefined datasets that would require explicit training, testing, or validation splits.
Hardware Specification	No	The paper does not provide specific details about the hardware used to run the experiments, such as CPU or GPU models, or memory specifications.
Software Dependencies	No	The paper does not provide specific details about software dependencies or their version numbers used in the implementation or experimentation.
Experiment Setup	Yes	The paper provides specific experimental setup details, including system parameters and cost function weightings for each simulated experiment (e.g., 'a = 1 (known), b = 2, p(b) = N(b;1,10), Q = 10^-1, R = 0, W = 1, Λ = 1, T = 2' in Section 6.1). It also specifies parameters for the controllers, such as 'τ = 0.1' for BEB in Section 6.1, and the number and type of features for GP (30 features) and neural network models (4 logistic features).