Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

An Analysis of Model-Based Reinforcement Learning From Abstracted Observations

Authors: Rolf A. N. Starre, Marco Loog, Elena Congeduti, Frans A. Oliehoek

TMLR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable: Research Type
Result: Theoretical
LLM Response: Our theoretical analysis shows that abstraction can introduce a dependence between samples collected online (e.g., in the real world). That means that, without taking this dependence into account, results for MBRL do not directly extend to this setting. Our result shows that we can use concentration inequalities for martingales to overcome this problem. This result makes it possible to extend the guarantees of existing MBRL algorithms to the setting with abstraction. We illustrate this by combining R-MAX, a prototypical MBRL algorithm, with abstraction, thus producing the first performance guarantees for model-based RL from Abstracted Observations: model-based reinforcement learning with an abstract model.

Reproducibility Variable: Researcher Affiliation
Result: Academia
LLM Response: Rolf A. N. Starre (EMAIL), Delft University of Technology; Marco Loog (EMAIL), Radboud University; Elena Congeduti (EMAIL), Delft University of Technology; Frans A. Oliehoek (EMAIL), Delft University of Technology

Reproducibility Variable: Pseudocode
Result: Yes
LLM Response: Algorithm 1 (Procedure: R-MAX from Abstracted Observations), Algorithm 2 (Procedure: MBRLAO), Algorithm 3 (Collect Samples with Simulator)

Reproducibility Variable: Open Source Code
Result: No
LLM Response: The paper does not provide any concrete access to source code (no specific repository link, explicit code-release statement, or code in supplementary materials).

Reproducibility Variable: Open Datasets
Result: No
LLM Response: The paper focuses on theoretical analysis and on providing performance guarantees for an algorithm (R-MAX) in a specific setting. It does not describe or evaluate empirical experiments on a specific dataset; therefore, no open datasets are reported.

Reproducibility Variable: Dataset Splits
Result: No
LLM Response: The paper presents theoretical analysis and algorithm guarantees. It does not describe any empirical experiments involving dataset splits (training, validation, test).

Reproducibility Variable: Hardware Specification
Result: No
LLM Response: The paper provides theoretical analysis and algorithm guarantees rather than empirical results from experiments; it therefore does not specify any hardware.

Reproducibility Variable: Software Dependencies
Result: No
LLM Response: The paper focuses on theoretical analysis and algorithm guarantees, not on experimental implementation details; it therefore does not list any software dependencies with version numbers.

Reproducibility Variable: Experiment Setup
Result: No
LLM Response: The paper presents a theoretical analysis and performance guarantees for a model-based reinforcement learning algorithm. It does not describe any empirical experiment setup, such as hyperparameters or training configurations.
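For context on the Research Type response above: the concentration inequality for martingales it refers to is typically instantiated via the Azuma–Hoeffding inequality. The generic statement below is a standard textbook form, not a bound taken from the paper itself; the paper's specific guarantees for R-MAX from Abstracted Observations may use a different or refined version.

```latex
% Azuma-Hoeffding inequality (standard form):
% let (X_t)_{t=0}^{n} be a martingale with bounded increments
% |X_t - X_{t-1}| \le c_t almost surely. Then for any \varepsilon > 0,
\Pr\bigl( \lvert X_n - X_0 \rvert \ge \varepsilon \bigr)
  \;\le\; 2 \exp\!\left( -\frac{\varepsilon^2}{2 \sum_{t=1}^{n} c_t^2} \right).
```

Bounds of this type apply even when the summands are dependent, provided the increments form a martingale difference sequence, which is why they can replace independence-based (Hoeffding-style) arguments when abstraction introduces dependence between online samples.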