Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Inverse Reinforcement Learning via Nonparametric Spatio-Temporal Subgoal Modeling

Authors: Adrian Šošić, Elmar Rueckert, Jan Peters, Abdelhak M. Zoubir, Heinz Koeppl

JMLR 2018 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In our experimental study, we compare the proposed approach with common baseline methods on a variety of benchmark tasks and real-world scenarios. The results reveal that our approach performs significantly better than the original BNIRL model and alternative IRL solutions on all considered tasks. Interestingly enough, our algorithm outperforms the baselines even when the expert's true reward structure is dense and the underlying subgoal assumption is violated.
Researcher Affiliation Academia Adrian Šošić (EMAIL) and Abdelhak M. Zoubir (EMAIL), Signal Processing Group, Technische Universität Darmstadt, 64283 Darmstadt, Germany; Elmar Rueckert (EMAIL), Institute for Robotics and Cognitive Systems, Universität zu Lübeck, 23538 Lübeck, Germany; Jan Peters (EMAIL), Autonomous Systems Labs, Technische Universität Darmstadt, 64289 Darmstadt, Germany; Heinz Koeppl (EMAIL), Bioinspired Communication Systems, Technische Universität Darmstadt, 64283 Darmstadt, Germany
Pseudocode No The paper describes methods like Gibbs sampling and conditional probability distributions, but does not present any explicit pseudocode blocks or algorithms with numbered steps.
Open Source Code No The paper states: "Videos of all demonstrated tasks can be found at http://www.spg.tu-darmstadt.de/jmlr2018." This link provides access to videos of tasks, not the source code for the methodology described in the paper.
Open Datasets No The paper mentions generating random MDPs, citing a "BNIRL data set (Michini and How, 2012)" as a reference, and collecting data on a "KUKA lightweight robotic arm." However, it does not provide concrete access information (links, DOIs, repositories) that would make these datasets publicly available for replication.
Dataset Splits No The paper mentions generating "a number of expert trajectories of length 10" for the random MDP scenario and a "manual segmentation of all recorded trajectories" for the robot experiment. However, it does not specify any training/test/validation dataset splits (e.g., percentages, sample counts, or references to standard predefined splits) needed for reproducibility.
Hardware Specification No The paper mentions using a "KUKA lightweight robotic arm" for data collection in the robot experiment. However, it does not specify any hardware details (e.g., CPU, GPU models, memory, or specific computer specifications) used to run the computational experiments or train the models.
Software Dependencies No The paper does not name specific software packages with version numbers. It mentions concepts like a "Gibbs chain" and the "value iteration algorithm" but does not identify the software used to implement them or its version details.
Experiment Setup No The paper discusses various model parameters (e.g., discount factor γ = 0.9, uncertainty coefficient β, self-link parameter ν, constant κ for the score function) but does not provide a comprehensive set of specific hyperparameters or system-level training settings in a clearly labeled section or table that would allow for full reproduction of experiments.