Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

mlr3pipelines - Flexible Machine Learning Pipelines in R

Authors: Martin Binder, Florian Pfisterer, Michel Lang, Lennart Schneider, Lars Kotthoff, Bernd Bischl

JMLR 2021 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We present mlr3pipelines, an R framework which can be used to define linear and complex non-linear ML workflows as directed acyclic graphs. The paper describes the design, functionality, and examples of this software framework rather than presenting empirical results from specific machine learning experiments.
Researcher Affiliation	Academia	All authors are affiliated with academic institutions: '1 Department of Statistics, LMU Munich, Germany' and '2 Department of Computer Science, University of Wyoming, USA'. The email addresses also confirm academic affiliations (e.g., EMAIL, EMAIL).
Pseudocode	No	The paper includes 'Listing 1', which is an example of R code demonstrating how to construct a branching pipeline using mlr3pipelines, but it does not contain any formal pseudocode or algorithm blocks describing a general method or procedure.
Open Source Code	Yes	All packages of the mlr3 ecosystem are released under LGPL-3 on GitHub (https://github.com/mlr-org) and on CRAN.
Open Datasets	No	The paper describes a software framework and its functionality, but it does not present experiments that use specific datasets. Therefore, no information about open datasets or their access is provided.
Dataset Splits	No	The paper describes a software framework and its functionality, but does not present experiments that involve specific datasets or their partitioning. Therefore, no information about dataset splits is provided.
Hardware Specification	No	The paper describes a software framework. It does not report on experimental results or specify any hardware used for running experiments.
Software Dependencies	Yes	As one of the most popular and widely-used software systems for statistics and ML, R (R Core Team, 2020) has several packages that provide a standardized interface for predictive modeling, such as caret (Kuhn, 2008), tidymodels (Kuhn and Wickham, 2020b), mlr (Bischl et al., 2016), and its successor mlr3 (Lang et al., 2019).
Experiment Setup	No	The paper describes a software framework and its capabilities, including hyperparameter tuning, but it does not present any specific experiments with concrete hyperparameter values or training configurations within this paper.