Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
mlr3pipelines - Flexible Machine Learning Pipelines in R
Authors: Martin Binder, Florian Pfisterer, Michel Lang, Lennart Schneider, Lars Kotthoff, Bernd Bischl
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We present mlr3pipelines, an R framework which can be used to define linear and complex non-linear ML workflows as directed acyclic graphs. The paper describes the design, functionality, and examples of this software framework rather than presenting empirical results from specific machine learning experiments. |
| Researcher Affiliation | Academia | All authors are affiliated with academic institutions: '1 Department of Statistics, LMU Munich, Germany' and '2 Department of Computer Science, University of Wyoming, USA'. The email addresses also confirm academic affiliations (e.g., EMAIL, EMAIL). |
| Pseudocode | No | The paper includes 'Listing 1', which is an example of R code demonstrating how to construct a branching pipeline using mlr3pipelines, but it does not contain any formal pseudocode or algorithm blocks describing a general method or procedure. |
| Open Source Code | Yes | All packages of the mlr3 ecosystem are released under LGPL-3 on GitHub (https://github.com/mlr-org) and on CRAN. |
| Open Datasets | No | The paper describes a software framework and its functionality, but it does not present experiments that use specific datasets. Therefore, no information about open datasets or their access is provided. |
| Dataset Splits | No | The paper describes a software framework and its functionality, but does not present experiments that involve specific datasets or their partitioning. Therefore, no information about dataset splits is provided. |
| Hardware Specification | No | The paper describes a software framework. It does not report on experimental results or specify any hardware used for running experiments. |
| Software Dependencies | Yes | As one of the most popular and widely-used software systems for statistics and ML, R (R Core Team, 2020) has several packages that provide a standardized interface for predictive modeling, such as caret (Kuhn, 2008), tidymodels (Kuhn and Wickham, 2020b), mlr (Bischl et al., 2016), and its successor mlr3 (Lang et al., 2019). |
| Experiment Setup | No | The paper describes a software framework and its capabilities, including hyperparameter tuning, but it does not present any specific experiments with concrete hyperparameter values or training configurations within this paper. |