Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
mlr3pipelines - Flexible Machine Learning Pipelines in R
Authors: Martin Binder, Florian Pfisterer, Michel Lang, Lennart Schneider, Lars Kotthoff, Bernd Bischl
JMLR 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We present mlr3pipelines, an R framework which can be used to define linear and complex non-linear ML workflows as directed acyclic graphs. The paper describes the design, functionality, and examples of this software framework rather than presenting empirical results from specific machine learning experiments. |
| Researcher Affiliation | Academia | All authors are affiliated with academic institutions: '1 Department of Statistics, LMU Munich, Germany' and '2 Department of Computer Science, University of Wyoming, USA'. The email addresses also confirm academic affiliations (e.g., EMAIL, EMAIL). |
| Pseudocode | No | The paper includes 'Listing 1', which is an example of R code demonstrating how to construct a branching pipeline using mlr3pipelines, but it does not contain any formal pseudocode or algorithm blocks describing a general method or procedure. |
| Open Source Code | Yes | All packages of the mlr3 ecosystem are released under LGPL-3 on GitHub (https://github.com/mlr-org) and on CRAN. |
| Open Datasets | No | The paper describes a software framework and its functionality, but it does not present experiments that use specific datasets. Therefore, no information about open datasets or their access is provided. |
| Dataset Splits | No | The paper describes a software framework and its functionality, but does not present experiments that involve specific datasets or their partitioning. Therefore, no information about dataset splits is provided. |
| Hardware Specification | No | The paper describes a software framework. It does not report on experimental results or specify any hardware used for running experiments. |
| Software Dependencies | Yes | As one of the most popular and widely-used software systems for statistics and ML, R (R Core Team, 2020) has several packages that provide a standardized interface for predictive modeling, such as caret (Kuhn, 2008), tidymodels (Kuhn and Wickham, 2020b), mlr (Bischl et al., 2016), and its successor mlr3 (Lang et al., 2019). |
| Experiment Setup | No | The paper describes a software framework and its capabilities, including hyperparameter tuning, but it does not present any specific experiments with concrete hyperparameter values or training configurations within this paper. |