Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
MFE: Towards reproducible meta-feature extraction
Authors: Edesio Alcobaça, Felipe Siqueira, Adriano Rivolli, Luís P. F. Garcia, Jefferson T. Oliva, André C. P. L. F. de Carvalho
JMLR 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Table 1 compares the main characteristics of MFE against existing alternatives. The comparison includes the meta-feature groups available, the number of extracted meta-features, and whether their extraction is systematic. As can be seen, MFE supports six more meta-feature groups than these alternatives: relative landmarking (Rivolli et al., 2018), subsampling landmarking (Soares et al., 2001), clustering-based (Pimentel and de Carvalho, 2019), concept (Rivolli et al., 2018), itemset (Song et al., 2012) and complexity (Lorena et al., 2019). Moreover, the MFE packages offer the most extensive set and follow recent frameworks. Both packages have the same measures and produce similar output for each measure and summarization function, allowing cross-platform executions. |
| Researcher Affiliation | Academia | Edesio Alcobaça, Felipe Siqueira, Adriano Rivolli, Luís P. F. Garcia, Jefferson T. Oliva, André C. P. L. F. de Carvalho. Institute of Mathematical and Computer Sciences, University of São Paulo, Av. Trabalhador São-carlense, 400, São Carlos, São Paulo 13560-970, Brazil. The author email domains usp.br, utfpr.edu.br, and unb.br indicate academic institutions. |
| Pseudocode | No | The paper describes the architecture and features of the MFE packages. While it provides a formal definition of meta-feature extraction in Equation 1 and discusses implementation details, it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | In this paper, we propose two Meta-Feature Extractor (MFE) packages, written in both Python and R, to fill this gap. The packages follow recent frameworks for meta-feature extraction, aiming to facilitate the reproducibility of meta-learning experiments. These packages, available in Python (pymfe, https://pypi.org/project/pymfe/) and R (mfe, https://cran.r-project.org/package=mfe), are detailed in this work. The packages are also available on GitHub: https://github.com/ealcobaca/pymfe and https://github.com/rivolli/mfe. |
| Open Datasets | No | The paper discusses meta-learning in the context of general 'data sets' and 'meta-data sets' but does not specify any particular dataset used in experiments within this paper, nor does it provide links or citations to any specific publicly available datasets. |
| Dataset Splits | No | The paper does not describe any experiments performed on specific datasets. Therefore, no information on dataset splits (training, validation, test) is provided. |
| Hardware Specification | No | Acknowledgments: This study was funded by (...) Intel Inc., for providing hardware resource and Dev Cloud, used in part of the experiments. This mentions hardware resources but lacks specific details like GPU/CPU models or memory amounts. |
| Software Dependencies | No | The Python version has widespread and robust open source libraries, such as numpy, sklearn, and scipy. Similarly, the R version uses robust open source libraries, such as stats and utils, but also more advanced libraries like e1071, rpart, infotheo and rrcov. No specific version numbers are provided for these libraries. |
| Experiment Setup | No | The paper describes the design and implementation of the MFE packages, including general functionalities like selecting meta-features and summary functions. However, it does not detail any specific experimental setup, hyperparameters, or training configurations for a meta-learning task. |
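The Pseudocode row notes that the paper defines meta-feature extraction formally (its Equation 1) rather than with algorithm blocks: a characterization measure maps a dataset to a variable-length vector of values, and summarization functions reduce that vector to a fixed-size description. The stdlib-only sketch below illustrates that measure-plus-summarization scheme; the function names (`measure_attr_means`, `summarize`) and the toy dataset are hypothetical and are not taken from the MFE packages' API.

```python
import statistics

# Hypothetical illustration of the measure + summarization scheme the
# paper formalizes: m maps a dataset to one value per attribute, and a
# set of summarization functions reduces that vector to fixed-size
# meta-features. Names and data below are invented for the example.

def measure_attr_means(dataset):
    """m: a simple characterization measure -- the mean of each column."""
    n_cols = len(dataset[0])
    return [statistics.mean(row[j] for row in dataset) for j in range(n_cols)]

def summarize(values, functions):
    """sigma: reduce a variable-length vector to named, fixed-size summaries."""
    return {name: fn(values) for name, fn in functions.items()}

data = [
    [1.0, 10.0, 5.0],
    [2.0, 20.0, 5.0],
    [3.0, 30.0, 5.0],
]

values = measure_attr_means(data)  # one value per attribute: [2.0, 20.0, 5.0]
meta_features = summarize(values, {"mean": statistics.mean,
                                   "sd": statistics.stdev})
print(meta_features)  # {'mean': 9.0, 'sd': 9.643650...}
```

The fixed-size output is what makes meta-features usable as rows of a meta-dataset regardless of how many attributes each base dataset has, which is the design point both pymfe and mfe implement across their meta-feature groups.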