Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Toward a Perspectivist Turn in Ground Truthing for Predictive Computing

Authors: Federico Cabitza, Andrea Campagner, Valerio Basile

AAAI 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	In this article, we describe and advocate for a different paradigm, which we call perspectivism: this counters the removal of disagreement and, consequently, the assumption of correctness of traditionally aggregated gold-standard datasets, and proposes the adoption of methods that preserve divergence of opinions and integrate multiple perspectives in the ground truthing process of ML development. Drawing on previous works which inspired it, mainly from the crowdsourcing and multi-rater labeling settings, we survey the state-of-the-art and describe the potential of our proposal
Researcher Affiliation	Academia	Federico Cabitza (1, 2), Andrea Campagner (2), Valerio Basile (3). (1) Department of Informatics, Systems and Communication, University of Milano-Bicocca, v.le Sarca 336, 20126 Milan, Italy; (2) IRCCS Istituto Ortopedico Galeazzi, Milan, Italy; (3) University of Turin, C.so Svizzera 185, 10149 Turin, Italy. EMAIL, EMAIL, EMAIL
Pseudocode	No	The paper includes a 'BPMN (Business Process Model and Notation) diagram' (Figure 1) to illustrate a process, but it does not contain structured pseudocode or explicitly labeled algorithm blocks.
Open Source Code	No	The paper does not provide an explicit statement or a link to open-source code for the methodology or framework it describes. Footnote 2 links to an arXiv extended version of the paper, not to source code. The paper is primarily a theoretical survey and does not present a new implementable method with associated code.
Open Datasets	No	The paper is theoretical and reviews existing literature, proposing a conceptual framework. It does not conduct its own experiments or use datasets for training. It references datasets used in other research (e.g., 'the original ImageNet dataset', 'hate speech corpora'), but not for its own experimental purposes.
Dataset Splits	No	This paper is theoretical and proposes a conceptual framework; it does not conduct experiments requiring dataset splits. Therefore, it does not provide train/validation/test dataset splits.
Hardware Specification	No	The paper is theoretical and does not describe any experiments that would require specific hardware. Therefore, it does not provide details on hardware specifications.
Software Dependencies	No	The paper is theoretical and does not describe any experiments or implementations that would require specific software dependencies with version numbers.
Experiment Setup	No	The paper is theoretical and proposes a conceptual framework and research agenda. It does not describe any experiments and therefore does not provide specific details about an experimental setup or hyperparameters.