Toward a Perspectivist Turn in Ground Truthing for Predictive Computing

Authors: Federico Cabitza, Andrea Campagner, Valerio Basile

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this article, we describe and advocate for a different paradigm, which we call perspectivism: this counters the removal of disagreement and, consequently, the assumption of correctness of traditionally aggregated gold-standard datasets, and proposes the adoption of methods that preserve divergence of opinions and integrate multiple perspectives in the ground truthing process of ML development. Drawing on previous works which inspired it, mainly from the crowdsourcing and multi-rater labeling settings, we survey the state-of-the-art and describe the potential of our proposal.
Researcher Affiliation | Academia | Federico Cabitza (1, 2), Andrea Campagner (2), Valerio Basile (3). (1) Department of Informatics, Systems and Communication, University of Milano-Bicocca, v.le Sarca 336, 20126 Milan, Italy; (2) IRCCS Istituto Ortopedico Galeazzi, Milan, Italy; (3) University of Turin, C.so Svizzera 185, 10149 Turin, Italy. federico.cabitza@unimib.it, a.campagner@campus.unimib.it, valerio.basile@unito.it
Pseudocode | No | The paper includes a 'BPMN (Business Process Model and Notation) diagram' (Figure 1) to illustrate a process, but it does not contain structured pseudocode or explicitly labeled algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement or a link to open-source code for the methodology or framework it describes. Footnote 2 links to an arXiv extended version of the paper, not to source code. The paper is primarily a theoretical survey and does not present a new implementable method with associated code.
Open Datasets | No | The paper is theoretical and reviews existing literature, proposing a conceptual framework. It does not conduct its own experiments or use datasets for training. It references datasets used in other research (e.g., 'the original ImageNet dataset', 'hate speech corpora'), but not for its own experimental purposes.
Dataset Splits | No | This paper is theoretical and proposes a conceptual framework; it does not conduct experiments requiring dataset splits. Therefore, it does not provide train/validation/test dataset splits.
Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware. Therefore, it does not provide details on hardware specifications.
Software Dependencies | No | The paper is theoretical and does not describe any experiments or implementations that would require specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and proposes a conceptual framework and research agenda. It does not describe any experiments and therefore does not provide specific details about an experimental setup or hyperparameters.