Role of Human-AI Interaction in Selective Prediction

Authors: Elizabeth Bondi, Raphael Koster, Hannah Sheahan, Martin Chadwick, Yoram Bachrach, Taylan Cemgil, Ulrich Paquet, Krishnamurthy Dvijotham
Pages: 5286-5294

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that this is not the case by performing experiments to quantify human-AI interaction in the context of selective prediction.
Researcher Affiliation | Collaboration | ¹Harvard University, ²DeepMind, ³Google Brain
Pseudocode | No | The paper describes a 'Deferral workflow' diagram and mathematical formulations, but it does not contain structured pseudocode or algorithm blocks with labeled steps.
Open Source Code | No | The paper states 'Aggregated data are available at https://github.com/deepmind/HAI_selective_prediction/', but it does not explicitly state that source code for the methodology is provided.
Open Datasets | Yes | The dataset we use in this work is composed of images from camera traps... To alleviate this burden, the Snapshot Serengeti¹ project was set up to allow volunteers to apply rich labels to camera trap images². These labels are publicly available³, and ground truth comes from label consensus from multiple individuals (Swanson et al. 2015). Footnotes 1, 2, and 3 provide direct URLs to the dataset.
Dataset Splits | No | The paper mentions a 'Serengeti validation set' in the context of practice examples for human participants, and refers to 'tuning the deferral model', but it does not provide explicit details about training, validation, or test splits used for its own experimental setup or model development.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running its experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers, such as specific libraries or frameworks.
Experiment Setup | Yes | We therefore design a human participant experiment with all possible combinations of these two details: 1) Neither message (NM), 2) Deferral status only (DO), 3) Prediction only (PO), and 4) Both messages (BM)... In the labeling section, we display 80 model-deferred images (like Fig. 4) under the four SPM conditions (yielding 20 images per condition... The images are randomly allocated across the four communication conditions... the data are analysed in a 2x2x2 within-subject repeated measures ANOVA... recruited 198 participants from Prolific to take part in the experiment.
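
The balanced random allocation described in the experiment setup (80 model-deferred images split evenly across the NM, DO, PO, and BM conditions, 20 per condition) can be sketched as below. This is only an illustration under stated assumptions, not the authors' procedure: the function name `allocate_images`, the `seed` parameter, and the image identifiers are hypothetical, and the summary does not say whether the allocation was redrawn for each participant.

```python
import random
from collections import Counter

# Condition codes as named in the experiment setup:
# Neither message, Deferral status only, Prediction only, Both messages.
CONDITIONS = ["NM", "DO", "PO", "BM"]


def allocate_images(image_ids, conditions=CONDITIONS, seed=None):
    """Randomly split image_ids into equal-sized groups, one per condition.

    Hypothetical helper: returns a dict mapping image id -> condition code.
    """
    if len(image_ids) % len(conditions) != 0:
        raise ValueError("number of images must divide evenly across conditions")
    rng = random.Random(seed)  # local generator so the global state is untouched
    shuffled = list(image_ids)
    rng.shuffle(shuffled)
    per_condition = len(shuffled) // len(conditions)
    allocation = {}
    for i, condition in enumerate(conditions):
        start = i * per_condition
        for image_id in shuffled[start:start + per_condition]:
            allocation[image_id] = condition
    return allocation


# Placeholder identifiers standing in for the 80 model-deferred images.
images = [f"img_{k:02d}" for k in range(80)]
allocation = allocate_images(images, seed=0)
counts = Counter(allocation.values())
print(counts)
```

Shuffling once and slicing guarantees exactly 20 images per condition, whereas drawing a condition independently for each image would only balance the groups on average.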