Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
The Impact of Features Used by Algorithms on Perceptions of Fairness
Authors: Andrew Estornell, Tina Zhang, Sanmay Das, Chien-Ju Ho, Brendan Juba, Yevgeniy Vorobeychik
IJCAI 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We investigate perceptions of fairness in the choice of features that algorithms use about individuals in a simulated gigwork employment experiment. |
| Researcher Affiliation | Academia | (1) Washington University in Saint Louis, (2) Amherst College, (3) George Mason University |
| Pseudocode | No | The paper describes the experimental procedures and algorithms in prose within the 'Experimental Design' section, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement about releasing source code for the methodology, nor does it include a link to a code repository. |
| Open Datasets | No | For our experiment, we recruited a total of 1568 participants from Amazon Mechanical Turk, restricting location to be in the United States. |
| Dataset Splits | No | The experiment involves human participants divided into 'selectors' and 'workers', and workers were randomly assigned to treatment groups or split into 'hired' and 'not hired'. However, there is no mention of traditional machine learning dataset splits (e.g., train/validation/test percentages or counts) as no model was actually trained or evaluated on a dataset. |
| Hardware Specification | No | The paper describes a human-subjects experiment and mentions that 'no actual algorithms were developed or deployed', meaning computational hardware for model training or inference is not relevant to the experimental setup. Therefore, no specific hardware specifications are provided. |
| Software Dependencies | No | The paper mentions using 'Amazon Mechanical Turk' for participant recruitment and statistical tests like 'one-sided t-tests' and 'one-sided proportion z-tests' for analysis, but it does not provide specific software names with version numbers for any of these tools or for the experimental setup. |
| Experiment Setup | Yes | We designed a human subjects experiment in which participants were split into two roles: selectors, who are asked to choose which hiring algorithm we should use, and (prospective) workers, who are then hired, or not, via the chosen hiring algorithm. ... This experiment has two treatments on accuracy differences among algorithms: small (5%) and large (10%). |
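The report notes that the paper's analysis relies on one-sided t-tests and one-sided proportion z-tests without naming specific software. As an illustrative sketch only (not the authors' code), a two-sample one-sided proportion z-test can be computed with the Python standard library; the sample counts below are hypothetical:

```python
from statistics import NormalDist

def one_sided_proportion_ztest(count1, n1, count2, n2):
    """Two-sample proportion z-test with H1: p1 > p2.

    Uses the pooled-proportion standard error, the standard
    large-sample formulation of this test.
    """
    p1, p2 = count1 / n1, count2 / n2
    pooled = (count1 + count2) / (n1 + n2)
    se = (pooled * (1 - pooled) * (1 / n1 + 1 / n2)) ** 0.5
    z = (p1 - p2) / se
    # One-sided p-value: P(Z >= z) under the standard normal
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

# Hypothetical example: 80/100 vs. 60/100 hired in two groups
z, p = one_sided_proportion_ztest(80, 100, 60, 100)
```

Libraries such as SciPy or statsmodels provide equivalent tests; the manual version above just makes the computation explicit.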