Stability and Multigroup Fairness in Ranking with Uncertain Predictions
Authors: Siddartha Devic, Aleksandra Korolova, David Kempe, Vatsal Sharan
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We run experiments to complement our theoretical results and further investigate the fairness-utility tradeoff. Our results demonstrate that r^UA is far more stable than r_τ^opt in practice, and also achieves higher utility than two baseline ranking functions. We use the US Census data set ACS (Ding et al., 2021) and the student dropout task Enrollment (Martins et al., 2021). Table 1 shows the stability of r^UA and r_τ^opt to noise introduced by neural networks trained with SGD, averaged over multiple runs. In Table 2, we report the utility of r^UA, the uniform ranking r^unif assigning each individual to each rank with equal probability, and Plackett–Luce rankings r^PL (Plackett, 1975; Luce, 1959). (A sampling sketch for both baselines follows the table.) |
| Researcher Affiliation | Academia | ¹Department of Computer Science, University of Southern California; ²Department of Computer Science and School of Public and International Affairs, Princeton University. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks with explicit labels like 'Algorithm' or 'Pseudocode'. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., a specific repository link or an explicit statement of code release) for the methodology described. |
| Open Datasets | Yes | We use the US Census data set ACS (Ding et al., 2021) and the student dropout task Enrollment (Martins et al., 2021). |
| Dataset Splits | No | For computational reasons, we restrict our experiments to a subset of the data for California with parameters survey_year='2018', horizon='1-Year', and survey='person'. These parameters are standard when using ACS for testing algorithmic fairness methods, due to the large amount of available data. (See, e.g., the GitHub repository of Ding et al. (2021).) We are left with 378,817 entries, and use an 80/20 train/test split. In Enrollment, the target is a multiclass variable for whether an individual is an enrolled, graduated, or dropout student. After cleaning the data, we are left with 4,424 entries, on which we use an 80/20 train/test split. The paper specifies train/test splits but does not explicitly mention a validation split. (A folktables loading sketch follows the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (such as GPU/CPU models or memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions training 'three-layer MLP neural networks' with 'SGD' but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We train 30 simple three-layer MLP neural networks on the ACS data set and divide them into 15 pairs. Each pair of networks is initialized with the same (random) weight matrix, then trained separately with SGD. (A sketch of this paired-initialization protocol follows the table.) |
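The Research Type row contrasts the uncertainty-aware ranking r^UA with two baselines: the uniform ranking r^unif and Plackett–Luce rankings r^PL. Below is a minimal Python sketch of how both baselines can be sampled from predicted scores; it is an illustration under assumptions (hypothetical function names and score vector), not the paper's code.

```python
import numpy as np

def uniform_ranking(n, rng):
    """r^unif: every individual is equally likely to occupy every rank."""
    return rng.permutation(n)

def plackett_luce_ranking(scores, rng):
    """r^PL: fill ranks top-down, drawing each remaining individual with
    probability proportional to its (nonnegative) predicted score."""
    scores = np.asarray(scores, dtype=float)
    remaining = list(range(len(scores)))
    ranking = []
    while remaining:
        probs = scores[remaining] / scores[remaining].sum()
        pick = rng.choice(len(remaining), p=probs)
        ranking.append(remaining.pop(pick))
    return ranking  # ranking[k] = individual placed at rank k

rng = np.random.default_rng(0)
print(uniform_ranking(3, rng))                      # e.g., [2 0 1]
print(plackett_luce_ranking([0.9, 0.5, 0.1], rng))  # high scores tend to rank early
```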
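The ACS subset described in the Dataset Splits row can be loaded with the folktables package released by Ding et al. (2021), whose ACSDataSource accepts exactly the quoted parameters. The sketch below is an illustration under assumptions: the prediction task (ACSIncome) and the split seed are placeholders, since the quoted text does not name them.

```python
from folktables import ACSDataSource, ACSIncome  # Ding et al. (2021)
from sklearn.model_selection import train_test_split

# Parameters quoted in the Dataset Splits row.
data_source = ACSDataSource(survey_year='2018', horizon='1-Year',
                            survey='person')
ca_data = data_source.get_data(states=["CA"], download=True)

# Task choice is a placeholder; the paper's target variable may differ.
features, labels, _ = ACSIncome.df_to_numpy(ca_data)

# 80/20 train/test split, as stated; the seed is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=0)
```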
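Finally, a minimal PyTorch sketch of the paired-initialization protocol from the Experiment Setup row: two MLPs share one random initialization and are then trained separately with SGD, so any divergence between their predictions isolates training noise. Layer widths, learning rate, and epoch count are assumptions not reported in the quoted text.

```python
import copy
import torch
import torch.nn as nn

def make_mlp(in_dim, hidden=64, out_dim=2):
    """A simple three-layer MLP, mirroring the architecture named above."""
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim))

def train_sgd(model, loader, epochs=5, lr=1e-2):
    """Train with plain SGD; a shuffling loader gives each twin a
    different batch order, which is the source of training noise."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for X, y in loader:
            opt.zero_grad()
            loss_fn(model(X), y).backward()
            opt.step()
    return model

net_a = make_mlp(in_dim=10)
net_b = copy.deepcopy(net_a)  # identical initial weight matrices
# train_sgd(net_a, loader); train_sgd(net_b, loader)
# ...then compare the rankings induced by net_a and net_b.
```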