Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Fair Ranking with Noisy Protected Attributes
Authors: Anay Mehrotra, Nisheeth Vishnoi
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we observe that, compared to baselines, our algorithm outputs rankings with higher fairness, and has a similar or better fairness-utility trade-off compared to baselines. ... In this section we evaluate our framework s performance on synthetic and real-world data. |
| Researcher Affiliation | Academia | Anay Mehrotra Yale University Nisheeth K. Vishnoi Yale University |
| Pseudocode | Yes | Our algorithm (Algorithm 1) solves the standard linear programming relaxation of Program (7) to find a solution Rc and then uses a dependent-rounding algorithm by [19] to convert Rc to a ranking. |
| Open Source Code | Yes | Code for our simulations is available at https://github.com/AnayMehrotra/FairRankingWithNoisyAttributes |
| Open Datasets | Yes | We use the Occupations dataset [15]... We consider the chess ranking data [29] |
| Dataset Splits | No | The paper mentions parameters like m=500, n=25, and varies phi, but does not explicitly provide information on train/validation/test splits by percentage, sample counts, or reference predefined splits for reproducibility. It only states "We perform this calibration once and on all occupations and, then, use it for gender-stereotypical occupations" for data usage. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper mentions using a "CNN-based gender-classifier f [59]" but does not specify the version of this classifier or any other software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | Setup. We consider the DCG model of utilities [33] and a relaxation of equal representation constraints: (1) Given an intrinsic value wi 0, for each item i, we set Wij := wi (log (j + 1)) 1 j [n]. (2) Given a parameter ϕ [1, p], we set upper bounds Ukℓ:= ϕ p k for each k [n] and ℓ [p]. In simulations, we set m = 500, n = 25, and vary ϕ from p to 1. ... As a heuristic, we set γk = 1 20 maxℓ [p] q 1 Ukℓ in all simulations. |