reproducibilityindex.ai

Hunting for Discriminatory Proxies in Linear Regression Models

Authors: Samuel Yeom, Anupam Datta, Matt Fredrikson

NeurIPS 2018

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	'Finally, we present empirical results on two law enforcement datasets that exhibit varying degrees of racial disparity in prediction outcomes, demonstrating that proxies shed useful light on the causes of discriminatory behavior in models.' and 'Finally, in Section 5 we evaluate our algorithm with two real-world predictive policing applications.'
Researcher Affiliation	Academia	Samuel Yeom Carnegie Mellon University syeom@cs.cmu.edu Anupam Datta Carnegie Mellon University danupam@cmu.edu Matt Fredrikson Carnegie Mellon University mfredrik@cs.cmu.edu
Pseudocode	No	The paper presents optimization problems (Problem 1 and Problem 2) but does not include structured pseudocode or algorithm blocks.
Open Source Code	No	The paper mentions using the 'cvxopt package [2] in Python' and provides its URL (http://cvxopt.org), but this is a third-party tool used by the authors, not a statement that they are releasing their own source code for the methodology presented.
Open Datasets	Yes	'We ran our proxy detection algorithms on observational data from Chicago's Strategic Subject List (SSL) model [9] and the Communities and Crimes (C&C) dataset [15].' and references [9] 'City of Chicago. Strategic Subject List. https://data.cityofchicago.org/Public-Safety/Strategic-Subject-List/4aki-r3np, 2017.' and [15] 'UCI machine learning repository. https://archive.ics.uci.edu/ml, 2017.'
Dataset Splits	No	The paper mentions using datasets for evaluation and training a linear regression model, but it does not provide specific details on train/validation/test splits, percentages, or sample counts needed to reproduce the data partitioning.
Hardware Specification	No	The paper states that the algorithms were implemented 'with the cvxopt package [2] in Python' but does not provide any specific hardware details such as GPU or CPU models, memory specifications, or cloud resources used for the experiments.
Software Dependencies	No	The paper states, 'We implemented Problems 1 and 2 with the cvxopt package [2] in Python.' While it names the software, it does not specify version numbers for either 'cvxopt' or 'Python', which is required for reproducibility.
Experiment Setup	Yes	'For example, one proxy consisting of 58 of the 90 input variables achieves an influence of 0.34 when ϵ = 0.85.' and 'The strengths of the proxies for race are given in Table 1. The estimated influence was computed as (c^T α)^2 / Var(Ŷ).'
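
The estimated-influence formula quoted in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code: the function name, the argument names, and the use of population variance for Var(Ŷ) are assumptions made here for illustration. The vector c and the coefficients α are treated as given inputs, since this page does not define how they are obtained.

```python
from statistics import pvariance


def estimated_influence(c, alpha, y_hat):
    """Return (c^T alpha)^2 / Var(y_hat).

    c, alpha: equal-length sequences of floats (hypothetical inputs;
              their construction is not specified on this page).
    y_hat:    model predictions used to estimate Var(Y-hat).
    Population variance is an assumption; the paper may use another estimator.
    """
    if len(c) != len(alpha):
        raise ValueError("c and alpha must have the same length")
    var_y = pvariance(y_hat)
    if var_y == 0:
        raise ValueError("predictions have zero variance")
    # Inner product c^T alpha, squared, normalized by the prediction variance.
    dot = sum(ci * ai for ci, ai in zip(c, alpha))
    return dot ** 2 / var_y


# Toy numbers chosen only to exercise the arithmetic, not taken from the paper.
print(estimated_influence([1.0, 2.0], [0.5, 0.25], [0.0, 4.0]))
```

With these toy inputs the inner product is 1.0 and the population variance of the predictions is 4.0, so the printed value is 0.25.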