Hunting for Discriminatory Proxies in Linear Regression Models
Authors: Samuel Yeom, Anupam Datta, Matt Fredrikson
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we present empirical results on two law enforcement datasets that exhibit varying degrees of racial disparity in prediction outcomes, demonstrating that proxies shed useful light on the causes of discriminatory behavior in models. and Finally, in Section 5 we evaluate our algorithm with two real-world predictive policing applications. |
| Researcher Affiliation | Academia | Samuel Yeom, Carnegie Mellon University, syeom@cs.cmu.edu; Anupam Datta, Carnegie Mellon University, danupam@cmu.edu; Matt Fredrikson, Carnegie Mellon University, mfredrik@cs.cmu.edu |
| Pseudocode | No | The paper presents optimization problems (Problem 1 and Problem 2) but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using the 'cvxopt package [2] in Python' and provides its URL (http://cvxopt.org), but this is a third-party tool used by the authors, not a statement that they are releasing their own source code for the methodology presented. |
| Open Datasets | Yes | We ran our proxy detection algorithms on observational data from Chicago's Strategic Subject List (SSL) model [9] and the Communities and Crimes (C&C) dataset [15]. and references [9] 'City of Chicago. Strategic Subject List. https://data.cityofchicago.org/Public-Safety/Strategic-Subject-List/4aki-r3np, 2017.' and [15] 'UCI machine learning repository. https://archive.ics.uci.edu/ml, 2017.' |
| Dataset Splits | No | The paper mentions using datasets for evaluation and training a linear regression model, but it does not provide specific details on train/validation/test splits, percentages, or sample counts needed to reproduce the data partitioning. |
| Hardware Specification | No | The paper states that the algorithms were implemented 'with the cvxopt package [2] in Python' but does not provide any specific hardware details such as GPU or CPU models, memory specifications, or cloud resources used for the experiments. |
| Software Dependencies | No | The paper states, 'We implemented Problems 1 and 2 with the cvxopt package [2] in Python.' While it names the software, it does not specify version numbers for either 'cvxopt' or 'Python', which is required for reproducibility. |
| Experiment Setup | Yes | For example, one proxy consisting of 58 of the 90 input variables achieves an influence of 0.34 when ϵ = 0.85. and The strengths of the proxies for race are given in Table 1. The estimated influence was computed as $(c^T \alpha)^2 / \mathrm{Var}(\hat{Y})$. |
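
The estimated-influence expression quoted in the Experiment Setup row can be evaluated directly once its ingredients are available. The following is a minimal sketch, not the authors' implementation: the excerpt does not define `c` or `α`, so they are treated here as plain numeric vectors of equal length, the function name `estimated_influence` is hypothetical, and the use of the population variance for Var(Ŷ) is an assumption.

```python
from statistics import pvariance


def estimated_influence(c, alpha, y_hat):
    """Evaluate (c^T alpha)^2 / Var(y_hat) for plain Python sequences.

    c, alpha: equal-length numeric vectors (their meaning is not defined
              in the excerpt; they are placeholders here).
    y_hat:    model predictions used to estimate Var(Y-hat).
    """
    if len(c) != len(alpha):
        raise ValueError("c and alpha must have the same length")
    # Inner product c^T alpha.
    dot = sum(ci * ai for ci, ai in zip(c, alpha))
    # Population variance of the predictions (assumed estimator).
    var_y = pvariance(y_hat)
    if var_y == 0:
        raise ValueError("predictions have zero variance")
    return dot ** 2 / var_y


# Toy values chosen only for illustration; they are not from the paper.
c = [1.0, 2.0]
alpha = [0.5, 0.25]
y_hat = [0.0, 2.0]
print(estimated_influence(c, alpha, y_hat))
```

Swapping `pvariance` for `statistics.variance` would give the sample-variance version; the excerpt does not say which estimator was used.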