Empirical Likelihood for Fair Classification

Authors: Pangpang Liu, Yichuan Zhao

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Simulation studies show that our method exactly covers the target Type I error rate and effectively balances the trade-off between accuracy and fairness. Finally, we conduct data analysis to demonstrate the effectiveness of our method.
Researcher Affiliation | Academia | Pangpang Liu, Mitchell E. Daniels, Jr. School of Business, Purdue University, West Lafayette, IN 47907, USA, liu3364@purdue.edu; Yichuan Zhao, Department of Mathematics and Statistics, Georgia State University, Atlanta, GA 30303, USA, yichuan@gsu.edu
Pseudocode | No | The paper does not include a dedicated pseudocode block or algorithm.
Open Source Code | No | The paper does not make any explicit statement about releasing source code, nor does it link to a code repository.
Open Datasets | Yes | Firstly, we apply our method on the ACS PUMS datasets (Ding et al., 2021), which encompass distribution shifts... Specifically, we use the German credit dataset (Dua & Graff, 2019), which contains 1000 instances of bank account holders and is commonly used for risk assessment prediction.
Dataset Splits | No | The paper states 'We partition the data into a training set (70%) and a test set (30%)' and 'we split the dataset into equal training and test sets', but does not describe a separate validation split.
Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper does not list any specific software dependencies with version numbers.
Experiment Setup | Yes | We generate 2000 binary class labels uniformly at random and assign a 2-dimensional feature vector to each label by drawing samples from two distinct Gaussian distributions: p(x|y = 1) = N([2; 2], [5, 1; 1, 5]) and p(x|y = -1) = N([-2; -2], [10, 1; 1, 3]). We use x′ = [cos(ϕ), -sin(ϕ); sin(ϕ), cos(ϕ)]x as a rotation of the feature vector x, and draw the one-dimensional sensitive attribute s from a Bernoulli distribution, p(s = 1) = p(x′|y = 1)/[p(x′|y = 1) + p(x′|y = -1)]. The value of ϕ controls the correlation between the sensitive attribute and the class labels. We choose ϕ = π/3 and α = 0.05. We partition the data into a training set (70%) and a test set (30%) and fit a logistic model (Appendix B).
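The simulation setup quoted above can be sketched in a few lines of Python. This is a minimal illustration of the data-generating process only, not the authors' code: library choices (numpy, scipy, scikit-learn), the seed, and the use of `LogisticRegression` with default settings are all assumptions; the paper's empirical-likelihood fairness constraint is not implemented here.

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)  # seed is an assumption, not from the paper
n, phi = 2000, np.pi / 3

# 2000 binary class labels drawn uniformly at random (+1 / -1).
y = rng.choice([1, -1], size=n)

# Class-conditional Gaussians for the 2-d feature vectors.
mvn_pos = multivariate_normal([2, 2], [[5, 1], [1, 5]])
mvn_neg = multivariate_normal([-2, -2], [[10, 1], [1, 3]])
X = np.where(
    (y == 1)[:, None],
    mvn_pos.rvs(n, random_state=1),
    mvn_neg.rvs(n, random_state=2),
)

# Rotate the features by phi; the rotation angle controls how strongly
# the sensitive attribute correlates with the class label.
R = np.array([[np.cos(phi), -np.sin(phi)],
              [np.sin(phi),  np.cos(phi)]])
X_rot = X @ R.T

# Sensitive attribute s ~ Bernoulli with
# p(s = 1) = p(x'|y = 1) / [p(x'|y = 1) + p(x'|y = -1)].
p_pos = mvn_pos.pdf(X_rot)
p_neg = mvn_neg.pdf(X_rot)
s = rng.binomial(1, p_pos / (p_pos + p_neg))

# 70% / 30% train-test split and a plain logistic model on the features.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)
print(f"test accuracy: {clf.score(X_te, y_te):.3f}")
```

With the class means at (2, 2) and (-2, -2) the two Gaussians are well separated, so even an unconstrained logistic model classifies accurately; the fairness question arises because the rotated features make s predictable from x.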