Fair Classification with Partial Feedback: An Exploration-Based Data Collection Approach

Authors: Vijay Keswani, Anay Mehrotra, L. Elisa Celis

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Evaluation on real-world datasets shows that this approach consistently boosts the quality of collected outcome data and improves the fraction of true positives for all groups, with only a small reduction in predictive utility."
Researcher Affiliation | Academia | "Duke University; Yale University."
Pseudocode | Yes | "Algorithm 1: Data Collection and Prediction Framework"
Open Source Code | Yes | "Link to code: http://github.com/vijaykeswani/Fair-Classification-with-Partial-Feedback/"
Open Datasets | Yes | "We first evaluate our framework over the new Adult Income dataset, which contains demographic and financial data of around 251k individuals from California (Ding et al., 2021). Next, we evaluate the performance over the German Credit dataset (Hofmann, 1994)."
Dataset Splits | No | The paper states, "The dataset is randomly split into 40 equal parts" for the data-collection iterations, but does not provide train/validation/test split percentages, sample counts, or an explicit fixed validation set.
Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., GPU/CPU models, memory, or cluster configuration).
Software Dependencies | No | The paper mentions "Python's SLSQP program" and a "logistic regression model" but does not provide version numbers for Python, SLSQP, or any other libraries or software dependencies beyond parameter settings.
Experiment Setup | Yes | Parameters: α = 0.15 with α_exploit = 0.075 · t^0.2 and α_explore = α − α_exploit; τ = 0.5; λ = 0; ε = 10^-3. Utility is measured using the revenue_{c1,c2}(·) metric with c1 = 500 and c2 = 200. For the SLSQP function (used to solve the optimization program), we use parameters ftol = 1e-3 and eps = 1e-3.
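The "40 equal parts" split mentioned under Dataset Splits can be reproduced mechanically. Below is a minimal sketch, assuming NumPy; the dataset is stood in for by an index array of roughly the size reported for the new Adult Income dataset (251k rows), and the 40 parts are index blocks, one per iteration. The variable names are illustrative, not from the paper's code.

```python
import numpy as np

# Placeholder dataset size: ~251k rows, as reported for the new Adult
# Income dataset. Real usage would permute the dataframe's row indices.
n = 251_000

rng = np.random.default_rng(0)          # fixed seed for repeatability
indices = rng.permutation(n)            # random shuffle of row indices
parts = np.array_split(indices, 40)     # 40 equal parts (n divides evenly here)
```

Each element of `parts` is then the pool of fresh examples made available to the data-collection loop at one iteration.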
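"Python's SLSQP program" with `ftol = 1e-3` and `eps = 1e-3` corresponds to SciPy's SLSQP solver options. The sketch below shows only how those options are passed to `scipy.optimize.minimize`; the quadratic objective and linear constraint are placeholders, not the paper's fairness-constrained program.

```python
import numpy as np
from scipy.optimize import minimize

def objective(w):
    # Placeholder convex objective; the paper instead optimizes a
    # fairness-constrained classification objective.
    return np.sum((w - 0.5) ** 2)

# Placeholder inequality constraint: sum(w) <= 1 (written as g(w) >= 0).
constraints = [{"type": "ineq", "fun": lambda w: 1.0 - np.sum(w)}]

res = minimize(
    objective,
    x0=np.zeros(3),
    method="SLSQP",
    constraints=constraints,
    options={"ftol": 1e-3, "eps": 1e-3},  # tolerances reported in the paper
)
```

With this toy setup the solver settles near w = (1/3, 1/3, 1/3), the projection of the unconstrained minimum onto the constraint; the paper's actual objective and constraints would replace the placeholders above.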