Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Inherent Tradeoffs in Learning Fair Representations

Authors: Han Zhao, Geoffrey J. Gordon

JMLR 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we also conduct experiments on real-world datasets to confirm our theoretical findings.
Researcher Affiliation | Academia | Han Zhao EMAIL University of Illinois at Urbana-Champaign; Geoffrey J. Gordon EMAIL Carnegie Mellon University
Pseudocode | Yes | Algorithm 1 (Optimal fair classifier). Input: oracle access to h0 and h1, the Bayes-optimal classifiers over µ0 and µ1. Output: a randomized optimal fair classifier hFair : X × A → Y.
Open Source Code | No | The paper does not explicitly state that source code for the methodology is provided, nor does it include a link to a code repository.
Open Datasets | Yes | The Adult dataset contains 30,162/15,060 training/test instances for income prediction. Each instance in the dataset describes an adult from the 1994 US Census.
Dataset Splits | Yes | The Adult dataset contains 30,162/15,060 training/test instances for income prediction.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments. It only mentions a "Nvidia GPU grant" in the acknowledgments, which is not a hardware specification for the experiments.
Software Dependencies | No | The paper discusses concepts like "cross-entropy loss" and "logistic regression model" and refers to "optimization algorithm" but does not list specific software libraries or their version numbers.
Experiment Setup | Yes | Experimental Protocol: To validate the effect of learning group-invariant representations with adversarial debiasing techniques (Zhang et al., 2018; Madras et al., 2018; Beutel et al., 2017), we perform a controlled experiment by fixing the baseline network architecture to be a three-hidden-layer feed-forward network with ReLU activations. The number of units in each hidden layer is 500, 200, and 100, respectively. The output layer corresponds to a logistic regression model. This baseline without debiasing is denoted NoDebias. For debiasing with adversarial learning techniques, the adversarial discriminator network takes the feature from the last hidden layer as input and connects it to a hidden layer with 50 units, followed by a binary classifier whose goal is to predict the sensitive attribute A. This model is denoted AdvDebias. Compared with NoDebias, the only difference of AdvDebias in terms of objective function is that, besides the cross-entropy loss for target prediction, AdvDebias also contains a classification loss from the adversarial discriminator to predict the sensitive attribute A. In the experiment, all other factors are held fixed between the two methods, including learning rate, optimization algorithm, training epochs, and batch size. To see how the adversarial loss affects the joint error, demographic parity, and accuracy parity, we vary the coefficient ρ for the adversarial loss among 0.1, 1.0, 5.0, and 50.0.
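The AdvDebias setup quoted above can be sketched as a minimal NumPy forward pass. The layer sizes (500/200/100 encoder, 50-unit adversary hidden layer, logistic output) follow the quoted protocol; the input dimension (114), the weight initialization, and the batch construction are assumptions for illustration, and the actual min-max (gradient-reversal) training loop is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce(p, y):
    # Binary cross-entropy, averaged over the batch.
    eps = 1e-12
    return -np.mean(y * np.log(p + eps) + (1.0 - y) * np.log(1.0 - p + eps))

# Encoder: three hidden layers (500, 200, 100) with ReLU, per the protocol.
# The 114-dimensional input is an assumption (one-hot-encoded Adult features).
dims = [114, 500, 200, 100]
Ws = [rng.normal(scale=0.01, size=(m, n)) for m, n in zip(dims[:-1], dims[1:])]

def encode(x):
    h = x
    for W in Ws:
        h = relu(h @ W)
    return h  # last hidden layer: the learned representation

# Heads: logistic regression for the target Y, and an adversary with a
# 50-unit hidden layer predicting the sensitive attribute A from the
# same representation.
w_y = rng.normal(scale=0.01, size=100)
W_a1 = rng.normal(scale=0.01, size=(100, 50))
w_a2 = rng.normal(scale=0.01, size=50)

def adv_debias_loss(x, y, a, rho):
    h = encode(x)
    p_y = sigmoid(h @ w_y)                 # target prediction
    p_a = sigmoid(relu(h @ W_a1) @ w_a2)   # adversary's prediction of A
    # AdvDebias objective: target cross-entropy plus rho times the
    # adversary's classification loss. In training, the encoder would be
    # updated to *hurt* the adversary (min-max step, not shown here).
    return bce(p_y, y) + rho * bce(p_a, a)

# Toy batch (synthetic, for illustration only).
x = rng.normal(size=(8, 114))
y = rng.integers(0, 2, size=8).astype(float)
a = rng.integers(0, 2, size=8).astype(float)
loss = adv_debias_loss(x, y, a, rho=1.0)
```

Since both cross-entropy terms are non-negative, sweeping ρ over the quoted grid (0.1, 1.0, 5.0, 50.0) monotonically reweights the adversarial term relative to the target loss, which is the trade-off the experiment measures.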