Domain Adaptation meets Individual Fairness. And they get along.
Authors: Debarghya Mukherjee, Felix Petersen, Mikhail Yurochkin, Yuekai Sun
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We verify our theoretical findings empirically. Our goal is to improve performance under distribution shifts using individual fairness methods. We consider SenSeI [3], Sensitive Subspace Robustness (SenSR) [14], Counterfactual Logit Pairing (CLP) [28], and GLIF [9]. GLIF, similar to domain adaptation methods, requires unlabeled samples from the target. The other methods only utilize the source data, as in the domain generalization scenario. Our theory establishes guarantees on the target domain performance for SenSeI (Section 2.3) and GLIF (Section 2.1). |
| Researcher Affiliation | Collaboration | Debarghya Mukherjee (Princeton University, University of Michigan) mdeb@umich.edu; Felix Petersen (Stanford University, University of Konstanz) mail@felix-petersen.de; Mikhail Yurochkin (IBM Research, MIT-IBM Watson AI Lab) mikhail.yurochkin@ibm.com; Yuekai Sun (University of Michigan) yuekai@umich.edu |
| Pseudocode | No | The paper does not contain any sections explicitly labeled "Pseudocode" or "Algorithm", nor does it present any structured, code-like blocks describing procedures. |
| Open Source Code | No | The paper does not contain an explicit statement about the release of its own source code, nor does it include a link to a code repository for the methodology described. |
| Open Datasets | Yes | We verify our theory on the Bios [7] and the Toxicity [8] datasets: enforcing IF via the methods of Yurochkin et al. [3] and Petersen et al. [9] improves accuracy on the target domain, and DA methods [10, 12] trained with appropriate source and target domains improve IF. |
| Dataset Splits | No | The paper mentions 'n_s labeled samples from the source domain' and 'n_t unlabeled samples from the target domain' in its theoretical sections. For the empirical results, it discusses 'training data' and 'target domain' but does not provide specific numerical percentages or sample counts for training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware specifications (e.g., GPU models, CPU types, or memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies (e.g., programming languages, libraries, or frameworks) used in the experiments. |
| Experiment Setup | No | The paper refers to Appendix F for experimental details, but Appendix F primarily discusses datasets and metrics. It does not provide specific experimental setup details such as concrete hyperparameter values (e.g., learning rate, batch size, number of epochs) or specific optimizer settings. |