Fair Bayes-Optimal Classifiers Under Predictive Parity

Authors: Xianli Zeng, Edgar Dobriban, Guang Cheng

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide supporting experiments conducted on synthetic and empirical data.
Researcher Affiliation | Academia | Xianli Zeng, NUS (Chongqing) Research Institute, Chongqing, China, zengxl19911214@gmail.com; Edgar Dobriban, University of Pennsylvania, Philadelphia, PA 19104, dobriban@wharton.upenn.edu; Guang Cheng, University of California, Los Angeles, Los Angeles, CA 90095, guangcheng@ucla.edu
Pseudocode | Yes | Algorithm 1: Fair Bayes-DPP
Open Source Code | No | The paper does not provide a specific link or explicit statement about the availability of its source code.
Open Datasets | Yes | We test Fair Bayes-DPP on two benchmark datasets for fair classification: Adult [14] and COMPAS [27]. To further test the performance of our algorithm on a large-scale dataset, we conduct experiments on the CelebFaces Attributes (CelebA) Dataset [32].
Dataset Splits | Yes | For each dataset, we randomly sample (with replacement) 70%, 50% and 30% as the training, validation and test set, respectively. (A sampling sketch consistent with this split follows the table.)
Hardware Specification | No | The paper mentions training models with PyTorch, including a ResNet50, but does not specify particular GPU or CPU models, memory sizes, or other hardware details used for the experiments.
Software Dependencies | No | The paper mentions PyTorch as a software dependency but does not specify its version number or any other software dependencies with their versions.
Experiment Setup | Yes | We set the cost parameter c = 0.5, while P(A = 1) = 0.3 and P(Y = 1|A = 0) = 0.2. In the Gaussian case, the Bayes-optimal classifier is linear in x and thus we employ logistic regression to learn η1(·) and η0(·). We then search over a grid with spacings equal to 0.001 over the range we identified in Section 5 for the empirically optimal thresholds under fairness. The conditional probabilities are learned via a three-layer fully connected neural network architecture with 32 hidden neurons per layer. We refer readers to the Appendix for more training details, including optimizer, learning rates, batch sizes and training epochs.
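
The Dataset Splits row above describes sampling 70%, 50% and 30% of each dataset with replacement for the training, validation and test sets. A minimal sketch of that resampling scheme, assuming the dataset is held in a pandas DataFrame; the file name, column layout and random seeds are illustrative assumptions, not details from the paper:

```python
import pandas as pd

def resample_splits(df: pd.DataFrame, seed: int = 0):
    """Draw train/validation/test sets by sampling rows with replacement.

    The fractions (70%, 50%, 30%) follow the split described in the paper;
    because sampling is with replacement, the three sets may overlap and
    their sizes need not sum to the dataset size.
    """
    train = df.sample(frac=0.7, replace=True, random_state=seed)
    val = df.sample(frac=0.5, replace=True, random_state=seed + 1)
    test = df.sample(frac=0.3, replace=True, random_state=seed + 2)
    return train, val, test

# Hypothetical usage with a pre-processed Adult dataset:
# adult = pd.read_csv("adult_preprocessed.csv")
# train_df, val_df, test_df = resample_splits(adult)
```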
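
The Experiment Setup row quotes a three-layer fully connected network with 32 hidden neurons per layer for estimating the conditional probabilities, and a threshold grid with spacing 0.001. A minimal PyTorch sketch consistent with that description, assuming tabular inputs; the activation choice, the [0, 1] grid range, and the thresholding helper are illustrative assumptions, not the paper's exact Algorithm 1 (the paper restricts the grid to a range derived in its Section 5):

```python
import torch
import torch.nn as nn

class CondProbNet(nn.Module):
    """Three-layer fully connected network with 32 hidden units per layer,
    producing an estimate of eta_a(x) = P(Y = 1 | A = a, X = x)."""

    def __init__(self, in_dim: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Sigmoid maps the network output to a probability estimate.
        return torch.sigmoid(self.net(x)).squeeze(-1)

# Candidate group-wise thresholds, spaced 0.001 apart as stated in the paper.
thresholds = torch.arange(0.0, 1.0 + 1e-9, 0.001)

def classify(eta_hat: torch.Tensor, t: float) -> torch.Tensor:
    """Threshold the estimated conditional probability at a candidate t."""
    return (eta_hat >= t).long()
```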