Blind Pareto Fairness and Subgroup Robustness

Authors: Natalia L Martinez, Martin A Bertran, Afroditi Papadaki, Miguel Rodrigues, Guillermo Sapiro

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental results show that the proposed framework improves worst-case risk in multiple standard datasets, while simultaneously providing better levels of service for the remaining population.
Researcher Affiliation | Academia | Duke University, Durham, NC, USA; University College London, London, UK.
Pseudocode | Yes | Algorithm 1 Blind Pareto Fairness
    Require: Inputs: dataset {(x_i, y_i)}_{i=1}^n, partition size ρ
    Require: Hyper-parameters: rounds T, parameter η, adversary boundary coefficient ϵ > 0, γ-approximate Bayesian solver M(α) ≈ arg min_{h∈H} L(h, α)
    Initialize α_0 = α̂ = {ρ}_{i=1}^n
    Initialize classifier and loss: h_0 = M(α̂), L_0 = L(h_0, α̂)
    for round t = 1, ..., T do
        Adversary updates the partition function by projected gradient ascent:
            α_t ← α_{t-1} + η ∇_α L(h_{t-1}, α̂) = α_{t-1} + η ℓ(h_{t-1}, y) / (nρ)
            α̂ ← Q(α_t), the projection onto U_{ϵ,ρ} = {α : α_i ∈ [ϵ, 1], (1/n) Σ_i α_i = ρ}
        Solver approximately solves for the current partition: h_t ← M(α̂)
    end for
    Return: Classifier h_T
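Below is a minimal NumPy sketch of the alternating game in Algorithm 1, assuming a logistic-regression hypothesis class, a plain gradient-descent stand-in for the γ-approximate solver M(·), and a bisection-based Euclidean projection onto the constraint set U_{ϵ,ρ} as reconstructed above; none of these choices come from the authors' released code.

```python
import numpy as np

def project_alpha(alpha, rho, eps, iters=50):
    """Euclidean projection onto U = {a : a_i in [eps, 1], mean(a) = rho},
    found by bisecting on a scalar shift tau (standard box-plus-budget projection)."""
    lo, hi = eps - alpha.max(), 1.0 - alpha.min()  # shifts giving mean eps and mean 1
    for _ in range(iters):
        tau = 0.5 * (lo + hi)
        if np.clip(alpha + tau, eps, 1.0).mean() < rho:
            lo = tau
        else:
            hi = tau
    return np.clip(alpha + 0.5 * (lo + hi), eps, 1.0)

def per_sample_loss(h, X, y):
    """Per-sample cross-entropy of a logistic model h = (w, b) (illustrative hypothesis class)."""
    w, b = h
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return -(y * np.log(p + 1e-12) + (1.0 - y) * np.log(1.0 - p + 1e-12))

def solver(alpha, X, y, lr=0.1, steps=200):
    """Stand-in for the gamma-approximate solver M(alpha): gradient descent on the
    alpha-weighted cross-entropy (sum_i alpha_i * loss_i / sum_i alpha_i)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        g = alpha * (p - y) / alpha.sum()
        w -= lr * (X.T @ g)
        b -= lr * g.sum()
    return (w, b)

def blind_pareto_fairness(X, y, rho=0.3, eps=0.01, T=20, eta=1.0):
    """Alternating adversary/solver loop of Algorithm 1 (sketch, not the authors' implementation)."""
    n = X.shape[0]
    alpha = np.full(n, rho)                                  # alpha_0 = alpha_hat = {rho}^n
    h = solver(alpha, X, y)                                  # h_0 = M(alpha_hat)
    for _ in range(T):
        grad = per_sample_loss(h, X, y) / (n * rho)          # dL/dalpha_i for L = sum_i alpha_i l_i / (n rho)
        alpha = project_alpha(alpha + eta * grad, rho, eps)  # projected gradient ascent step, alpha_hat = Q(alpha_t)
        h = solver(alpha, X, y)                              # solver re-fits on the current partition
    return h
```

The projection keeps every weight in [ϵ, 1] while forcing the mean back to ρ, which is what lets the adversary concentrate its budget on the roughly ρn highest-loss samples at each round.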
Open Source Code | Yes | The code is available at github.com/natalialmg/BlindParetoFairness.
Open Datasets | Yes | Datasets. We used four standard fairness datasets for comparison. The UCI Adult dataset (Dua & Graff, 2017) which contains... The Law School dataset (Wightman, 1998) contains... The COMPAS dataset (Barenstein, 2019) which contains... Lastly we used the MIMIC-III dataset, which consists of clinical records... (Johnson et al., 2016).
Dataset Splits | Yes | Results correspond to the best hyper-parameter for each group size; mean and standard deviations are computed using 5-fold cross-validation, all figures are reported on held out (test) data.
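The paper states only that means and standard deviations come from 5-fold cross-validation evaluated on held-out data; the fold shuffling, random seed, metric, and baseline model in the sketch below are assumptions added for illustration.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def five_fold_report(X, y, n_splits=5, seed=0):
    """Mean/std of a held-out metric across folds, mirroring the reporting protocol."""
    scores = []
    for train_idx, test_idx in KFold(n_splits=n_splits, shuffle=True, random_state=seed).split(X):
        model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
        scores.append(accuracy_score(y[test_idx], model.predict(X[test_idx])))
    return float(np.mean(scores)), float(np.std(scores))
```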
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, memory, or cloud instances used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependency details, such as programming language versions or library versions (e.g., Python, PyTorch, TensorFlow versions), that would be necessary to precisely replicate the experiments.
Experiment Setup | Yes | We train BPF for 18 minimum group sizes ρ = {0.05, ..., 0.9} and ϵ = 0.01... DRO models were trained on 18 equispaced values of their threshold parameter (η ∈ [0, 1]). For ARL, we tried four configurations for their adversarial network (adversary with 1 or 2 hidden layers with 256 or 512 units each)... The classifier architecture for BPF, ARL, and DRO was standardized to a single-layer MLP with 512 hidden units. In all cases we use cross-entropy loss and same input data. Results correspond to the best hyper-parameter for each group size; mean and standard deviations are computed using 5-fold cross-validation...
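A sketch of the quoted setup: the shared single-hidden-layer, 512-unit classifier, the 18-value ρ grid with ϵ = 0.01, the 18 equispaced DRO thresholds, and per-sample cross-entropy. The deep-learning framework (PyTorch here), the input width, and the exact grid endpoints are assumptions; the paper does not name its software stack.

```python
import numpy as np
import torch.nn as nn

class Classifier(nn.Module):
    """Single-hidden-layer MLP with 512 units, as standardized across BPF, ARL, and DRO."""
    def __init__(self, in_dim, n_classes=2, hidden=512):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_classes))

    def forward(self, x):
        return self.net(x)

# Hyper-parameter grids quoted in the paper (endpoint handling assumed).
rhos = np.linspace(0.05, 0.90, 18)    # 18 minimum group sizes for BPF: 0.05, 0.10, ..., 0.90
dro_etas = np.linspace(0.0, 1.0, 18)  # 18 equispaced DRO threshold values in [0, 1]
eps = 0.01                            # adversary boundary coefficient

model = Classifier(in_dim=64)                       # in_dim depends on the dataset encoding (assumed)
criterion = nn.CrossEntropyLoss(reduction="none")   # per-sample losses feed the BPF/DRO adversary
```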