Disparate Impact on Group Accuracy of Linearization for Private Inference
Authors: Saswat Das, Marco Romanelli, Ferdinando Fioretto
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This paper builds on this body of work and observes that while these methods are indeed effective at trading run-time reduction for marginal global accuracy losses, these accuracy decreases are unevenly distributed among different subgroups. We find that the accuracy loss is more pronounced for underrepresented subgroups and that this impact intensifies as more ReLUs are linearized. This effect is depicted in Figure 1, which shows the accuracy impact across age groups on a facial recognition task. [...] 5. Empirical Results on the Fairness Analysis |
| Researcher Affiliation | Academia | University of Virginia, Charlottesville, VA, USA; New York University, New York, NY, USA. |
| Pseudocode | Yes | Algorithm 1 reports the proposed fairness-aware finetuning method, replacing the standard finetuning step of SNL and DR. (A hedged sketch of such a finetuning loop follows the table.) |
| Open Source Code | Yes | Code: https://github.com/SaswatD27/ICML_Linearization_Disparate_Impact |
| Open Datasets | Yes | Datasets and Models. We adopt three datasets: UTKFace (Zhang et al., 2017), SVHN (Digits) (Netzer et al., 2011), and CIFAR-10 (Krizhevsky & Hinton, 2009). |
| Dataset Splits | No | The paper refers to a 'training set' and an 'evaluation set' but does not specify distinct train/validation/test splits, exact percentages, or sample counts for these partitions. It mentions that 'All reported metrics are averaged over 10 random seeds', which concerns seeding and reproducibility rather than data splitting. |
| Hardware Specification | Yes | Each of the results in this paper was produced using an A100 GPU with 80 GB of GPU memory, up to 100 GB of RAM, and up to 10 Intel(R) Xeon(R) E5-2630 v3 CPUs, each clocked at 2.40GHz. |
| Software Dependencies | No | The paper mentions using 'official implementation' for SNL and DR but does not specify general software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For SNL, we train each base model for 160 epochs, and then perform the SNL finetuning step after ReLU linearization. [...] For DeepReDuce, we use the official implementation... which trains each model for 200 epochs. [...] All reported metrics are averaged over 10 random seeds. (Per-group evaluation and seed averaging are sketched after the table.) |
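
The Pseudocode row points to Algorithm 1, the fairness-aware finetuning step that replaces the standard SNL/DeepReDuce finetuning. The paper's exact procedure is not reproduced here, so the following is a minimal sketch assuming a simple group-reweighted cross-entropy objective; the weighting rule, the `fairness_aware_finetune` helper, and a loader that yields a per-sample group index are illustrative assumptions rather than the paper's Algorithm 1.

```python
# Minimal sketch of fairness-aware finetuning after ReLU linearization.
# The group-reweighting rule below is an illustrative assumption, not the
# paper's Algorithm 1.
import torch
import torch.nn.functional as F


def fairness_aware_finetune(model, loader, num_groups,
                            epochs=5, lr=1e-3, device="cpu"):
    """Finetune `model` with a per-group weighted loss; the loader is assumed
    to yield (inputs, labels, group_index) batches."""
    model.to(device).train()
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    group_weights = torch.ones(num_groups, device=device)

    for _ in range(epochs):
        loss_sum = torch.zeros(num_groups, device=device)
        count = torch.zeros(num_groups, device=device)
        for x, y, g in loader:
            x, y, g = x.to(device), y.to(device), g.to(device)
            per_sample = F.cross_entropy(model(x), y, reduction="none")
            loss = (group_weights[g] * per_sample).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
            # Track average loss per group to update the weights each epoch.
            loss_sum.scatter_add_(0, g, per_sample.detach())
            count.scatter_add_(0, g, torch.ones_like(per_sample))
        avg = loss_sum / count.clamp(min=1)
        # Up-weight worse-off groups; normalize so the weights average to 1.
        group_weights = num_groups * avg / avg.sum().clamp(min=1e-12)
    return model
```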
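
The Experiment Setup row states that all metrics are averaged over 10 random seeds, and the paper's central finding concerns per-group accuracy after linearization. Below is a minimal sketch of such an evaluation, assuming a loader that also yields a per-sample group index (e.g., an age bin for UTKFace); the `per_group_accuracy` and `averaged_over_seeds` helpers and the seed handling are hypothetical, not taken from the paper's code.

```python
# Minimal sketch: per-group test accuracy and averaging over random seeds.
# Helper names and the (inputs, labels, group_index) loader format are
# assumptions for illustration.
import numpy as np
import torch


@torch.no_grad()
def per_group_accuracy(model, loader, num_groups, device="cpu"):
    """Return an array of accuracies, one per subgroup (e.g., age bin)."""
    model.to(device).eval()
    correct = np.zeros(num_groups)
    total = np.zeros(num_groups)
    for x, y, g in loader:
        preds = model(x.to(device)).argmax(dim=1).cpu()
        for k in range(num_groups):
            mask = g == k
            correct[k] += (preds[mask] == y[mask]).sum().item()
            total[k] += mask.sum().item()
    return correct / np.maximum(total, 1)


def averaged_over_seeds(train_and_eval, seeds=range(10)):
    """Run a train/evaluate closure once per seed and average the per-group
    accuracies it returns, mirroring 'averaged over 10 random seeds'."""
    runs = []
    for s in seeds:
        torch.manual_seed(s)
        np.random.seed(s)
        runs.append(train_and_eval(seed=s))
    return np.mean(np.stack(runs), axis=0)
```

Comparing the resulting per-group accuracies between the base and linearized models would surface the disparate impact the paper reports (cf. the Figure 1 description in the Research Type row).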