Disparate Impact on Group Accuracy of Linearization for Private Inference
Authors: Saswat Das, Marco Romanelli, Ferdinando Fioretto
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This paper builds on this body of work and observes that while these methods are indeed effective at trading run-time reduction for marginal global accuracy losses, these accuracy decreases are unevenly distributed among different subgroups. We find that the accuracy loss is more pronounced for underrepresented subgroups and that this impact intensifies as more ReLUs are linearized. This effect is depicted in Figure 1, which shows the accuracy impact across age groups on a facial recognition task. [...] 5. Empirical Results on the Fairness Analysis |
| Researcher Affiliation | Academia | University of Virginia, Charlottesville, VA, USA; New York University, New York, NY, USA. |
| Pseudocode | Yes | Algorithm 1 reports the proposed fairness-aware finetuning method, replacing the standard finetuning step of SNL and DR. (A hedged sketch of such a finetuning loop follows the table.) |
| Open Source Code | Yes | Code: https://github.com/SaswatD27/ICML_Linearization_Disparate_Impact |
| Open Datasets | Yes | Datasets and Models. We adopt three datasets: UTKFace (Zhang et al., 2017), SVHN (Digits) (Netzer et al., 2011), and CIFAR-10 (Krizhevsky & Hinton, 2009). |
| Dataset Splits | No | The paper refers to a 'training set' and an 'evaluation set' but does not specify distinct train/validation/test splits, exact percentages, or sample counts for these partitions. It mentions that 'All reported metrics are averaged over 10 random seeds', which concerns seeding and reproducibility rather than data splitting. |
| Hardware Specification | Yes | Each of the results in this paper was produced using an A100 GPU with 80 GB of GPU memory, up to 100 GB of RAM, and up to 10 Intel(R) Xeon(R) E5-2630 v3 CPUs, each clocked at 2.40GHz. |
| Software Dependencies | No | The paper mentions using 'official implementation' for SNL and DR but does not specify general software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For SNL, we train each base model for 160 epochs, and then perform the SNL finetuning step after ReLU linearization. [...] For DeepReDuce, we use the official implementation... which trains each model for 200 epochs. [...] All reported metrics are averaged over 10 random seeds. (Per-group evaluation and seed averaging are sketched after the table.) |
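
The Pseudocode row points to Algorithm 1, the fairness-aware finetuning step that replaces the standard SNL/DeepReDuce finetuning. The paper's exact procedure is not reproduced here, so the following is a minimal sketch assuming a simple group-reweighted cross-entropy objective; the weighting rule, the `fairness_aware_finetune` helper, and a loader that yields a per-sample group index are illustrative assumptions rather than the paper's Algorithm 1.

```python
# Minimal sketch of fairness-aware finetuning after ReLU linearization.
# The group-reweighting rule below is an illustrative assumption, not the
# paper's Algorithm 1.
import torch
import torch.nn.functional as F


def fairness_aware_finetune(model, loader, num_groups,
                            epochs=5, lr=1e-3, device="cpu"):
    """Finetune `model` with a per-group weighted loss; the loader is assumed
    to yield (inputs, labels, group_index) batches."""
    model.to(device).train()
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    group_weights = torch.ones(num_groups, device=device)

    for _ in range(epochs):
        loss_sum = torch.zeros(num_groups, device=device)
        count = torch.zeros(num_groups, device=device)
        for x, y, g in loader:
            x, y, g = x.to(device), y.to(device), g.to(device)
            per_sample = F.cross_entropy(model(x), y, reduction="none")
            loss = (group_weights[g] * per_sample).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
            # Track average loss per group to update the weights each epoch.
            loss_sum.scatter_add_(0, g, per_sample.detach())
            count.scatter_add_(0, g, torch.ones_like(per_sample))
        avg = loss_sum / count.clamp(min=1)
        # Up-weight worse-off groups; normalize so the weights average to 1.
        group_weights = num_groups * avg / avg.sum().clamp(min=1e-12)
    return model
```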
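
The Experiment Setup row states that all metrics are averaged over 10 random seeds, and the paper's central finding concerns per-group accuracy after linearization. Below is a minimal sketch of such an evaluation, assuming a loader that also yields a per-sample group index (e.g., an age bin for UTKFace); the `per_group_accuracy` and `averaged_over_seeds` helpers and the seed handling are hypothetical, not taken from the paper's code.

```python
# Minimal sketch: per-group test accuracy and averaging over random seeds.
# Helper names and the (inputs, labels, group_index) loader format are
# assumptions for illustration.
import numpy as np
import torch


@torch.no_grad()
def per_group_accuracy(model, loader, num_groups, device="cpu"):
    """Return an array of accuracies, one per subgroup (e.g., age bin)."""
    model.to(device).eval()
    correct = np.zeros(num_groups)
    total = np.zeros(num_groups)
    for x, y, g in loader:
        preds = model(x.to(device)).argmax(dim=1).cpu()
        for k in range(num_groups):
            mask = g == k
            correct[k] += (preds[mask] == y[mask]).sum().item()
            total[k] += mask.sum().item()
    return correct / np.maximum(total, 1)


def averaged_over_seeds(train_and_eval, seeds=range(10)):
    """Run a train/evaluate closure once per seed and average the per-group
    accuracies it returns, mirroring 'averaged over 10 random seeds'."""
    runs = []
    for s in seeds:
        torch.manual_seed(s)
        np.random.seed(s)
        runs.append(train_and_eval(seed=s))
    return np.mean(np.stack(runs), axis=0)
```

Comparing the resulting per-group accuracies between the base and linearized models would surface the disparate impact the paper reports (cf. the Figure 1 description in the Research Type row).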