On The Fairness Impacts of Hardware Selection in Machine Learning
Authors: Sree Harsha Nelaturu, Nishaanth Kanna Ravichandran, Cuong Tran, Sara Hooker, Ferdinando Fioretto
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through both theoretical and empirical analysis, the paper not only identifies the underlying factors but also proposes an effective strategy for mitigating hardware-induced performance imbalances. Our study stands out for its breadth, conducting experiments that cover a range of hardware architectures, datasets, and model types, and the reported results highlight the critical influence of hardware on both performance and ethical dimensions of machine learning models. |
| Researcher Affiliation | Collaboration | ¹Cohere For AI Community, ²Saarland University, ³Dyania Health, ⁴University of Virginia, ⁵Cohere For AI. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures). |
| Open Source Code | No | The paper does not include an unambiguous statement from the authors that they are releasing their code for the work described, nor does it provide a direct link to a source-code repository containing their implementation. |
| Open Datasets | Yes | Our experiments were conducted using three key datasets: CIFAR-10 (Krizhevsky, 2009), CelebA (Liu et al., 2015), and UTKFace (Zhang et al., 2017). |
| Dataset Splits | No | For CIFAR-10, there are 60,000 images (50,000 train, 5,000 per class; 10,000 test, 1,000 per class). The paper does not explicitly provide training/validation/test dataset splits that include a validation set, nor a clear methodology for its creation. (The standard CIFAR-10 split is sketched after the table.) |
| Hardware Specification | Yes | The experiments use a variety of GPUs: Tesla T4 (NVIDIA, 2018a), Tesla V100 (NVIDIA, 2017a), Ampere A100 (NVIDIA, 2021), and Ada L4 GPU (NVIDIA, 2023). Table 1: Comparison of the feature design and system specifications of the hardware evaluated across all experiments. |
| Software Dependencies | Yes | We ensure determinism by fixing the random seed for all Python libraries, including PyTorch (Paszke et al., 2019) 2.0, ensuring consistency. (A seeding sketch follows the table.) |
| Experiment Setup | Yes | For all experiments, we used SGD with momentum 0.99 and weight decay of 5e-4, and a three-phase one-cycle LR scheduler (Leslie, 2015) with a starting learning rate of 0.1. The batch size for CIFAR-10, UTKFace, and CelebA is set to 512, 128 (32 for ResNet50), and 200, respectively. Models trained on CIFAR-10 and CelebA were trained for 15 epochs, and those trained on UTKFace for 20 epochs. (A training-setup sketch follows the table.) |
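
For the CIFAR-10 division referenced under *Dataset Splits*, the standard `torchvision` loaders reproduce the 50,000/10,000 train/test split. This is a minimal sketch of the standard split, not the authors' data pipeline; the `root` path and the bare `ToTensor` transform are placeholder assumptions.

```python
from torchvision import datasets, transforms

# Standard CIFAR-10 train/test split (50,000 / 10,000 images).
# The transform is a minimal placeholder, not the authors' augmentation.
transform = transforms.ToTensor()
train_set = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)

# 50,000 training images (5,000 per class), 10,000 test images (1,000 per class).
assert len(train_set) == 50_000 and len(test_set) == 10_000
```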
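For the seeding procedure referenced under *Software Dependencies*, a minimal deterministic PyTorch 2.0 setup might look as follows. The seed value and the exact set of seeded libraries are assumptions; the paper only states that a fixed seed was used for all Python libraries.

```python
import os
import random

import numpy as np
import torch


def set_determinism(seed: int = 42) -> None:
    """Fix random seeds across Python libraries for reproducible runs.

    The seed value (42) is an assumption; the paper only states that a
    fixed seed was used for all Python libraries, including PyTorch 2.0.
    """
    random.seed(seed)                 # Python's built-in RNG
    np.random.seed(seed)              # NumPy RNG
    torch.manual_seed(seed)           # PyTorch CPU RNG
    torch.cuda.manual_seed_all(seed)  # PyTorch RNGs on every visible GPU
    # Force deterministic kernels where PyTorch supports them.
    torch.use_deterministic_algorithms(True)
    torch.backends.cudnn.benchmark = False
    # Required by some deterministic CUDA ops (e.g. cuBLAS workspaces).
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"


set_determinism()
```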
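For the configuration referenced under *Experiment Setup*, the reported hyperparameters map onto PyTorch's built-ins roughly as below. The stand-in model, the `max_lr=0.1` reading of the paper's "starting learning rate of 0.1", and `cycle_momentum=False` (to keep momentum fixed at 0.99) are assumptions.

```python
import torch
from torch import nn
from torch.optim import SGD
from torch.optim.lr_scheduler import OneCycleLR

# Hyperparameters as reported above, shown for the CIFAR-10 setting.
EPOCHS = 15                              # 20 for UTKFace
BATCH_SIZE = 512                         # 128 for UTKFace (32 for ResNet50), 200 for CelebA
STEPS_PER_EPOCH = 50_000 // BATCH_SIZE   # CIFAR-10 training-set size

model = nn.Linear(3 * 32 * 32, 10)  # stand-in model; the paper trains CNNs

optimizer = SGD(
    model.parameters(),
    lr=0.1,            # starting learning rate from the paper
    momentum=0.99,
    weight_decay=5e-4,
)

# three_phase=True gives the warm-up / anneal / final-decay shape the paper
# calls a "three-phase one-cycle" schedule (Leslie, 2015).
scheduler = OneCycleLR(
    optimizer,
    max_lr=0.1,
    epochs=EPOCHS,
    steps_per_epoch=STEPS_PER_EPOCH,
    three_phase=True,
    cycle_momentum=False,  # keep momentum fixed at 0.99
)

for epoch in range(EPOCHS):
    for _ in range(STEPS_PER_EPOCH):
        optimizer.step()   # forward/backward pass would precede this
        scheduler.step()   # OneCycleLR advances once per batch, not per epoch
```

Note that `OneCycleLR` is stepped once per batch rather than once per epoch, which is why the scheduler needs `steps_per_epoch` at construction.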