An Agnostic Approach to Federated Learning with Class Imbalance

Authors: Zebang Shen, Juan Cervino, Hamed Hassani, Alejandro Ribeiro

ICLR 2022

Reproducibility assessment: each entry below gives the variable, the result, and the supporting LLM response.

Research Type: Experimental
"Through an extensive empirical study over various data heterogeneity and class imbalance configurations, we showcase that CLIMB considerably improves the performance in the minority class without compromising the overall accuracy of the classifier, which significantly outperforms previous arts."

Researcher Affiliation: Academia
Zebang Shen, Juan Cervino, Hamed Hassani, Alejandro Ribeiro. Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA 19104, USA. {zebang,jcervino,hassani,aribeiro}@seas.upenn.edu

Pseudocode: Yes
"Algorithm 1 CLIMB: CLass IMBalance Federated Learning"

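The report reproduces only the algorithm's title. As a rough illustration of the kind of round a CLIMB-style method runs, the toy NumPy sketch below applies a primal-dual update to the constrained problem min_w (1/N) Σ_i f_i(w) subject to f_i(w) ≤ (1/N) Σ_j f_j(w) + ε, with one multiplier per client; the step sizes, toy objectives, and all names here are our assumptions, not the paper's exact Algorithm 1.

```python
# Toy single-machine sketch of a CLIMB-style primal-dual round (NumPy).
# The step sizes, toy quadratic losses, and all names are assumptions;
# only the constrained formulation described in the lead-in is modeled.
import numpy as np

rng = np.random.default_rng(0)
N, d = 10, 5                        # clients, parameter dimension
targets = rng.normal(size=(N, d))   # toy loss: f_i(w) = 0.5 * ||w - targets[i]||^2
eps, eta_w, eta_lam = 0.1, 0.1, 0.05

w = np.zeros(d)
lam = np.zeros(N)                   # one multiplier per client constraint

for _ in range(500):
    losses = 0.5 * ((w - targets) ** 2).sum(axis=1)   # f_i(w)
    grads = w - targets                               # grad_w f_i(w)
    # Lagrangian gradient: each client's gradient is reweighted by its multiplier.
    weights = 1.0 / N + lam - lam.mean()
    w -= eta_w * (weights[:, None] * grads).sum(axis=0)                    # primal descent
    lam = np.maximum(0.0, lam + eta_lam * (losses - losses.mean() - eps))  # dual ascent

losses = 0.5 * ((w - targets) ** 2).sum(axis=1)
print("spread of client losses:", losses.max() - losses.min())
```
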
Open Source Code: No
The paper states "The code can be found here," but the provided text contains no actual link, so the source code is not concretely accessible.

Open Datasets: Yes
"Three benchmark datasets are used in our experiments with the default train/test splits, which are MNIST (LeCun et al., 1998), CIFAR10 (Krizhevsky et al., 2009) and Fashion-MNIST (Xiao et al., 2017)."

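All three benchmarks ship with canonical train/test splits. The paper does not name its software stack (see the Software Dependencies entry below), but one common way to obtain the default splits is through torchvision, sketched here as an assumption rather than the authors' actual pipeline.

```python
# Loading the three benchmarks with their default train/test splits.
# torchvision is our assumption; the paper does not name its software stack.
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()
mnist_train  = datasets.MNIST("data", train=True,  download=True, transform=to_tensor)
mnist_test   = datasets.MNIST("data", train=False, download=True, transform=to_tensor)
cifar_train  = datasets.CIFAR10("data", train=True,  download=True, transform=to_tensor)
cifar_test   = datasets.CIFAR10("data", train=False, download=True, transform=to_tensor)
fmnist_train = datasets.FashionMNIST("data", train=True,  download=True, transform=to_tensor)
fmnist_test  = datasets.FashionMNIST("data", train=False, download=True, transform=to_tensor)
```
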
Dataset Splits: No
The paper mentions "default train/test splits" but does not describe a validation split or explain how any validation data was used.

Hardware Specification: No
The paper does not report the specific hardware used for its experiments (exact GPU/CPU models, processor types and speeds, or memory amounts).

Software Dependencies: No
The paper does not list the ancillary software needed to replicate the experiments (e.g., library or solver names with version numbers, such as Python 3.8 or CPLEX 12.4).

Experiment Setup: Yes
"We generate heterogeneity in the local data distributions according to the strategy from (Karimireddy et al., 2020; Hsu et al., 2019): Let α ∈ [0, 1] be some constant that determines the level of heterogeneity. For a fixed α, we divide the dataset among N = 100 (moderate) or N = 500 (massive) clients as follows: we allocate to each client a portion of α i.i.d. data and the remaining portion of (1 − α) by sorting according to label. ... For the minority class(es), we retain only a 1/ρ portion of the corresponding data. Here, ρ ≥ 1 is the ratio between the numbers of data in the majority class and in the minority class and is termed the imbalance ratio. ... In our experiments, we consider the setting of 1 or 3 minority classes and we take ρ = 5, 10, 20. ... Specifically, we use a 2-hidden-layer fully-connected neural network for MNIST, where the numbers of neurons are (128, 128). For CIFAR10, we use a CNN model consisting of 2 convolutional layers with 64 5×5 filters followed by 2 fully connected layers with 394 and 192 neurons. ... The base FL solver is Fed-Avg with partial participation: 100 devices participate in every communication round."

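One reading of the partitioning recipe quoted above, as a NumPy sketch: an α fraction of the (minority-subsampled) data is dealt out i.i.d., the remaining (1 − α) fraction is sorted by label before being dealt out. The function and variable names are ours; the paper gives only the recipe.

```python
# Heterogeneity and class-imbalance construction (NumPy sketch; names are ours).
import numpy as np

def partition(labels, n_clients, alpha, minority_classes, rho, seed=0):
    rng = np.random.default_rng(seed)
    idx = np.arange(len(labels))
    # Class imbalance: keep only a 1/rho fraction of each minority class.
    keep = np.ones(len(labels), dtype=bool)
    for c in minority_classes:
        c_idx = idx[labels == c]
        drop = rng.choice(c_idx, size=int(len(c_idx) * (1 - 1 / rho)), replace=False)
        keep[drop] = False
    idx = rng.permutation(idx[keep])
    # Heterogeneity: an alpha fraction is dealt out i.i.d.; the remaining
    # (1 - alpha) fraction is sorted by label before being dealt out.
    n_iid = int(alpha * len(idx))
    iid_part, sorted_part = idx[:n_iid], idx[n_iid:]
    sorted_part = sorted_part[np.argsort(labels[sorted_part])]
    return [np.concatenate([a, b]) for a, b in
            zip(np.array_split(iid_part, n_clients),
                np.array_split(sorted_part, n_clients))]

labels = np.random.default_rng(1).integers(0, 10, size=60000)
shards = partition(labels, n_clients=100, alpha=0.1, minority_classes=[0], rho=10)
```
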
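The excerpt fixes layer counts and widths but not padding, pooling, or activations; the PyTorch sketch below fills those gaps with common choices and should be read as an assumption, not the paper's exact architecture.

```python
# Model sketches matching the quoted layer counts and widths (PyTorch assumed).
import torch.nn as nn

# MNIST: fully connected network with two hidden layers of 128 neurons each.
mnist_mlp = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 10),
)

# CIFAR10: two conv layers with 64 5x5 filters, then FC layers of 394 and 192.
# Pooling and activations are our choices; the excerpt does not specify them.
cifar_cnn = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),   # 32 -> 14
    nn.Conv2d(64, 64, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),  # 14 -> 5
    nn.Flatten(),
    nn.Linear(64 * 5 * 5, 394), nn.ReLU(),
    nn.Linear(394, 192), nn.ReLU(),
    nn.Linear(192, 10),
)
```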