Distributed Personalized Empirical Risk Minimization

Authors: Yuyang Deng, Mohammad Mahdi Kamani, Pouria Mahdavinia, Mehrdad Mahdavi

NeurIPS 2023


Reproducibility assessment (each variable is listed with its result and the supporting LLM response):

Research Type: Experimental
From Section 4 (Experimental Results): In this section we benchmark the effectiveness of PERM on synthetic data with 50 clients, where it notably outperformed other well-known methods, as shown in Figure 1. Our experiments concluded with the CIFAR10 dataset, employing a 2-layer convolutional neural network, where PERM, despite a warm-up phase, demonstrated the strongest convergence performance (Figure 2). Additional experiments are reported in the appendix. Across all datasets, the PERM algorithm consistently demonstrated its robustness and efficiency for personalized federated learning.

Researcher Affiliation: Collaboration
Yuyang Deng (Pennsylvania State University, yzd82@psu.edu); Mohammad Mahdi Kamani (Wyze Labs, mmkamani@alumni.psu.edu); Pouria Mahdavinia (Pennsylvania State University, pxm5426@psu.edu); Mehrdad Mahdavi (Pennsylvania State University, mzm616@psu.edu)

Pseudocode: Yes
Algorithm 1 (Shuffling Local SGD) and Algorithm 2 (Single Loop PERM).

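The full algorithms are given in the paper; as a rough illustration of the two-stage personalized ERM idea behind them (learn per-client mixture weights α over clients' empirical losses, then minimize the α-weighted empirical risk), here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the toy data, and especially the hand-set stage-1 weights are hypothetical placeholders.

```python
# Minimal sketch of the two-stage personalized ERM idea (not the
# authors' implementation; all names here are hypothetical).
import numpy as np

rng = np.random.default_rng(0)

def logistic_loss_grad(w, X, y):
    # Gradient of the mean logistic loss; labels y in {-1, +1}.
    z = np.clip(y * (X @ w), -30.0, 30.0)   # clip to avoid overflow in exp
    s = 1.0 / (1.0 + np.exp(z))             # sigmoid(-z)
    return -(X * (y * s)[:, None]).mean(axis=0)

def weighted_erm(clients, alpha, d, lr=0.5, steps=300):
    # Stage 2: minimize sum_j alpha_j * L_j(w) by gradient descent.
    w = np.zeros(d)
    for _ in range(steps):
        g = sum(a * logistic_loss_grad(w, X, y) for a, (X, y) in zip(alpha, clients))
        w -= lr * g
    return w

# Toy federation: 5 clients in two latent groups with opposite label rules.
d, n = 10, 200
clients = []
for i in range(5):
    X = rng.normal(size=(n, d))
    w_true = np.ones(d) if i < 3 else -np.ones(d)
    clients.append((X, np.sign(X @ w_true)))

# Stage 1 placeholder: hand-set mixture weights for client 0. In PERM these
# weights are *learned* from the clients' empirical losses, not hand-set.
alpha0 = np.array([1/3, 1/3, 1/3, 0.0, 0.0])
w0 = weighted_erm(clients, alpha0, d)
print("client-0 train accuracy:",
      np.mean(np.sign(clients[0][0] @ w0) == clients[0][1]))
```
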
Open Source Code: No
The paper does not include an explicit statement about releasing its source code or a link to a code repository for the described methodology.

Open Datasets: Yes
The main experiments use the CIFAR10 dataset with a 2-layer convolutional neural network (Figure 2), with additional experiments reported in the appendix. The appendix also covers experiments on the EMNIST dataset [59], which is naturally distributed in a federated setting, and runs the two-stage PERM algorithm on the MNIST dataset to show the effectiveness of the learned mixture weights α and the effect of data heterogeneity across clients on those weights.

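All three datasets named above are publicly available through torchvision; a minimal loading sketch follows. The ToTensor-only transform and the EMNIST "balanced" split are assumptions, since the paper does not specify its data pipeline.

```python
# Standard torchvision loaders for the public datasets named above.
from torchvision import datasets, transforms

tfm = transforms.ToTensor()   # preprocessing is an assumption
root = "./data"

cifar10 = datasets.CIFAR10(root, train=True, download=True, transform=tfm)
emnist = datasets.EMNIST(root, split="balanced", train=True, download=True, transform=tfm)
mnist = datasets.MNIST(root, train=True, download=True, transform=tfm)
```
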
Dataset Splits: No
The paper mentions "Personal Validation Accuracy" and "Personal Validation Loss" but does not explicitly state the dataset splits (e.g., percentages, sample counts, or a reference to specific predefined splits) used for training, validation, and testing.

Hardware Specification: No
The paper discusses the importance of clients' "computational resources" and "memory and compute resources" in general, but does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) for its own experiments.

Software Dependencies: No
The paper does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9) used for its experiments.

Experiment Setup: Yes
For the synthetic experiment, we set μ1 = 0.2, μ2 = 0.2, and μw = 0.1. The data dimension is d = 60, and there are 2 classes in the output. We have a total of 50 clients, each generating 500 samples following the aforementioned guidelines, and we train a logistic regression model on each client's data. We extend our experimentation to the CIFAR10 dataset using a 2-layer convolutional neural network. During this test, 50 clients participate, each limited to data from just 2 classes, resulting in a pronounced heterogeneous data distribution. ... It is noteworthy that PERM's initial personalized validation is significantly lower than that of approaches like Per-FedAvg and pFedMe. This discrepancy stems from our choice to implement 10 communication rounds as a warm-up phase before initiating personalization, whereas the other methods personalize right from the outset. Each method undertakes 20 local steps along with its own personalization computations.

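The "2 classes per client" split described above is a common shard-style heterogeneous partition; since the paper does not spell out its assignment scheme, the sketch below is one plausible construction, and the function name two_class_partition is hypothetical.

```python
# One plausible way to realize the "each client sees only 2 classes"
# heterogeneous split described above (scheme is an assumption).
import numpy as np

def two_class_partition(labels, num_clients=50, classes_per_client=2, seed=0):
    """Give each client shards drawn from at most `classes_per_client` classes."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    shards_per_class = num_clients * classes_per_client // len(classes)
    shards = []
    for c in classes:
        idx = rng.permutation(np.flatnonzero(labels == c))
        shards.extend(np.array_split(idx, shards_per_class))
    rng.shuffle(shards)  # each shard holds indices of a single class
    # Hand each of the num_clients clients classes_per_client shards.
    return [np.concatenate(shards[i::num_clients]) for i in range(num_clients)]

# e.g., with torchvision's CIFAR10: client_idx = two_class_partition(cifar10.targets)
```
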
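Likewise, the "2-layer convolutional neural network" is not specified in detail; a plausible PyTorch version with two conv blocks and a linear head is sketched below, where all channel widths and kernel sizes are guesses rather than the paper's values.

```python
# A plausible 2-layer CNN for 3x32x32 CIFAR10 inputs (architecture
# details are assumptions; the paper does not give them).
import torch.nn as nn

class TwoLayerCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(64 * 8 * 8, num_classes)  # 32 -> 16 -> 8 spatial

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))
```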