Breaking the centralized barrier for cross-device federated learning

Authors: Sai Praneeth Karimireddy, Martin Jaggi, Satyen Kale, Mehryar Mohri, Sashank Reddi, Sebastian U. Stich, Ananda Theertha Suresh

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We also perform a thorough experimental exploration of MIME's performance on real-world datasets (implemented here). We report the results of a thorough experimental analysis demonstrating that both MIME and MIMELITE indeed converge faster than FEDAVG.
Researcher Affiliation | Collaboration | Sai Praneeth Karimireddy (EPFL, sai.karimireddy@epfl.ch); Martin Jaggi (EPFL, martin.jaggi@epfl.ch); Satyen Kale (Google Research, satyenkale@google.com); Mehryar Mohri (Google Research, mohri@google.com); Sashank J. Reddi (Google Research, sashank@google.com); Sebastian U. Stich (EPFL, sebastian.stich@epfl.ch); Ananda Theertha Suresh (Google Research, theertha@google.com)
Pseudocode | Yes | Algorithm 1: Mime and MimeLite (a hedged sketch of the round structure appears after this table).
Open Source Code | No | The paper mentions using TensorFlow Federated [60] and cites other frameworks like FedJAX [52, 53], but it does not provide an explicit statement or link to open-source code for the specific method (MIME) presented in this paper.
Open Datasets | Yes | We run five simulations on three real-world federated datasets: EMNIST62 with i) a linear classifier, ii) an MLP, and iii) a CNN, iv) a char RNN on Shakespeare, and v) an LSTM for next word prediction on Stack Overflow, all accessed through TensorFlow Federated [60]. (A dataset-loading sketch appears after this table.)
Dataset Splits | No | The paper reports 'Validation % accuracies' in Table 2 but does not specify the train/validation/test splits, such as percentages or sample counts.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory specifications, or cloud computing instance types used for running the experiments.
Software Dependencies | No | The paper mentions 'TensorFlow Federated [60]' as a tool used, but it does not provide specific version numbers for TensorFlow Federated or any other software dependencies needed to replicate the experiments.
Experiment Setup | Yes | The learning rates were individually tuned and other optimizer hyper-parameters such as β for momentum, β1, β2, ε0 for Adam and AdaGrad were left at their default values, unless explicitly stated otherwise. We train a 2-hidden-layer (300u-100u) MLP on EMNIST62 with 10 local epochs for 1k rounds. (A model sketch appears after this table.)
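
Algorithm 1 itself is not reproduced in this report. Below is a minimal NumPy sketch of one MimeLite round with SGD momentum as the base optimizer, based on the paper's description of applying the global optimizer state unchanged during local steps; full Mime additionally adds an SVRG-style correction to each minibatch gradient, which this sketch omits. Names such as mimelite_client_update and grad_fn are illustrative, not taken from the authors' code.

```python
# A hedged sketch of MimeLite (one round) with SGD momentum as the base optimizer.
# Not the authors' implementation; shapes and conventions are assumptions.
import numpy as np

def mimelite_client_update(x, momentum, grad_fn, batches, lr=0.1, beta=0.9):
    """Run local steps that apply the server momentum without updating it.

    x        : global model parameters sent by the server (np.ndarray)
    momentum : server optimizer state, kept frozen during local steps
    grad_fn  : callable(params, batch) -> stochastic gradient (np.ndarray)
    batches  : iterable of local minibatches
    """
    y = x.copy()
    for batch in batches:
        g = grad_fn(y, batch)
        # Apply the *global* momentum state; do NOT update it locally.
        y = y - lr * ((1.0 - beta) * g + beta * momentum)
    # Also return a full-batch gradient at the server model x, which the
    # server uses to refresh the momentum state after aggregation.
    full_grad = np.mean([grad_fn(x, b) for b in batches], axis=0)
    return y, full_grad

def server_round(x, momentum, client_results, beta=0.9):
    """Aggregate client models and update the global optimizer state."""
    ys, grads = zip(*client_results)
    momentum = (1.0 - beta) * np.mean(grads, axis=0) + beta * momentum
    x = np.mean(ys, axis=0)
    return x, momentum
```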
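
All three benchmarks are served through TensorFlow Federated's simulation datasets, so loading them looks roughly like the sketch below, assuming a recent tensorflow-federated release. The preprocessing used in the paper (batching, sequence lengths, client sampling) is not shown.

```python
# A hedged sketch of loading the three federated benchmarks via TensorFlow Federated.
import tensorflow_federated as tff

# EMNIST with all 62 classes (digits plus upper- and lower-case letters).
emnist_train, emnist_test = tff.simulation.datasets.emnist.load_data(
    only_digits=False)

# Character-level Shakespeare, partitioned by speaking role.
shakespeare_train, shakespeare_test = (
    tff.simulation.datasets.shakespeare.load_data())

# Stack Overflow next-word prediction, partitioned by user.
so_train, so_heldout, so_test = (
    tff.simulation.datasets.stackoverflow.load_data())

# Each split is a ClientData object: one tf.data.Dataset per client.
client_id = emnist_train.client_ids[0]
client_ds = emnist_train.create_tf_dataset_for_client(client_id)
```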
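
For the Experiment Setup row, the 2-hidden-layer (300u-100u) MLP on EMNIST62 corresponds roughly to the Keras model below. The ReLU activations and the SGD learning rate shown here are assumptions for illustration; the paper tunes learning rates per optimizer and does not report these exact values.

```python
# A minimal Keras sketch of a 300-100 MLP for EMNIST-62; details are assumed, not from the paper.
import tensorflow as tf

def build_emnist_mlp(num_classes=62):
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(28, 28, 1)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(300, activation='relu'),
        tf.keras.layers.Dense(100, activation='relu'),
        tf.keras.layers.Dense(num_classes),  # logits; softmax folded into the loss
    ])

model = build_emnist_mlp()
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.1),  # placeholder; tuned in the paper
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])
```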