Fair Resource Allocation in Federated Learning

Authors: Tian Li, Maziar Sanjabi, Ahmad Beirami, Virginia Smith

ICLR 2020

Reproducibility Variable Result LLM Response
Research Type Experimental We validate both the effectiveness of q-FFL and the efficiency of q-FedAvg on a suite of federated datasets with both convex and non-convex models, and show that q-FFL (along with q-FedAvg) outperforms existing baselines in terms of the resulting fairness, flexibility, and efficiency.
Researcher Affiliation Collaboration Tian Li (CMU, tianli@cmu.edu); Maziar Sanjabi (Facebook AI, maziars@fb.com); Ahmad Beirami (Facebook AI, beirami@fb.com); Virginia Smith (CMU, smithv@cmu.edu)
Pseudocode Yes Algorithm 1 q-FedSGD
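The q-FedSGD aggregation step from the paper's Algorithm 1 can be sketched in pure Python. This is an illustrative reimplementation, not the authors' code: each sampled device k reports its loss F_k(w) and gradient, the server weights gradients by F_k^q, and normalizes by the estimated local Lipschitz terms h_k = q·F_k^(q-1)·||∇F_k||² + L·F_k^q. The function name and the flat-list weight representation are choices made here for clarity.

```python
def q_fedsgd_round(w, device_stats, q, L):
    """One server aggregation round of q-FedSGD (sketch).

    w            -- current global model weights (list of floats)
    device_stats -- list of (F_k, grad_k) pairs from the sampled devices,
                    where F_k is the local loss and grad_k a gradient list
    q            -- fairness parameter (q = 0 recovers plain FedSGD)
    L            -- Lipschitz-constant estimate used in the step size
    """
    dim = len(w)
    num = [0.0] * dim   # sum_k Delta_k, with Delta_k = F_k^q * grad F_k
    denom = 0.0         # sum_k h_k
    for F_k, grad_k in device_stats:
        scale = F_k ** q
        for i in range(dim):
            num[i] += scale * grad_k[i]
        grad_sq = sum(g * g for g in grad_k)
        denom += q * (F_k ** (q - 1)) * grad_sq + L * scale
    # Server update: w <- w - (sum_k Delta_k) / (sum_k h_k)
    return [w_i - n_i / denom for w_i, n_i in zip(w, num)]
```

With q = 0 the per-device scale F_k^q is 1 and the h_k term reduces to L, so the update collapses to an equally weighted gradient step, matching the paper's claim that q-FFL generalizes the standard objective.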
Open Source Code Yes All code, data, and experiments are publicly available at github.com/litian96/fair_flearn.
Open Datasets Yes In particular, we study: (1) a synthetic dataset using a linear regression classifier, (2) a Vehicle dataset collected from a distributed sensor network (Duarte & Hu, 2004) with a linear SVM for binary classification, (3) tweet data curated from Sentiment140 (Go et al., 2009) (Sent140) with an LSTM classifier for text sentiment analysis, and (4) text data built from The Complete Works of William Shakespeare (McMahan et al., 2017) and an RNN to predict the next character. When comparing with AFL, we use the two small benchmark datasets (Fashion MNIST (Xiao et al., 2017) and Adult (Blake, 1998)) studied in Mohri et al. (2019). When applying q-FFL to meta-learning, we use the common meta-learning benchmark dataset Omniglot (Lake et al., 2015).
Dataset Splits Yes We randomly split data on each local device into 80% training set, 10% testing set, and 10% validation set.
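The per-device 80/10/10 split described above can be reproduced with a short stdlib-only sketch. Only the proportions come from the paper; the function name, the fixed seed, and the list representation are illustrative assumptions.

```python
import random

def split_device_data(examples, seed=0):
    """Randomly split one device's examples into 80% train,
    10% test, and 10% validation sets (proportions per the paper;
    seed and naming are illustrative, not from the authors' code)."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    idx = list(range(len(examples)))
    rng.shuffle(idx)
    n = len(examples)
    n_train = int(0.8 * n)
    n_test = int(0.1 * n)
    train = [examples[i] for i in idx[:n_train]]
    test = [examples[i] for i in idx[n_train:n_train + n_test]]
    val = [examples[i] for i in idx[n_train + n_test:]]
    return train, test, val
```

Because the split is done independently on each local device, every device keeps its own train/test/validation partitions, preserving the non-IID structure of the federated data.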
Hardware Specification Yes We simulate the federated setting (one server and m devices) on a server with 2 Intel Xeon E5-2650 v4 CPUs and 8 NVIDIA 1080Ti GPUs.
Software Dependencies Yes We implement all code in TensorFlow (Abadi et al., 2016) Version 1.10.1.
Experiment Setup Yes We tune a best q from {0.001, 0.01, 0.1, 0.5, 1, 2, 5, 10, 15} on the validation set and report accuracy distributions on the testing set. ... For all datasets, we randomly sample 10 devices each round. We tune the learning rate and batch size on FedAvg and use the same learning rate and batch size for all q-FedAvg experiments of that dataset. The learning rates for Synthetic, Vehicle, Sent140, and Shakespeare are 0.1, 0.01, 0.03, and 0.8, respectively. The batch sizes for Synthetic, Vehicle, Sent140, and Shakespeare are 10, 64, 32, and 10. The number of local epochs E is fixed to be 1 for both FedAvg and q-FedAvg regardless of the values of q.
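The q-selection procedure above is a plain grid search over validation accuracy. A minimal sketch, assuming a caller-supplied `train_eval_fn(q)` that trains q-FedAvg with the given q and returns validation accuracy (this callback is a stand-in, not part of the paper's released code):

```python
def select_q(train_eval_fn,
             q_grid=(0.001, 0.01, 0.1, 0.5, 1, 2, 5, 10, 15)):
    """Grid-search the fairness parameter q on validation accuracy.

    train_eval_fn -- hypothetical callback: trains q-FedAvg with the
                     given q and returns validation accuracy
    q_grid        -- candidate values, taken from the paper's setup
    """
    best_q, best_acc = None, float("-inf")
    for q in q_grid:
        acc = train_eval_fn(q)
        if acc > best_acc:       # keep the q with the best validation score
            best_q, best_acc = q, acc
    return best_q, best_acc
```

Since the learning rate and batch size are tuned once on FedAvg and reused for all q values, only this single parameter is swept per dataset.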