QuPeD: Quantized Personalization via Distillation with Applications to Federated Learning

Authors: Kaan Ozkara, Navjot Singh, Deepesh Data, Suhas Diggavi

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerically, we validate that QuPeD outperforms competing personalized FL methods, FedAvg, and local training of clients in various heterogeneous settings. We perform image and text classification experiments on multiple datasets in various resource and data heterogeneity settings, and compare the performance of QuPeD against Per-FedAvg [8], pFedMe [7], FedAvg [26], and local training of clients.
Researcher Affiliation | Academia | Kaan Ozkara (University of California, Los Angeles, kaan@ucla.edu); Navjot Singh (University of California, Los Angeles, navjotsingh@ucla.edu); Deepesh Data (University of California, Los Angeles, deepesh.data@gmail.com); Suhas Diggavi (University of California, Los Angeles, suhas@ee.ucla.edu)
Pseudocode | Yes | Algorithm 1: Centralized Model Quantization; Algorithm 2: QuPeD (Quantized Personalization via Distillation)
Open Source Code | No | The paper does not provide any statement or link indicating the release of open-source code for the described methodology.
Open Datasets | Yes | We consider an image classification task on the FEMNIST [4] and CIFAR-10 [18] datasets.
Dataset Splits | Yes | To simulate data heterogeneity on CIFAR-10, similar to [26], we allow each client to have access to data samples from only 4 randomly chosen classes. Thus, each client has 1000 training samples and 200 test samples. On FEMNIST, we use a subset of 198 writers from the dataset and distribute the data so that each client has access to data samples written by 3 randomly chosen writers. The number of training samples per client varies between 203 and 336, and the number of test samples per client varies between 25 and 40. (A partitioning sketch is given after the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9).
Experiment Setup | Yes | For all methods, where applicable, we set τ = 10 local iterations, use a learning rate decay of 0.99, and use a weight decay of 10^-4; we fine-tune the initial learning rate for each method independently (see Appendix E for details). For CIFAR-10 we choose a batch size of 25. For FEMNIST, we choose variable batch sizes so that every client performs 60 iterations per epoch. We train each algorithm for 250 epochs on CIFAR-10 and 30 epochs on FEMNIST. We use the last 50 epochs on CIFAR-10, and the last 5 epochs on FEMNIST, for the fine-tuning phase. (A configuration sketch is given after the table.)
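
The CIFAR-10 split quoted in the Dataset Splits row is described only in prose. Below is a minimal sketch of how such a partition could be generated; the function name, the use of NumPy/torchvision, and the equal per-class allocation (250 samples from each of the 4 chosen classes) are assumptions rather than the authors' code, which is not released (see the Open Source Code row).

```python
import numpy as np

def split_cifar10_non_iid(labels, num_clients, classes_per_client=4,
                          train_per_client=1000, num_classes=10, seed=0):
    """Sketch of the heterogeneous CIFAR-10 split described in the paper:
    each client sees samples from only `classes_per_client` randomly chosen
    classes, for `train_per_client` training samples in total.
    Sampling details are assumptions, not the authors' released code."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    # Shuffle the indices of each class into a pool we draw from.
    pools = {c: rng.permutation(np.flatnonzero(labels == c)).tolist()
             for c in range(num_classes)}
    per_class = train_per_client // classes_per_client  # e.g. 250 samples per chosen class
    client_indices = []
    for _ in range(num_clients):
        chosen = rng.choice(num_classes, size=classes_per_client, replace=False)
        idx = []
        for c in chosen:
            idx.extend(pools[c][:per_class])   # may return fewer if a pool is depleted
            pools[c] = pools[c][per_class:]
        client_indices.append(idx)
    return client_indices

# Example usage (assumed): labels taken from torchvision's CIFAR-10 training set.
# import torchvision
# train_set = torchvision.datasets.CIFAR10(root="./data", train=True, download=True)
# client_idx = split_cifar10_non_iid(train_set.targets, num_clients=50)
```

The same test-set partition (200 samples per client from the same 4 classes) could be produced by calling the function on the test labels with train_per_client=200 and the same per-client class choices.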
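
For reference, the CIFAR-10 hyperparameters quoted in the Experiment Setup row can be collected into a small configuration object. This is a sketch under those stated values; the field names, the PyTorch wiring shown in the comments, and the default initial learning rate are assumptions (the paper tunes the initial rate per method, Appendix E).

```python
from dataclasses import dataclass

@dataclass
class CIFAR10ExperimentConfig:
    """Hyperparameters quoted in the Experiment Setup row (CIFAR-10 values);
    field names and the default initial learning rate are illustrative only."""
    local_iterations: int = 10    # tau: local iterations per communication round
    lr_decay: float = 0.99        # multiplicative learning-rate decay
    weight_decay: float = 1e-4
    batch_size: int = 25          # FEMNIST instead uses variable batch sizes (60 iterations/epoch)
    epochs: int = 250             # FEMNIST: 30 epochs
    finetune_epochs: int = 50     # last epochs used for fine-tuning (FEMNIST: 5)
    init_lr: float = 0.1          # ASSUMPTION: tuned per method (Appendix E), value not given here

# Assumed PyTorch wiring for the optimizer and decay schedule (not from the paper):
# cfg = CIFAR10ExperimentConfig()
# optimizer = torch.optim.SGD(model.parameters(), lr=cfg.init_lr,
#                             weight_decay=cfg.weight_decay)
# scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=cfg.lr_decay)
```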