FedAvg with Fine Tuning: Local Updates Lead to Representation Learning
Authors: Liam Collins, Hamed Hassani, Aryan Mokhtari, Sanjay Shakkottai
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct experiments to (I) verify our theoretical results in the linear setting and (II) determine whether our established insights generalize to deep neural networks. We use the image classification datasets CIFAR-10 and CIFAR-100 [57], which consist of 10 and 100 classes of RGB images, respectively. |
| Researcher Affiliation | Academia | Liam Collins, ECE Department, The University of Texas at Austin, liamc@utexas.edu; Hamed Hassani, ESE Department, University of Pennsylvania, hassani@seas.upenn.edu; Aryan Mokhtari, ECE Department, The University of Texas at Austin, mokhtari@austin.utexas.edu; Sanjay Shakkottai, ECE Department, The University of Texas at Austin, sanjay.shakkottai@utexas.edu |
| Pseudocode | No | The paper describes the FedAvg algorithm with equations (2) and (3) but does not present it or any other procedure in a structured pseudocode block or algorithm environment. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] Please see Appendix C and the supplementary material. |
| Open Datasets | Yes | We use the image classification datasets CIFAR-10 and CIFAR-100 [57], which consist of 10 and 100 classes of RGB images, respectively. |
| Dataset Splits | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] Please see Appendix C. Image classes are heterogeneously allocated to M = 100 clients according to the Dirichlet distribution with parameter 0.6 as in [59]. |
| Hardware Specification | Yes | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] Please see Appendix C. |
| Software Dependencies | No | The provided text does not list specific software dependencies with version numbers (e.g., 'PyTorch 1.9', 'Python 3.8'). |
| Experiment Setup | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] Please see Appendix C. Then we run FedAvg with τ = 2 local updates and D-GD, both sampling m = M clients per round. We fine-tune using GD for τ = 200 iterations with batch size b = n. |
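
The Dataset Splits row mentions that image classes are allocated to M = 100 clients according to a Dirichlet distribution with parameter 0.6. The following is a minimal sketch of one common way such a label-based Dirichlet split is done, not the authors' code: the function name `dirichlet_partition`, the per-class sampling scheme, and the seed handling are assumptions made for illustration, and the exact procedure of [59] may differ.

```python
import numpy as np

def dirichlet_partition(labels, num_clients=100, alpha=0.6, seed=0):
    """Hypothetical helper: split sample indices across clients, class by class.

    For each class, a Dirichlet(alpha) vector decides what fraction of that
    class's samples each client receives. Smaller alpha means more skew.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        # Shuffle the indices of all samples belonging to class c.
        idx = rng.permutation(np.flatnonzero(labels == c))
        # Fraction of class c assigned to each client (assumed scheme).
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Turn cumulative fractions into split points within idx.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return client_indices

# Toy usage with synthetic labels (10 classes, 50 samples each).
toy_labels = np.repeat(np.arange(10), 50)
toy_parts = dirichlet_partition(toy_labels, num_clients=100, alpha=0.6)
```

Every sample index ends up with exactly one client, but with a small `alpha` some clients may hold only a few classes or no samples at all, which is the heterogeneity the split is meant to produce.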