Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
FedAvg with Fine Tuning: Local Updates Lead to Representation Learning
Authors: Liam Collins, Hamed Hassani, Aryan Mokhtari, Sanjay Shakkottai
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct experiments to (I) verify our theoretical results in the linear setting and (II) determine whether our established insights generalize to deep neural networks. We use the image classification datasets CIFAR-10 and CIFAR-100 [57], which consist of 10 and 100 classes of RGB images, respectively. |
| Researcher Affiliation | Academia | Liam Collins, ECE Department, The University of Texas at Austin; Hamed Hassani, ESE Department, University of Pennsylvania; Aryan Mokhtari, ECE Department, The University of Texas at Austin; Sanjay Shakkottai, ECE Department, The University of Texas at Austin |
| Pseudocode | No | The paper describes the FedAvg algorithm with equations (2) and (3) but does not present it or any other procedure in a structured pseudocode block or algorithm environment. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] Please see Appendix C and the supplementary material. |
| Open Datasets | Yes | We use the image classification datasets CIFAR-10 and CIFAR-100 [57], which consist of 10 and 100 classes of RGB images, respectively. |
| Dataset Splits | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] Please see Appendix C. Image classes are heterogeneously allocated to M = 100 clients according to the Dirichlet distribution with parameter 0.6 as in [59]. |
| Hardware Specification | Yes | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] Please see Appendix C. |
| Software Dependencies | No | The provided text does not list specific software dependencies with version numbers (e.g., 'PyTorch 1.9', 'Python 3.8'). |
| Experiment Setup | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] Please see Appendix C. Then we run FedAvg with τ = 2 local updates and D-GD, both sampling m = M clients per round. We fine-tune using GD for τ = 200 iterations with batch size b = n. |
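
The Dataset Splits entry above mentions allocating image classes to M = 100 clients according to a Dirichlet distribution with parameter 0.6. The following is a minimal sketch of one common way such a split is built, not the authors' code: for each class, client proportions are drawn from a symmetric Dirichlet and that class's samples are divided accordingly. The function name `dirichlet_partition`, its arguments, and the per-class sampling scheme are assumptions made for illustration; the exact procedure in the paper and in [59] may differ.

```python
import numpy as np


def dirichlet_partition(labels, num_clients=100, alpha=0.6, seed=0):
    """Split sample indices across clients with Dirichlet label skew.

    Hypothetical helper: for every class, draw a proportion vector over
    clients from Dirichlet(alpha, ..., alpha) and hand out that class's
    samples according to those proportions.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]

    for cls in np.unique(labels):
        # Indices of all samples belonging to this class, in random order.
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)

        # Share of this class assigned to each client (sums to 1).
        proportions = rng.dirichlet(alpha * np.ones(num_clients))

        # Convert cumulative shares into cut points within idx.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)

        # np.split yields exactly num_clients chunks, some possibly empty.
        for client, chunk in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(chunk.tolist())

    return client_indices


if __name__ == "__main__":
    # Toy labels standing in for a 10-class dataset (assumption: 50 per class).
    toy_labels = np.repeat(np.arange(10), 50)
    parts = dirichlet_partition(toy_labels, num_clients=100, alpha=0.6)
    sizes = [len(p) for p in parts]
    print("clients:", len(parts), "samples assigned:", sum(sizes))
```

Smaller values of `alpha` concentrate each class on fewer clients, so the split becomes more heterogeneous; larger values approach a uniform allocation. Note that this scheme does not balance the number of samples per client.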