Federated Hyperparameter Tuning: Challenges, Baselines, and Connections to Weight-Sharing
Authors: Mikhail Khodak, Renbo Tu, Tian Li, Liam Li, Maria-Florina F. Balcan, Virginia Smith, Ameet Talwalkar
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we show that FedEx can outperform natural baselines for federated hyperparameter tuning by several percentage points on the Shakespeare, FEMNIST, and CIFAR-10 benchmarks, obtaining higher accuracy using the same training budget. |
| Researcher Affiliation | Collaboration | Mikhail Khodak, Renbo Tu, Tian Li (Carnegie Mellon University, {khodak,renbo,tianli}@cmu.edu); Liam Li (Hewlett Packard Enterprise, me@liamcli.com); Maria-Florina Balcan, Virginia Smith (Carnegie Mellon University, ninamf@cs.cmu.edu, smithv@cmu.edu); Ameet Talwalkar (Carnegie Mellon University & Hewlett Packard Enterprise, talwalkar@cmu.edu) |
| Pseudocode | Yes | Algorithm 1: Successive halving algorithm (SHA) applied to personalized FL. Algorithm 2: FedEx. (Hedged sketches of both are given after this table.) |
| Open Source Code | Yes | 3. (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See link in the Appendix. |
| Open Datasets | Yes | evaluating on three standard FL benchmarks: Shakespeare, FEMNIST, and CIFAR-10 [5, 36]. |
| Dataset Splits | Yes | For Shakespeare and FEMNIST we use 80% of the data for training and 10% each for validation and testing. In CIFAR-10 we hold out 10K examples from the usual training/testing split for validation. |
| Hardware Specification | No | 3. (d) Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No] The main unit of cost in our setting is communication round, which we do report in e.g. Figure 4. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | No | The paper lists types of hyperparameters tuned (e.g., learning rate, batch-size, dropout) but defers the exact values or hyperparameter space to supplementary material: 'Please see the supplementary material for the exact hyperparameter space considered.' |
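For context on the Pseudocode row, the following is a minimal, hypothetical sketch of successive halving (SHA) over hyperparameter configurations. The helper functions `sample_config` and `train_and_validate`, the elimination rate `eta`, and the budget unit are illustrative assumptions, not the paper's exact Algorithm 1.

```python
# Hypothetical sketch of successive halving (SHA) over hyperparameter
# configurations; helpers and the search space below are assumptions.
import random


def successive_halving(num_configs=27, eta=3, rounds_per_rung=1):
    """Keep the best 1/eta fraction of configurations at each rung until one remains."""
    configs = [sample_config() for _ in range(num_configs)]
    budget = rounds_per_rung
    while len(configs) > 1:
        # Evaluate every surviving configuration with the current budget
        # (in federated tuning, the budget would be communication rounds).
        scores = [train_and_validate(c, budget) for c in configs]
        # Rank by validation loss (lower is better) and keep the top 1/eta fraction.
        ranked = sorted(zip(scores, configs), key=lambda pair: pair[0])
        keep = max(1, len(configs) // eta)
        configs = [c for _, c in ranked[:keep]]
        budget *= eta  # give the survivors a larger budget at the next rung
    return configs[0]


def sample_config():
    # Placeholder search space; the paper defers its exact space to the appendix.
    return {"lr": 10 ** random.uniform(-4, 0), "dropout": random.uniform(0.0, 0.5)}


def train_and_validate(config, budget):
    # Placeholder: would train for `budget` rounds and return a validation loss.
    return random.random()
```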
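Likewise, a rough sketch of the exponentiated-gradient ("weight-sharing") update that a FedEx-style server could apply to a distribution over client hyperparameter configurations. The variable names, the mean-loss baseline, the step size, and the `client.local_train_and_validate` interface are assumptions for illustration and do not reproduce the paper's Algorithm 2 exactly.

```python
# Hypothetical sketch of one round of a FedEx-style exponentiated-gradient
# update over a categorical distribution `theta` of hyperparameter configs.
import numpy as np


def fedex_round(theta, configs, clients, step_size=1.0):
    """Clients sample configurations from theta, report validation losses,
    and the server updates theta multiplicatively."""
    samples = []
    for client in clients:  # assumed: objects exposing local_train_and_validate
        j = np.random.choice(len(configs), p=theta)       # sample a config index
        loss = client.local_train_and_validate(configs[j])
        samples.append((j, loss))
    baseline = np.mean([loss for _, loss in samples])      # variance reduction
    grad = np.zeros_like(theta)
    for j, loss in samples:
        # REINFORCE-style stochastic gradient of the expected validation loss.
        grad[j] += (loss - baseline) / (theta[j] * len(samples))
    theta = theta * np.exp(-step_size * grad)              # exponentiated-gradient step
    return theta / theta.sum()                             # re-normalize to a distribution
```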