Zeroth-Order Methods for Nondifferentiable, Nonconvex, and Hierarchical Federated Optimization
Authors: Yuyang Qiu, Uday Shanbhag, Farzad Yousefian
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate the theory on nonsmooth and hierarchical ML problems. We present three sets of experiments to validate the performance of the proposed algorithms. In Section 5.1, we implement Algorithm 1 on ReLU neural networks (NNs) and compare it with some recent FL methods. In Sections 5.2 and 5.3 we implement Algorithm 2 on federated hyperparameter learning and a minimax formulation in FL. Throughout, we use the MNIST dataset. Additional experiments on a higher dimensional dataset (i.e., CIFAR-10) are presented in supplementary material. |
| Researcher Affiliation | Academia | Yuyang Qiu, Dept. of Industrial and Systems Engg., Rutgers University, yuyang.qiu@rutgers.edu; Uday V. Shanbhag, Dept. of Industrial and Manufacturing Engg., Pennsylvania State University, udaybag@psu.edu; Farzad Yousefian, Dept. of Industrial and Systems Engg., Rutgers University, farzad.yousefian@rutgers.edu |
| Pseudocode | Yes | Algorithm 1: Randomized Zeroth-Order Locally-Projected Federated Averaging (FedRZOnn); Algorithm 2: Randomized Implicit Zeroth-Order Federated Averaging (FedRZObl); Algorithm 3: FedAvg(x, r, y_{0,r}, m, γ, H, T_R) for the lower level (a minimal sketch of the zeroth-order update appears after this table) |
| Open Source Code | No | The paper does not contain any statement about making its source code available or provide a link to a code repository. |
| Open Datasets | Yes | We present three sets of experiments to validate the performance of the proposed algorithms. In Section 5.1, we implement Algorithm 1 on ReLU neural networks (NNs) and compare it with some recent FL methods. In Sections 5.2 and 5.3 we implement Algorithm 2 on federated hyperparameter learning and a minimax formulation in FL. Throughout, we use the MNIST dataset. Additional experiments on a higher dimensional dataset (i.e., CIFAR-10) are presented in supplementary material. |
| Dataset Splits | No | The paper mentions distributing the training dataset among clients and using MNIST and CIFAR-10, but it does not specify explicit training, validation, and test splits (e.g., percentages, sample counts, or references to standard splits for these datasets) that would allow for reproduction of data partitioning. |
| Hardware Specification | No | The paper describes the experimental setup and training details, but it does not provide any specific information regarding the hardware used (e.g., CPU, GPU models, memory, or cloud instance types). |
| Software Dependencies | No | The paper mentions using ReLU neural networks and comparing with other FL methods like FedAvg, but it does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific library versions). |
| Experiment Setup | Yes | Setup. We distribute the training dataset among m := 5 clients and implement FedRZOnn for the FL training with N1 := 4 neurons under three different settings for the smoothing parameter, η ∈ {0.1, 0.01, 0.001}, γ := 10^{-5}, and λ := 0.01. We study the performance of the method under different numbers of local steps with H ∈ {1, 5, 10, 20}. (A sketch of this configuration appears after this table.) |
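
The pseudocode row above names FedRZOnn, whose core ingredient is a two-point randomized zeroth-order estimate of the gradient of an η-smoothed local loss, followed by FedAvg-style local steps and server averaging. The following is a minimal sketch of that structure, assuming a per-client callable `loss_i(x)` returning a scalar loss and a `project` operator for the local projection; the function names and update details are illustrative, not the authors' implementation.

```python
# Minimal sketch of a two-point randomized zeroth-order gradient estimate with
# FedAvg-style local steps, in the spirit of FedRZOnn (Algorithm 1).
# All names (loss_i, project, fed_rzo_round) are assumptions, not the paper's code.
import numpy as np

def zo_gradient(loss_i, x, eta, rng):
    """Estimate the gradient of the eta-smoothed loss at x from two function values."""
    v = rng.standard_normal(x.shape)
    v /= np.linalg.norm(v)                      # random direction on the unit sphere
    d = x.size
    return (d / eta) * (loss_i(x + eta * v) - loss_i(x)) * v

def local_update(loss_i, x_server, eta, gamma, H, project, rng):
    """Client runs H local zeroth-order steps starting from the server iterate."""
    x = x_server.copy()
    for _ in range(H):
        g = zo_gradient(loss_i, x, eta, rng)
        x = project(x - gamma * g)              # locally projected step
    return x

def fed_rzo_round(losses, x_server, eta, gamma, H, project, rng):
    """One communication round: every client updates locally, the server averages."""
    local_iterates = [local_update(f, x_server, eta, gamma, H, project, rng)
                      for f in losses]
    return np.mean(local_iterates, axis=0)
```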
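
The setup row fixes m := 5 clients, N1 := 4 neurons, η ∈ {0.1, 0.01, 0.001}, γ := 10^{-5}, λ := 0.01, and H ∈ {1, 5, 10, 20}. Since the paper does not state how the MNIST training set is divided among clients, the sketch below assumes a simple IID shard split; the constant names and the splitting helper are hypothetical.

```python
# Minimal sketch of the data partitioning and hyperparameter grid from the
# "Experiment Setup" row; the IID split is an assumption, since the paper does not
# specify how the MNIST training set is divided among the m := 5 clients.
import numpy as np

M_CLIENTS   = 5                       # m := 5 clients
N1_NEURONS  = 4                       # hidden-layer width of the ReLU network
ETAS        = [0.1, 0.01, 0.001]      # smoothing parameters tried
GAMMA       = 1e-5                    # local step size (assumed reading of 10^{-5})
LAMBDA_REG  = 0.01                    # regularization parameter
LOCAL_STEPS = [1, 5, 10, 20]          # numbers of local steps H tried

def split_iid(features, labels, m=M_CLIENTS, seed=0):
    """Shuffle the training set and hand each of the m clients an equal shard."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(labels))
    shards = np.array_split(idx, m)
    return [(features[s], labels[s]) for s in shards]
```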