Zeroth-Order Methods for Nondifferentiable, Nonconvex, and Hierarchical Federated Optimization

Authors: Yuyang Qiu, Uday Shanbhag, Farzad Yousefian

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically validate the theory on nonsmooth and hierarchical ML problems. We present three sets of experiments to validate the performance of the proposed algorithms. In Section 5.1, we implement Algorithm 1 on ReLU neural networks (NNs) and compare it with some recent FL methods. In Sections 5.2 and 5.3 we implement Algorithm 2 on federated hyperparameter learning and a minimax formulation in FL. Throughout, we use the MNIST dataset. Additional experiments on a higher dimensional dataset (i.e., CIFAR-10) are presented in supplementary material.
Researcher Affiliation | Academia | Yuyang Qiu, Dept. of Industrial and Systems Engg., Rutgers University, yuyang.qiu@rutgers.edu; Uday V. Shanbhag, Dept. of Industrial and Manufacturing Engg., Pennsylvania State University, udaybag@psu.edu; Farzad Yousefian, Dept. of Industrial and Systems Engg., Rutgers University, farzad.yousefian@rutgers.edu
Pseudocode | Yes | Algorithm 1: Randomized Zeroth-Order Locally-Projected Federated Averaging (FedRZO_nn); Algorithm 2: Randomized Implicit Zeroth-Order Federated Averaging (FedRZO_bl); Algorithm 3: FedAvg(x, r, y_{0,r}, m, γ, H, TR) for the lower level. (A hedged sketch of the zeroth-order update these algorithms rely on is given after the table.)
Open Source Code | No | The paper does not contain any statement about making its source code available or provide a link to a code repository.
Open Datasets | Yes | We present three sets of experiments to validate the performance of the proposed algorithms. In Section 5.1, we implement Algorithm 1 on ReLU neural networks (NNs) and compare it with some recent FL methods. In Sections 5.2 and 5.3 we implement Algorithm 2 on federated hyperparameter learning and a minimax formulation in FL. Throughout, we use the MNIST dataset. Additional experiments on a higher dimensional dataset (i.e., CIFAR-10) are presented in supplementary material.
Dataset Splits | No | The paper mentions distributing the training dataset among clients and using MNIST and CIFAR-10, but it does not specify explicit training, validation, and test splits (e.g., percentages, sample counts, or references to standard splits for these datasets) that would allow for reproduction of data partitioning.
Hardware Specification | No | The paper describes the experimental setup and training details, but it does not provide any specific information regarding the hardware used (e.g., CPU, GPU models, memory, or cloud instance types).
Software Dependencies | No | The paper mentions using ReLU neural networks and comparing with other FL methods like FedAvg, but it does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific library versions).
Experiment Setup | Yes | Setup. We distribute the training dataset among m := 5 clients and implement FedRZO_nn for the FL training with N1 := 4 neurons under three different settings for the smoothing parameter, η ∈ {0.1, 0.01, 0.001}, γ := 10^{-5}, and λ := 0.01. We study the performance of the method under different numbers of local steps with H ∈ {1, 5, 10, 20}.
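The pseudocode row above names FedRZO_nn and FedRZO_bl, whose local updates are driven by a randomized-smoothing zeroth-order gradient estimate rather than backpropagated gradients. The following is a minimal sketch of that idea, assuming a two-point sphere-sampling estimator and a plain FedAvg-style aggregation; function names such as zo_gradient_estimate and fedrzo_round are illustrative, and the exact sampling scheme, constants, and projection steps should be taken from Algorithms 1-3 of the paper, not from this snippet.

```python
import numpy as np

def zo_gradient_estimate(loss_fn, x, eta, rng):
    """Two-point randomized-smoothing gradient estimate of a possibly
    nondifferentiable loss (illustrative sketch, not the paper's exact estimator)."""
    d = x.size
    v = rng.standard_normal(d)
    v /= np.linalg.norm(v)  # direction drawn uniformly on the unit sphere
    finite_diff = loss_fn(x + eta * v) - loss_fn(x - eta * v)
    return (d / (2.0 * eta)) * finite_diff * v

def local_client_step(loss_fn, x, gamma, eta, rng):
    """One zeroth-order local step: move along the estimated negative gradient."""
    g = zo_gradient_estimate(loss_fn, x, eta, rng)
    return x - gamma * g

def fedrzo_round(client_losses, x_server, gamma, eta, H, rng):
    """Hedged sketch of one communication round: each client runs H
    zeroth-order local steps from the server iterate, then the server
    averages the resulting client iterates (FedAvg-style aggregation)."""
    client_iterates = []
    for loss_fn in client_losses:
        x = x_server.copy()
        for _ in range(H):
            x = local_client_step(loss_fn, x, gamma, eta, rng)
        client_iterates.append(x)
    return np.mean(client_iterates, axis=0)
```

Only loss evaluations are needed here, which is why such schemes apply to the nondifferentiable and implicit (bilevel/minimax) objectives the paper targets.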
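Since the quoted setup fixes most scalar hyperparameters but leaves the data-partitioning rule and splits unstated (see the Dataset Splits row), a small configuration sketch can make the reproducibility gap concrete. The even IID split, the use of the 60,000-image MNIST training set, and helper names like split_indices below are assumptions for illustration, not details from the paper.

```python
import itertools
import numpy as np

# Hedged reconstruction of the reported setup (m := 5 clients, MNIST).
NUM_CLIENTS = 5               # m := 5
ETAS = [0.1, 0.01, 0.001]     # smoothing-parameter settings
GAMMA = 1e-5                  # step size γ
LAM = 0.01                    # regularization parameter λ
LOCAL_STEPS = [1, 5, 10, 20]  # numbers of local steps H

def split_indices(num_samples, num_clients, seed=0):
    """Evenly partition sample indices among clients (IID split assumed;
    the paper does not state the partitioning rule)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_samples)
    return np.array_split(perm, num_clients)

client_indices = split_indices(num_samples=60_000, num_clients=NUM_CLIENTS)

for eta, H in itertools.product(ETAS, LOCAL_STEPS):
    # Each (eta, H) pair would correspond to one FedRZO_nn training run in
    # the experiments summarized above; the training loop itself is omitted.
    print(f"run: eta={eta}, H={H}, gamma={GAMMA}, lambda={LAM}")
```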