Federated Conditional Stochastic Optimization

Authors: Xidong Wu, Jianhui Sun, Zhengmian Hu, Junyi Li, Aidong Zhang, Heng Huang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experimental results on various tasks validate the efficiency of these algorithms." ... (Section 5, Experiments) "The experiments are run on CPU machines with AMD EPYC 7513 32-Core Processors as well as NVIDIA RTX A6000."
Researcher Affiliation | Academia | Xidong Wu (Department of ECE, University of Pittsburgh, Pittsburgh, PA 15213, xidong_wu@outlook.com); Jianhui Sun (Computer Science, University of Virginia, Charlottesville, VA 22903, js9gu@virginia.edu); Zhengmian Hu (Computer Science, University of Maryland, College Park, MD 20742, huzhengmian@gmail.com); Junyi Li (Department of ECE, University of Pittsburgh, Pittsburgh, PA 15213, junyili.ai@gmail.com); Aidong Zhang (Computer Science, University of Virginia, Charlottesville, VA 22903, aidong@virginia.edu); Heng Huang (Computer Science, University of Maryland, College Park, MD 20742, henghuanghh@gmail.com)
Pseudocode | Yes | "Algorithm 1 FCSG and FCSG-M Algorithm" ... "Algorithm 2 Acc-FCSG-M Algorithm" (an illustrative sketch of this style of update follows the table)
Open Source Code | Yes | "The code is available, and the Federated Online AUPRC maximization task follows [38]." https://github.com/xidongwu/Federated-Minimax-and-Conditional-Stochastic-Optimization/tree/main https://github.com/xidongwu/D-AUPRC
Open Datasets | Yes | "We apply our methods to few-shot image classification on the Omniglot [24, 10]." ... "We choose the MNIST and CIFAR-10 datasets."
Dataset Splits | Yes | "We divide the characters into train/validation/test splits of 1028/172/423 via Torchmeta [7]; tasks are evenly partitioned into disjoint sets and distributed randomly among 16 clients." (a partition sketch follows the table)
Hardware Specification | Yes | "The experiments are run on CPU machines with AMD EPYC 7513 32-Core Processors as well as NVIDIA RTX A6000."
Software Dependencies | No | The paper mentions 'Torchmeta [7]' and 'PyTorch' (within the reference for Torchmeta) but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | "We carefully tune hyperparameters for both methods. λ = 0.001 and α = 10. We run a grid search for the learning rate and choose it from the set {0.01, 0.005, 0.001}. β in FCSG-M is chosen from the set {0.001, 0.01, 0.1, 0.5, 0.9}. The local update step is set to 50." ... "For all methods, the model is trained using a single gradient step with a learning rate of 0.4 and evaluated using 3 gradient steps [10]. We then use grid search and carefully tune the other hyperparameters for each method. We choose the learning rate from the set {0.1, 0.05, 0.01} and set η to 1 [11]. We select the inner-state momentum coefficient for Local-SCGD and Local-SCGDM from {0.1, 0.5, 0.9} and the outer momentum coefficient for Local-SCGDM, FCSG-M, and Acc-FCSG-M from {0.1, 0.5, 0.9}." (a grid-search sketch follows the table)
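
For orientation on the Pseudocode row, the snippet below is a minimal single-machine sketch of a federated conditional stochastic gradient loop in the style of FCSG: each client runs local biased stochastic gradient steps on an objective of the form F(x) = E_ξ[ f(E_{η|ξ}[ g(x, ξ; η) ]) ], and the server averages the local models. The toy oracles `inner_g` and `grad_f`, and all constants, are hypothetical stand-ins; this is not a reproduction of the paper's Algorithm 1 or 2.

```python
# Hedged sketch of an FCSG-style loop (not the paper's Algorithm 1):
# clients run `local_steps` biased conditional stochastic gradient steps,
# then the server averages.  Oracles and constants are toy placeholders.
import numpy as np

rng = np.random.default_rng(0)
dim, num_clients, rounds, local_steps, lr, inner_batch = 5, 16, 20, 50, 0.01, 8


def inner_g(x, xi, eta):
    # hypothetical inner mapping g(x, xi; eta): a noisy elementwise-scaled copy of x
    return x * xi + eta


def grad_f(u):
    # gradient of the hypothetical outer function f(u) = 0.5 * ||u||^2
    return u


def client_update(x):
    x = x.copy()
    for _ in range(local_steps):
        xi = rng.normal(size=dim)                             # outer sample xi
        eta = rng.normal(scale=0.1, size=(inner_batch, dim))  # inner samples eta | xi
        g_bar = inner_g(x, xi, eta).mean(axis=0)              # mini-batch estimate of E[g | xi]
        grad = xi * grad_f(g_bar)                             # (dg/dx)^T grad_f(g_bar), biased estimator
        x -= lr * grad
    return x


x_server = np.zeros(dim)
for _ in range(rounds):
    # each client starts from the current server model and runs local steps
    local_models = [client_update(x_server) for _ in range(num_clients)]
    x_server = np.mean(local_models, axis=0)                  # server averaging
print("final server model:", x_server)
```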
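The Dataset Splits row describes distributing meta-learning tasks evenly and disjointly among 16 clients. The snippet below is one plausible way to do that random partition; `num_tasks` is a placeholder, and the 1028/172/423 character split itself is handled by Torchmeta and not re-implemented here.

```python
# Hedged sketch of the task-to-client assignment from the Dataset Splits row:
# task indices are shuffled once and split into 16 disjoint, nearly equal shards.
import numpy as np

num_tasks, num_clients = 1024, 16          # num_tasks is a placeholder value
rng = np.random.default_rng(42)

task_ids = rng.permutation(num_tasks)                   # random ordering of task indices
client_shards = np.array_split(task_ids, num_clients)   # disjoint, even partition

for cid, shard in enumerate(client_shards):
    print(f"client {cid:2d}: {len(shard)} tasks, e.g. {shard[:3].tolist()}")
```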
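The Experiment Setup row lists the exact grids searched for the learning rate and the FCSG-M momentum coefficient β. A straightforward way to enumerate those combinations is sketched below; the `evaluate` function is a hypothetical placeholder standing in for a full training run.

```python
# Enumerating the hyperparameter grid quoted in the Experiment Setup row.
# `evaluate` is a hypothetical placeholder, not code from the paper's repository.
from itertools import product

learning_rates = [0.01, 0.005, 0.001]      # learning-rate grid from the paper
betas = [0.001, 0.01, 0.1, 0.5, 0.9]       # momentum grid for FCSG-M
lam, alpha, local_steps = 0.001, 10, 50    # fixed values reported in the paper


def evaluate(lr, beta):
    # placeholder: run FCSG-M with these settings and return a validation metric
    return 0.0


best = max(
    ((lr, beta) for lr, beta in product(learning_rates, betas)),
    key=lambda cfg: evaluate(*cfg),
)
print("best (lr, beta):", best)
```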