Achieving Linear Speedup in Non-IID Federated Bilevel Learning

Authors: Minhui Huang, Dewei Zhang, Kaiyi Ji

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments validate our theoretical results and demonstrate the effectiveness of our proposed method.
Researcher Affiliation | Academia | 1 Department of Electrical and Computer Engineering, University of California, Davis, USA; 2 Department of Industrial and Systems Engineering, Ohio State University, Columbus, USA; 3 Department of Computer Science and Engineering, University at Buffalo, New York, USA.
Pseudocode | Yes | Algorithm 1 Heterogeneous Distributed Minibatch Bilevel Optimization with Partial Clients Participation
Open Source Code | No | The paper does not include an unambiguous statement about releasing code for the work described, nor does it provide a direct link to a source-code repository.
Open Datasets | Yes | We fix the client sampling ratio to 10%, and the number of clients to be 100 and sample the dataset in a digit-based manner. In particular, the whole MNIST dataset is split into 10 subsets, where each subset contains all images with the same digit.
Dataset Splits | No | The paper mentions using the MNIST dataset and describes a digit-based sampling strategy for clients to create heterogeneous data. However, it does not provide specific percentages or counts for training, validation, or test splits, nor does it detail a cross-validation setup.
Hardware Specification | Yes | All experiments are implemented in Python 3.7 on a Linux server with an Nvidia GeForce RTX 2080 Ti GPU.
Software Dependencies | No | The paper mentions 'Python 3.7' but does not list any specific software libraries, frameworks (like PyTorch or TensorFlow), or solvers with their version numbers, which are necessary for reproducible software dependencies.
Experiment Setup | Yes | In all experiments, we use a multi-layer perceptron (MLP) with 2 linear layers and 1 ReLU activation layer as our model architecture and focus on the heterogeneous case with non-i.i.d. datasets.
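
The setup quoted above (100 clients, a 10% client sampling ratio, digit-based subsets, and an MLP with 2 linear layers and 1 ReLU) can be illustrated with a minimal NumPy sketch. This is not the authors' code, which the summary notes was not released. Several details are assumptions made only for illustration: that each digit's subset is divided evenly among 10 clients, that clients are sampled uniformly without replacement, the hidden width of 32, and the use of synthetic labels in place of MNIST. The function names `digit_based_partition`, `sample_clients`, and `mlp_forward` are hypothetical.

```python
import numpy as np


def digit_based_partition(labels, num_clients=100, num_digits=10):
    """Assign sample indices to clients so that each client holds one digit.

    Assumption (not stated in the source): each digit's subset is divided
    evenly among num_clients // num_digits clients.
    """
    clients_per_digit = num_clients // num_digits
    partition = []
    for digit in range(num_digits):
        idx = np.flatnonzero(labels == digit)
        partition.extend(np.array_split(idx, clients_per_digit))
    return partition


def sample_clients(num_clients, ratio, rng):
    """Uniformly sample a fraction of clients without replacement (assumed scheme)."""
    k = max(1, int(round(ratio * num_clients)))
    return rng.choice(num_clients, size=k, replace=False)


def mlp_forward(x, w1, b1, w2, b2):
    """Forward pass of an MLP with two linear layers and one ReLU in between."""
    hidden = np.maximum(x @ w1 + b1, 0.0)
    return hidden @ w2 + b2


rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=6000)  # synthetic stand-in for MNIST labels
partition = digit_based_partition(labels)
active = sample_clients(len(partition), ratio=0.10, rng=rng)

# Hypothetical layer sizes: 784 inputs (28x28 image), 32 hidden units, 10 classes.
w1, b1 = rng.normal(size=(784, 32)) * 0.01, np.zeros(32)
w2, b2 = rng.normal(size=(32, 10)) * 0.01, np.zeros(10)
logits = mlp_forward(rng.normal(size=(4, 784)), w1, b1, w2, b2)
```

Because every client holds images of a single digit, the local label distributions are disjoint across digit groups, which is one simple way to obtain the non-i.i.d. setting the summary describes.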