Achieving Linear Speedup in Non-IID Federated Bilevel Learning
Authors: Minhui Huang, Dewei Zhang, Kaiyi Ji
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments validate our theoretical results and demonstrate the effectiveness of our proposed method. |
| Researcher Affiliation | Academia | (1) Department of Electrical and Computer Engineering, University of California, Davis, USA; (2) Department of Industrial and Systems Engineering, Ohio State University, Columbus, USA; (3) Department of Computer Science and Engineering, University at Buffalo, New York, USA. |
| Pseudocode | Yes | Algorithm 1 Heterogeneous Distributed Minibatch Bilevel Optimization with Partial Clients Participation |
| Open Source Code | No | The paper does not include an unambiguous statement about releasing code for the work described, nor does it provide a direct link to a source-code repository. |
| Open Datasets | Yes | We fix the client sampling ratio to 10%, and the number of clients to be 100 and sample the dataset in a digit-based manner. In particular, the whole MNIST dataset is split into 10 subsets, where each subset contains all images with the same digit. (A partitioning sketch follows the table.) |
| Dataset Splits | No | The paper mentions using the MNIST dataset and describes a digit-based sampling strategy for clients to create heterogeneous data. However, it does not provide specific percentages or counts for training, validation, or test splits, nor does it detail a cross-validation setup. |
| Hardware Specification | Yes | All experiments are implemented in Python 3.7 on a Linux server with an NVIDIA GeForce RTX 2080 Ti GPU. |
| Software Dependencies | No | The paper mentions 'Python 3.7' but does not list any specific software libraries, frameworks (like PyTorch or TensorFlow), or solvers with their version numbers, which are necessary for reproducible software dependencies. |
| Experiment Setup | Yes | In all experiments, we use a multi-layer perceptron (MLP) with 2 linear layers and 1 ReLU activation layer as our model architecture and focus on the heterogeneous case with non-i.i.d. datasets. (A model sketch follows the table.) |
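
The digit-based, non-i.i.d. partition described in the Open Datasets row can be reconstructed roughly as follows. This is a minimal sketch rather than the authors' released code: the helper names (`digit_based_split`, `sample_clients`), the equal-shard assignment of each digit's images to 10 clients, and uniform client sampling are assumptions. The paper only states that MNIST is split into 10 digit subsets, that there are 100 clients, and that 10% of clients participate.

```python
import numpy as np

def digit_based_split(labels, num_clients=100, num_digits=10, seed=0):
    """Hypothetical digit-based non-i.i.d. partition: group MNIST indices by
    digit, then give each client a shard drawn from a single digit subset."""
    rng = np.random.default_rng(seed)
    clients_per_digit = num_clients // num_digits          # 100 / 10 = 10
    client_shards = []
    for d in range(num_digits):
        idx = np.flatnonzero(labels == d)                  # all images of digit d
        rng.shuffle(idx)
        # assumption: each digit subset is cut into equal shards, one per client
        client_shards.extend(np.array_split(idx, clients_per_digit))
    return client_shards                                   # list of 100 index arrays

def sample_clients(num_clients=100, ratio=0.10, rng=None):
    """Sample the 10% of clients that participate in a given round."""
    if rng is None:
        rng = np.random.default_rng()
    k = max(1, int(round(ratio * num_clients)))
    return rng.choice(num_clients, size=k, replace=False)
```

Here `labels` would be the MNIST training labels as a NumPy array; each communication round would then train only on the shards of the sampled clients.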
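
For the Experiment Setup row, the two-linear-layer MLP with a single ReLU activation can be sketched as below. The paper does not name a deep-learning framework (see the Software Dependencies row), so the use of PyTorch and the hidden width of 200 are assumptions introduced here for illustration.

```python
import torch.nn as nn

# Minimal sketch of the described architecture; hidden width is an assumption.
mlp = nn.Sequential(
    nn.Flatten(),             # 28x28 MNIST image -> 784-dim vector
    nn.Linear(28 * 28, 200),  # linear layer 1 (hidden size assumed)
    nn.ReLU(),                # the single ReLU activation layer
    nn.Linear(200, 10),       # linear layer 2 -> 10 digit classes
)
```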