Distributed Bilevel Optimization with Communication Compression
Authors: Yutong He, Jie Hu, Xinmeng Huang, Songtao Lu, Bin Wang, Kun Yuan
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments show that our compressed bilevel algorithms can achieve a 10× reduction in communication overhead without severe performance degradation. ... Our numerical experiments demonstrate that the proposed algorithms can achieve a 10× reduction in communicated bits, compared to non-compressed distributed bilevel algorithms, see Fig. 1 and Sec. 7. (A generic compressor sketch illustrating such a reduction appears after this table.) |
| Researcher Affiliation | Collaboration | 1 Peking University, 2 University of Pennsylvania, 3 IBM Research, 4 Zhejiang University, 5 National Engineering Laboratory for Big Data Analytics and Applications, 6 AI for Science Institute, Beijing, China. |
| Pseudocode | Yes | Algorithm 1 C-SOBA and CM-SOBA ... Algorithm 2 EF-SOBA ... Algorithm 3 MSC Module ... Algorithm 4 CM-SOBA-MSC Algorithm ... Algorithm 5 EF-SOBA-MSC Algorithm |
| Open Source Code | No | The paper does not provide a concrete link or explicit statement about the availability of open-source code for the methodology described. |
| Open Datasets | Yes | We conduct the experiments on the MNIST dataset with an MLP and the CIFAR-10 dataset with a CNN. ... For MNIST, we use a 2-layer multilayer perceptron (MLP)... For CIFAR-10, we train the 7-layer LeNet. (A hedged sketch of these architectures appears after this table.) |
| Dataset Splits | No | The paper refers to "validation data" in the problem formulation (e.g., "optimizes the intermediate representation parameter to obtain better feature representation on validation data") but does not provide specific split information (percentages, sample counts, or explicit methodology) for the validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | The batch size of the workers' stochastic oracle is 512 for MNIST and 1000 for CIFAR-10. The moving average parameter θ of CM-SOBA and EF-SOBA is 0.1. We optimize the stepsizes for all compared algorithms via grid search, each ranging over [0.001, 0.05, ..., 0.5], which is summarized in Table 2. |
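
Since the paper releases no code (see the Open Source Code row), the snippet below is a minimal sketch of architectures consistent with the quoted descriptions: a 2-layer MLP for MNIST and a LeNet-style CNN for CIFAR-10. The MLP hidden width and the CNN channel/width choices are illustrative assumptions, not values reported in the paper.

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    """2-layer MLP for MNIST; the hidden width (200) is an assumption."""
    def __init__(self, hidden: int = 200, num_classes: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, x):
        return self.net(x)

class LeNet(nn.Module):
    """LeNet-style CNN for CIFAR-10; channel and layer widths are assumptions."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 6, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120), nn.ReLU(),
            nn.Linear(120, 84), nn.ReLU(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Quick shape check with the batch sizes quoted in the Experiment Setup row.
assert MLP()(torch.randn(512, 1, 28, 28)).shape == (512, 10)
assert LeNet()(torch.randn(1000, 3, 32, 32)).shape == (1000, 10)
```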
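
The 10× savings in communicated bits quoted in the Research Type row comes from compressing the vectors exchanged between workers and the server. As a generic illustration only (not necessarily the compressor analyzed in the paper), an unbiased random-k sparsifier keeps k of d coordinates and rescales them by d/k; choosing k ≈ d/10 corresponds to roughly a 10× reduction in transmitted entries per round, ignoring index overhead.

```python
import torch

def rand_k(x: torch.Tensor, k: int) -> torch.Tensor:
    """Unbiased random-k sparsification: keep k random coordinates, scale by d/k.

    A standard communication-compression operator, shown here as a generic
    example; the paper's framework covers general (possibly different) compressors.
    """
    d = x.numel()
    keep = torch.randperm(d)[:k]          # coordinates to transmit
    out = torch.zeros_like(x.view(-1))
    out[keep] = x.view(-1)[keep] * (d / k)  # rescale so E[rand_k(x)] = x
    return out.view_as(x)

# Keeping k = d / 10 coordinates sends about one tenth of the entries per round.
g = torch.randn(1000)
msg = rand_k(g, k=100)
print((msg != 0).sum().item())  # 100 nonzero entries out of 1000
```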