Distributed Bilevel Optimization with Communication Compression

Authors: Yutong He, Jie Hu, Xinmeng Huang, Songtao Lu, Bin Wang, Kun Yuan

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments show that our compressed bilevel algorithms can achieve 10× reduction in communication overhead without severe performance degradation. ... Our numerical experiments demonstrate that the proposed algorithms can achieve 10× reduction in communicated bits, compared to non-compressed distributed bilevel algorithms, see Fig. 1 and Sec. 7. (A generic compressor sketch illustrating this kind of bit reduction appears after this table.)
Researcher Affiliation | Collaboration | 1Peking University, 2University of Pennsylvania, 3IBM Research, 4Zhejiang University, 5National Engineering Laboratory for Big Data Analytics and Applications, 6AI for Science Institute, Beijing, China.
Pseudocode | Yes | Algorithm 1 C-SOBA and CM-SOBA ... Algorithm 2 EF-SOBA ... Algorithm 3 MSC Module ... Algorithm 4 CM-SOBA-MSC Algorithm ... Algorithm 5 EF-SOBA-MSC Algorithm
Open Source Code | No | The paper does not provide a concrete link or explicit statement about the availability of open-source code for the methodology described.
Open Datasets | Yes | We conduct the experiments on MNIST dataset with MLP and CIFAR-10 dataset with CNN. ... For MNIST, we use a 2-layer multilayer perceptron (MLP)... For CIFAR-10, we train the 7-layer LeNet. (A hedged model sketch appears after this table.)
Dataset Splits | No | The paper refers to "validation data" in the problem formulation (e.g., "optimizes the intermediate representation parameter to obtain better feature representation on validation data") but does not provide specific split information (percentages, sample counts, or explicit methodology) for the validation set.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments.
Software Dependencies | No | The paper does not name the specific software libraries or solvers, with version numbers, needed to replicate the experiments.
Experiment Setup | Yes | The batch size of the workers' stochastic oracle is 512 for MNIST and 1000 for CIFAR-10. The moving average parameter θ of CM-SOBA and EF-SOBA is 0.1. We optimize the stepsizes for all compared algorithms via grid search, each ranging over [0.001, 0.05, ..., 0.5], which is summarized in Table 2. (A setup sketch appears after this table.)
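
The 10× figure in the Research Type row refers to fewer communicated bits exchanged between workers and the server, obtained by compressing the transmitted vectors. The report does not state which compression operator the experiments use, so the snippet below is only a generic top-k sparsifier, a standard example of such an operator, and not the paper's specific choice.

```python
# Generic top-k sparsifier: one common communication compressor.
# Illustrative only; the compressor actually used in the paper's
# experiments is not specified in this report.
import torch


def topk_compress(x: torch.Tensor, ratio: float = 0.1):
    """Keep the `ratio` fraction of largest-magnitude entries of `x`.

    Sending only the surviving (index, value) pairs is what cuts the
    number of communicated bits, roughly by 1/ratio (ignoring index
    overhead).
    """
    flat = x.flatten()
    k = max(1, int(ratio * flat.numel()))
    _, indices = torch.topk(flat.abs(), k)
    compressed = torch.zeros_like(flat)
    compressed[indices] = flat[indices]
    return compressed.view_as(x), indices


g = torch.randn(10_000)                    # e.g. a stochastic gradient estimate
g_hat, idx = topk_compress(g, ratio=0.1)   # ~10x fewer nonzero entries to transmit
```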
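
The Open Datasets row fixes the model families (a 2-layer MLP on MNIST, a 7-layer LeNet on CIFAR-10) but not the layer sizes. The sketch below is a minimal PyTorch reconstruction under that description; the MLP hidden width and the LeNet channel/unit counts follow the classic LeNet-5 layout and are assumptions, not values reported in the paper.

```python
import torch
import torch.nn as nn


class TwoLayerMLP(nn.Module):
    """2-layer MLP for MNIST (28x28 grayscale, 10 classes)."""

    def __init__(self, hidden: int = 256):  # hidden width is an assumption
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 10),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class LeNet(nn.Module):
    """LeNet-style CNN for CIFAR-10 (3x32x32, 10 classes); counting the
    conv, pooling, and fully connected stages gives 7 layers."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 6, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120), nn.ReLU(),
            nn.Linear(120, 84), nn.ReLU(),
            nn.Linear(84, 10),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))
```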
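
The Experiment Setup row gives the per-dataset batch sizes, the moving-average parameter θ = 0.1, and a step-size grid search whose full grid lives in the paper's Table 2. The snippet below only organizes those reported values into a grid-search skeleton: `train_once` is a hypothetical stand-in for one distributed bilevel run, the split into upper/lower step sizes is an assumption, and the elided grid points ("...") are left out rather than guessed.

```python
from itertools import product

BATCH_SIZE = {"mnist": 512, "cifar10": 1000}  # per-worker stochastic-oracle batch size
THETA = 0.1                                   # moving-average parameter of CM-SOBA / EF-SOBA
# Only 0.001, 0.05, and 0.5 are stated explicitly; the remaining grid
# points are elided in the report and omitted here.
STEPSIZE_GRID = [0.001, 0.05, 0.5]


def train_once(dataset: str, upper_lr: float, lower_lr: float) -> float:
    """Hypothetical stand-in for one training run; returns a validation metric."""
    raise NotImplementedError  # replace with an actual C-SOBA / CM-SOBA / EF-SOBA run


def grid_search(dataset: str):
    """Pick the step-size pair with the best validation metric."""
    best = None
    for upper_lr, lower_lr in product(STEPSIZE_GRID, STEPSIZE_GRID):
        score = train_once(dataset, upper_lr, lower_lr)
        if best is None or score > best[0]:
            best = (score, upper_lr, lower_lr)
    return best
```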