SPARKLE: A Unified Single-Loop Primal-Dual Framework for Decentralized Bilevel Optimization
Authors: Shuchen Zhu, Boao Kong, Songtao Lu, Xinmeng Huang, Kun Yuan
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present experiments to validate our theoretical findings. We first explore how update strategies and network structures influence the convergence of SPARKLE, and then compare SPARKLE with existing decentralized SBO algorithms. Additional experiments on a decentralized SBO problem with synthetic data appear in Appendix D.1. |
| Researcher Affiliation | Collaboration | Shuchen Zhu (Peking University, shuchenzhu@stu.pku.edu.cn); Boao Kong (Peking University, kongboao@stu.pku.edu.cn); Songtao Lu (IBM Research, songtao@ibm.com); Xinmeng Huang (University of Pennsylvania, xinmengh@sas.upenn.edu); Kun Yuan (Peking University, kunyuan@pku.edu.cn) |
| Pseudocode | Yes | Algorithm 1 SPARKLE: A unified framework for decentralized stochastic bilevel optimization (an illustrative single-loop skeleton follows the table) |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology. |
| Open Datasets | Yes | The Fashion-MNIST dataset consists of 60,000 training images and 10,000 test images; we randomly split the training images into a 50,000-image training set and a 10,000-image validation set. |
| Dataset Splits | Yes | The Fashion-MNIST dataset consists of 60,000 training images and 10,000 test images; we randomly split the training images into a 50,000-image training set and a 10,000-image validation set (a split sketch follows the table). |
| Hardware Specification | Yes | All experiments described in this section were run on an NVIDIA A100 server. |
| Software Dependencies | No | The paper mentions model architectures (e.g., a two-layer MLP network, a four-layer CNN) but does not specify software packages with version numbers (e.g., Python 3.x, TensorFlow 2.x, PyTorch 1.x). |
| Experiment Setup | Yes | The batch size is set to 50. The step-sizes for all the algorithms are set to $\alpha_k = \beta_k = \gamma_k = 0.03$, the term $\eta$ in MDBO is set to 0.5, and the moving-average term $\theta_k = 0.2$ (a hedged configuration sketch follows the table). |
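For the Pseudocode row: the paper's Algorithm 1 is not reproduced here, so the sketch below is a generic, illustrative single-loop decentralized SBO iteration, not SPARKLE itself. The variable names (`X`, `Y`, `H`), the gossip-then-descend ordering, and the mixing matrix `W` are assumptions about what such a framework typically looks like; SPARKLE's primal-dual corrections and unified update strategies are omitted.

```python
import numpy as np

def decentralized_sbo_step(X, Y, H, W, grad_f, grad_g,
                           alpha=0.03, beta=0.03, theta=0.2):
    """One illustrative single-loop iteration of decentralized
    stochastic bilevel optimization across n nodes (NOT Algorithm 1).

    X, Y : (n, d) arrays stacking per-node upper/lower variables.
    H    : (n, d) moving-average hypergradient estimate.
    W    : (n, n) doubly stochastic mixing matrix of the network.
    grad_f, grad_g : callables returning per-node stochastic
        gradients of the upper- and lower-level objectives.
    """
    # Moving-average estimate of the upper-level (hyper)gradient.
    H = (1.0 - theta) * H + theta * grad_f(X, Y)
    # Gossip with neighbors, then descend at each level.
    X = W @ X - alpha * H
    Y = W @ Y - beta * grad_g(X, Y)
    return X, Y, H

# Example wiring with random data (purely illustrative):
n, d = 4, 3
rng = np.random.default_rng(0)
W = np.full((n, n), 1.0 / n)        # fully connected averaging
X = rng.standard_normal((n, d))
Y = rng.standard_normal((n, d))
H = np.zeros((n, d))
gf = lambda X, Y: X + 0.1 * Y       # stand-in gradient oracles
gg = lambda X, Y: Y - X
X, Y, H = decentralized_sbo_step(X, Y, H, W, gf, gg)
```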
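For the Open Datasets / Dataset Splits rows: a minimal sketch of the reported 50,000/10,000 split of Fashion-MNIST's 60,000 training images, assuming a PyTorch/torchvision stack (the paper does not state its software dependencies); the seed and data path are placeholders.

```python
import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

# Fashion-MNIST ships with 60,000 training and 10,000 test images.
train_full = datasets.FashionMNIST(
    root="data", train=True, download=True,
    transform=transforms.ToTensor(),
)
test_set = datasets.FashionMNIST(
    root="data", train=False, download=True,
    transform=transforms.ToTensor(),
)

# Randomly split the 60,000 training images into 50,000 for training
# and 10,000 for validation, as described in the paper.
generator = torch.Generator().manual_seed(0)  # seed is an assumption
train_set, val_set = random_split(train_full, [50_000, 10_000],
                                  generator=generator)

# Batch size 50 matches the Experiment Setup row.
train_loader = DataLoader(train_set, batch_size=50, shuffle=True)
val_loader = DataLoader(val_set, batch_size=50)
```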
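For the Experiment Setup row: the reported hyperparameters gathered into a single configuration block. The dictionary keys are illustrative names of my choosing, and `eta_mdbo` applies only to the MDBO baseline, not to SPARKLE.

```python
# Hyperparameters reported in the paper's experiments (names are
# illustrative; values are the ones quoted in the table above).
config = {
    "batch_size": 50,   # minibatch size
    "alpha": 0.03,      # step-size alpha_k
    "beta": 0.03,       # step-size beta_k
    "gamma": 0.03,      # step-size gamma_k
    "eta_mdbo": 0.5,    # eta term of the MDBO baseline only
    "theta": 0.2,       # moving-average term theta_k
}
```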