SPARKLE: A Unified Single-Loop Primal-Dual Framework for Decentralized Bilevel Optimization

Authors: Shuchen Zhu, Boao Kong, Songtao Lu, Xinmeng Huang, Kun Yuan

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we present experiments to validate our theoretical findings. We first explore how update strategies and network structures influence the convergence of SPARKLE. Then we compare SPARKLE to existing decentralized SBO algorithms. Additional experiments on a decentralized SBO problem with synthetic data appear in Appendix D.1."
Researcher Affiliation | Collaboration | Shuchen Zhu (Peking University, shuchenzhu@stu.pku.edu.cn); Boao Kong (Peking University, kongboao@stu.pku.edu.cn); Songtao Lu (IBM Research, songtao@ibm.com); Xinmeng Huang (University of Pennsylvania, xinmengh@sas.upenn.edu); Kun Yuan (Peking University, kunyuan@pku.edu.cn)
Pseudocode | Yes | Algorithm 1, "SPARKLE: A unified framework for decentralized stochastic bilevel optimization" (a hedged sketch of one iteration follows the table below).
Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology.
Open Datasets | Yes | "The Fashion MNIST dataset consists of 60000 images for training and 10000 images for testing"; the 60000 training images are then randomly split into a 50000-image training set and a 10000-image validation set.
Dataset Splits | Yes | Same evidence as above: the 60000 training images are randomly split 50000/10000 into training and validation sets (a hedged split sketch follows the table).
Hardware Specification | Yes | "All experiments described in this section were run on an NVIDIA A100 server."
Software Dependencies | No | The paper describes model architectures (a two-layer MLP and a four-layer CNN; a hedged MLP sketch follows the table) but does not specify software packages with version numbers (e.g., Python 3.x, TensorFlow 2.x, PyTorch 1.x).
Experiment Setup | Yes | "The batch size is set to 50. The step-sizes for all the algorithms are set to α_k = β_k = γ_k = 0.03 and the term η in MDBO is set to 0.5. The moving-average term θ_k = 0.2." (A hedged moving-average sketch follows the table.)
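
For orientation, below is a minimal, hedged sketch of one iteration of a single-loop decentralized bilevel update in the spirit of Algorithm 1. The variable names, the gossip matrix W, and the exact update order are illustrative assumptions, not the paper's pseudocode; SPARKLE additionally allows mixing different heterogeneity-correction strategies (e.g., gradient tracking), which this sketch omits.

```python
# Hedged sketch of one round of a single-loop decentralized bilevel update
# in the spirit of Algorithm 1 (SPARKLE). Names and update order are
# illustrative assumptions, not the paper's exact pseudocode.
import numpy as np

def sparkle_style_step(W, x, y, z, oracles, alpha=0.03, beta=0.03, gamma=0.03):
    """One synchronous round over n nodes.

    W        : (n, n) doubly-stochastic gossip matrix.
    x, y, z  : (n, d) stacks of per-node upper, lower, and auxiliary variables.
    oracles  : per-node callables returning stochastic estimates of
               (grad_x f_i, grad_y f_i, grad_y g_i,
                hess_yy g_i @ z_i, hess_xy g_i @ z_i).
    """
    n = W.shape[0]
    x_new, y_new, z_new = W @ x, W @ y, W @ z   # gossip averaging with neighbors
    for i in range(n):
        gx_f, gy_f, gy_g, hyy_z, hxy_z = oracles[i](x[i], y[i], z[i])
        y_new[i] -= beta * gy_g                 # lower-level descent step
        z_new[i] -= gamma * (hyy_z - gy_f)      # auxiliary step approximating the
                                                # Hessian-inverse-vector product
        x_new[i] -= alpha * (gx_f - hxy_z)      # upper-level hypergradient step
    return x_new, y_new, z_new
```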
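The 50000/10000 train/validation split of the Fashion-MNIST training set can be reproduced along the following lines. This is a minimal sketch assuming PyTorch/torchvision, which the paper does not explicitly name; the random seed is likewise an assumption.

```python
# Minimal sketch of the described Fashion-MNIST split, assuming torchvision
# (the paper does not name its software stack).
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

full_train = datasets.FashionMNIST(
    root="./data", train=True, download=True,
    transform=transforms.ToTensor())             # 60000 training images
test_set = datasets.FashionMNIST(
    root="./data", train=False, download=True,
    transform=transforms.ToTensor())             # 10000 test images

# Randomly split the 60000 training images into 50000 train / 10000 validation.
generator = torch.Generator().manual_seed(0)     # seed is an assumption
train_set, val_set = random_split(full_train, [50000, 10000], generator=generator)
```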
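For concreteness, a two-layer MLP of the kind mentioned under Software Dependencies might look as follows in PyTorch; the hidden width and activation are assumptions, since the paper reports the architecture class but not these code-level details.

```python
# Hedged sketch of a two-layer MLP for 28x28 Fashion-MNIST inputs.
# The hidden width (200) and ReLU activation are assumptions.
import torch.nn as nn

class TwoLayerMLP(nn.Module):
    def __init__(self, in_dim=28 * 28, hidden=200, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                   # (B, 1, 28, 28) -> (B, 784)
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, x):
        return self.net(x)
```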
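The moving-average term θ_k = 0.2 in the experiment setup suggests an exponential moving average of a stochastic estimate (e.g., of the hypergradient). A one-line sketch under that assumption:

```python
# Hedged sketch of the moving-average update with theta = 0.2.
# Interpreting theta_k as an EMA weight on the fresh stochastic estimate
# is an assumption; the paper's exact recursion may differ.
def moving_average(v_prev, v_est, theta=0.2):
    """Return v_k = (1 - theta) * v_{k-1} + theta * (fresh estimate)."""
    return (1.0 - theta) * v_prev + theta * v_est
```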