Decentralized Stochastic Bilevel Optimization with Improved per-Iteration Complexity

Authors: Xuxing Chen, Minhui Huang, Shiqian Ma, Krishna Balasubramanian

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct experiments on several machine learning problems. Our numerical results show the efficiency of our algorithm in both the synthetic and the real-world problems."
Researcher Affiliation | Academia | 1. Department of Mathematics, University of California, Davis, USA; 2. Department of Electrical and Computer Engineering, University of California, Davis, USA; 3. Department of Computational Applied Mathematics and Operations Research, Rice University, Houston, USA; 4. Department of Statistics, University of California, Davis, USA.
Pseudocode | Yes | The paper gives Algorithm 1 (Hessian-Inverse-Gradient Product oracle), Algorithm 2 (Hypergradient Estimation), and Algorithm 3 (MA-DSBO). An illustrative sketch of the oracle appears after this table.
Open Source Code | No | The paper provides neither a direct link to open-source code for the described methodology nor an explicit statement that such code is released in supplementary material or an external repository.
Open Datasets | Yes | "Now we consider hyperparameter optimization on MNIST dataset (LeCun et al., 1998)."
Dataset Splits | No | The paper mentions a "training and validation set" but does not specify the split sizes (percentages or sample counts) or how they were derived.
Hardware Specification | No | The paper states that "All the experiments are performed on a local device with 8 cores (n = 8)" but does not specify the CPU model, GPU model, or other detailed hardware specifications.
Software Dependencies | Yes | "All the experiments are performed on a local device with 8 cores (n = 8) using mpi4py (Dalcin & Fang, 2021) for parallel computing and PyTorch (Paszke et al., 2019) for computing stochastic oracles." A sketch of this communication pattern appears after this table.
Experiment Setup | Yes | "We include the numerical results of different stepsize choices in Figure 2." Also: "Note that in previous algorithms (Chen et al., 2022b; Yang et al., 2022) one Hessian matrix of the lower level function requires O(c^2 p^2) storage, while in our algorithm a Hessian-vector product only requires O(cp) storage, which improves both the space and the communication complexity." The storage gap is illustrated after this table.
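The Hessian-Inverse-Gradient Product oracle (Algorithm 1) is not reproduced verbatim in this summary. The sketch below shows the standard truncated-Neumann-series construction that such oracles are typically built on, assuming a strongly convex lower-level objective so the series converges for a small enough step size; the name hig_product and the parameters eta and K are illustrative choices, not the paper's notation.

    import torch

    def hig_product(loss_fn, y, b, eta=0.1, K=20):
        """Estimate H^{-1} b, where H is the Hessian of loss_fn at y, via the
        truncated Neumann series eta * sum_{k=0}^{K} (I - eta*H)^k b.
        Only Hessian-vector products are formed; H itself never is."""
        y = y.detach().clone().requires_grad_(True)
        # First-order gradient built with create_graph=True so it can be
        # differentiated again to obtain Hessian-vector products.
        (g,) = torch.autograd.grad(loss_fn(y), y, create_graph=True)
        v = b.clone()  # current term (I - eta*H)^k b
        s = b.clone()  # partial sum of the series (k = 0 term included)
        for _ in range(K):
            (Hv,) = torch.autograd.grad(g, y, grad_outputs=v, retain_graph=True)
            v = v - eta * Hv  # v <- (I - eta*H) v
            s = s + v
        return eta * s

On a quadratic test problem the output can be checked against the exact inverse: with loss_fn(y) = 0.5 * y @ A @ y and A = 2 * torch.eye(5), the call hig_product(loss_fn, torch.zeros(5), b, eta=0.4, K=50) returns approximately 0.5 * b = A^{-1} b.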
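The O(c^2 p^2)-versus-O(cp) storage claim quoted under Experiment Setup can be made concrete with a small PyTorch comparison; the dimension d below is a hypothetical stand-in for the product of the paper's c and p.

    import torch

    d = 500  # hypothetical dimension; at realistic scale a d x d Hessian is prohibitive

    def loss(y):
        return 0.5 * (y ** 2).sum() + torch.sin(y).sum()

    y = torch.randn(d)
    v = torch.randn(d)

    # Materializing the full Hessian stores d * d floats (the O(c^2 p^2) cost)
    # and is what a worker would otherwise have to hold and communicate.
    H = torch.autograd.functional.hessian(loss, y)  # shape (d, d)

    # A Hessian-vector product stores only d floats (the O(cp) cost).
    y = y.requires_grad_(True)
    (g,) = torch.autograd.grad(loss(y), y, create_graph=True)
    (Hv,) = torch.autograd.grad(g, y, grad_outputs=v)

    assert torch.allclose(H @ v, Hv, atol=1e-4)  # same product, far less memory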
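The Software Dependencies row notes that the experiments pair PyTorch with mpi4py on 8 cores. Below is a minimal sketch of that communication pattern, assuming a fully connected topology in which the gossip (mixing) step reduces to a global average; with a general mixing matrix W, each worker would instead combine only its neighbors' iterates. The filename demo.py is hypothetical.

    # Launch with: mpiexec -n 8 python demo.py
    from mpi4py import MPI
    import numpy as np
    import torch

    comm = MPI.COMM_WORLD
    n = comm.Get_size()

    # Each worker holds a local iterate (e.g., a stochastic gradient estimate).
    local = torch.randn(100)

    # Gossip step x_i <- sum_j w_ij x_j with uniform weights w_ij = 1/n,
    # implemented as a buffered in-network sum followed by rescaling.
    send = local.numpy()
    recv = np.empty_like(send)
    comm.Allreduce(send, recv, op=MPI.SUM)
    local = torch.from_numpy(recv / n)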