On the Convergence Theory for Hessian-Free Bilevel Algorithms
Authors: Daouda Sow, Kaiyi Ji, Yingbin Liang
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimentally, we demonstrate that the proposed algorithms outperform baseline bilevel optimizers on various bilevel problems. Particularly, in our experiment on few-shot meta-learning with a ResNet-12 network over the miniImageNet dataset, we show that our algorithm outperforms baseline meta-learning algorithms, while other baseline bilevel optimizers do not solve such meta-learning problems within a comparable time frame. We validate our algorithms in four bilevel problems: shallow hyper-representation (HR) with a linear/2-layer net embedding model on synthetic data, deep HR with a LeNet network [32] on the MNIST dataset, few-shot meta-learning with ResNet-12 on the miniImageNet dataset, and hyperparameter optimization (HO) on the 20 Newsgroup dataset. |
| Researcher Affiliation | Academia | Daouda A. Sow, Department of ECE, The Ohio State University (sow.53@osu.edu); Kaiyi Ji, Department of CSE, University at Buffalo (kaiyiji@buffalo.edu); Yingbin Liang, Department of ECE, The Ohio State University (liang.889@osu.edu) |
| Pseudocode | Yes | Algorithm 1 Partial Zeroth-Order-like Bilevel Optimizer (PZOBO); a sketch of its core hypergradient step appears below the table. |
| Open Source Code | No | The paper does not include an explicit statement or link to its own open-source code for the methodology described. |
| Open Datasets | Yes | few-shot meta-learning with a ResNet-12 network over the miniImageNet dataset, deep HR with a LeNet network [32] on the MNIST dataset, and hyperparameter optimization (HO) on the 20 Newsgroup dataset. |
| Dataset Splits | Yes | where $X_2 \in \mathbb{R}^{n_2 \times m}$ and $X_1 \in \mathbb{R}^{n_1 \times m}$ are matrices of synthesized training and validation data, and $Y_2 \in \mathbb{R}^{n_2}$, $Y_1 \in \mathbb{R}^{n_1}$ are the corresponding response vectors. Hyperparameter optimization (HO) is the problem of finding the set of the best hyperparameters (either representational or regularization parameters) that yield the optimal value of some criterion of model quality (usually a validation loss on unseen data). The implied train/validation split structure is rendered below the table. |
| Hardware Specification | Yes | We run all models using a single NVIDIA Tesla P100 GPU. |
| Software Dependencies | No | The paper does not explicitly list specific software dependencies with version numbers (e.g., Python version, PyTorch/TensorFlow versions, or specific library versions). |
| Experiment Setup | Yes | We compare our PZOBO algorithm with the baseline bilevel optimizers AID-FP, AID-CG, ITD-R, and HOZOG (see Appendix E.1 for details about the baseline algorithms and hyperparameters used). The dataset and hyperparameter details can be found in Appendix E.4. |
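
The Pseudocode row above points to Algorithm 1 (PZOBO). Below is a minimal NumPy sketch of the core Hessian-free step as we read it: the response Jacobian $\partial y^*(x)/\partial x$ is never formed; instead, finite differences of the inner-loop output under Gaussian perturbations of $x$ give a zeroth-order-like estimate of the Jacobian-vector product in the hypergradient. The callables `grad_g_y`, `grad_f_x`, `grad_f_y` and the constants `mu`, `Q`, `alpha` are illustrative placeholders, not the paper's API, and `x`, `y` are assumed to be 1-D vectors.

```python
import numpy as np

def inner_gd(x, y0, grad_g_y, steps=10, alpha=0.1):
    """Run `steps` gradient-descent steps on the inner objective g(x, .)."""
    y = y0.copy()
    for _ in range(steps):
        y -= alpha * grad_g_y(x, y)
    return y

def pzobo_hypergrad(x, y0, grad_g_y, grad_f_x, grad_f_y, mu=1e-3, Q=5, rng=None):
    """Hessian-free hypergradient estimate in the spirit of PZOBO (Algorithm 1).

    Finite differences of the inner-loop output under Q Gaussian
    perturbations of x stand in for the response Jacobian, so no
    second-order derivatives are needed.
    """
    if rng is None:
        rng = np.random.default_rng()
    y_N = inner_gd(x, y0, grad_g_y)                  # y^N(x): approximate inner solution
    v = grad_f_y(x, y_N)                             # outer gradient w.r.t. y
    jvp_term = np.zeros_like(x)
    for _ in range(Q):
        u = rng.standard_normal(x.shape)             # Gaussian perturbation direction
        y_pert = inner_gd(x + mu * u, y0, grad_g_y)  # inner solution at perturbed x
        delta = (y_pert - y_N) / mu                  # finite-difference response estimate
        jvp_term += np.dot(delta, v) * u             # <delta, grad_y f> * u
    return grad_f_x(x, y_N) + jvp_term / Q           # hypergradient estimate
```

Note that the perturbed and unperturbed inner loops start from the same initialization `y0`; as we understand the method, sharing the starting point is what makes the finite difference approximate the response Jacobian-vector product rather than noise from differing inner trajectories.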
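
The Dataset Splits row quotes the shallow HR setup, where the training data $(X_2, Y_2)$ feed the inner problem and the validation data $(X_1, Y_1)$ the outer one. A plausible rendering of that bilevel structure, assuming the standard hyper-representation form with an embedding model $T(\cdot;\lambda)$ and ridge coefficient $\rho$ (our notation, not necessarily the paper's exact formulation):

```latex
% Outer problem: validation loss of the learned head w^*(\lambda);
% inner problem: ridge-regularized training loss over the head w.
\min_{\lambda} \; \frac{1}{n_1} \bigl\| T(X_1;\lambda)\, w^*(\lambda) - Y_1 \bigr\|^2
\quad \text{s.t.} \quad
w^*(\lambda) = \arg\min_{w} \; \frac{1}{n_2} \bigl\| T(X_2;\lambda)\, w - Y_2 \bigr\|^2 + \rho \|w\|^2
```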