Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On the Convergence Theory for Hessian-Free Bilevel Algorithms

Authors: Daouda Sow, Kaiyi Ji, Yingbin Liang

NeurIPS 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, we demonstrate that the proposed algorithms outperform baseline bilevel optimizers on various bilevel problems. Particularly, in our experiment on few-shot meta-learning with a ResNet-12 network over the miniImageNet dataset, we show that our algorithm outperforms baseline meta-learning algorithms, while other baseline bilevel optimizers do not solve such meta-learning problems within a comparable time frame. [Section 4, Experiments] We validate our algorithms in four bilevel problems: shallow hyper-representation (HR) with a linear/2-layer net embedding model on synthetic data, deep HR with a LeNet network [32] on the MNIST dataset, few-shot meta-learning with ResNet-12 on the miniImageNet dataset, and hyperparameter optimization (HO) on the 20 Newsgroup dataset. (The general bilevel template these problems instantiate is sketched after the table.)
Researcher Affiliation | Academia | Daouda A. Sow, Department of ECE, The Ohio State University; Kaiyi Ji, Department of CSE, University at Buffalo; Yingbin Liang, Department of ECE, The Ohio State University
Pseudocode | Yes | Algorithm 1: Partial Zeroth-Order-like Bilevel Optimizer (PZOBO). (A hedged sketch of the Hessian-free step it implements appears after the table.)
Open Source Code | No | The paper does not include an explicit statement of, or link to, its own open-source code for the described methodology.
Open Datasets | Yes | few-shot meta-learning with a ResNet-12 network over the miniImageNet dataset; deep HR with a LeNet network [32] on the MNIST dataset; hyperparameter optimization (HO) on the 20 Newsgroup dataset.
Dataset Splits | Yes | where $X_2 \in \mathbb{R}^{n_2 \times m}$ and $X_1 \in \mathbb{R}^{n_1 \times m}$ are matrices of synthesized training and validation data, and $Y_2 \in \mathbb{R}^{n_2}$, $Y_1 \in \mathbb{R}^{n_1}$ are the corresponding response vectors; and hyperparameter optimization (HO) is the problem of finding the set of best hyperparameters (either representational or regularization parameters) that yields the optimal value of some criterion of model quality (usually a validation loss on unseen data). (The hyper-representation problem these splits define is sketched after the table.)
Hardware Specification | Yes | We run all models using a single NVIDIA Tesla P100 GPU.
Software Dependencies | No | The paper does not explicitly list software dependencies with version numbers (e.g., Python version, PyTorch/TensorFlow versions, or versions of other libraries).
Experiment Setup | Yes | We compare our PZOBO algorithm with the baseline bilevel optimizers AID-FP, AID-CG, ITD-R, and HOZOG (see Appendix E.1 for details about the baseline algorithms and hyperparameters used). The dataset and hyperparameter details can be found in Appendix E.4.
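For context on the rows above: all four experiments instantiate the standard bilevel template, with f the outer (validation) objective and g the inner (training) objective. A minimal sketch in LaTeX notation (the symbols x, y, f, g follow the usual bilevel conventions rather than the paper's exact notation):

```latex
\min_{x}\; \Phi(x) := f\bigl(x,\, y^*(x)\bigr)
\quad \text{s.t.} \quad
y^*(x) \in \operatorname*{arg\,min}_{y}\; g(x, y)
```

Exact differentiation of $y^*(x)$ requires second-order information about $g$ (Hessian- and Jacobian-vector products); the Hessian-free approach of the paper's title avoids computing these.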
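Algorithm 1 itself is not reproduced on this page. Below is a minimal NumPy sketch of the core idea a PZOBO-style step relies on: estimate the response Jacobian $\partial y^*(x)/\partial x$ by finite differences of the inner solution along random Gaussian directions, so that no Hessian-vector products are needed. The function names, the plain gradient-descent inner solver, and all hyperparameter values are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def inner_solve(x, y0, grad_g_y, steps=50, lr=0.1):
    """Approximate y*(x) with `steps` gradient-descent steps on the
    inner objective g(x, .), starting from y0 (assumed inner solver)."""
    y = y0.copy()
    for _ in range(steps):
        y = y - lr * grad_g_y(x, y)
    return y

def hypergradient_zo(x, y0, grad_g_y, grad_f_x, grad_f_y,
                     num_dirs=5, mu=1e-2, rng=None):
    """Hessian-free hypergradient estimate for f(x, y*(x)).

    For each Gaussian direction u_j, the finite difference
        delta_j = (y^K(x + mu * u_j) - y^K(x)) / mu
    approximates (dy*/dx) u_j, and averaging u_j * <delta_j, grad_y f>
    recovers (dy*/dx)^T grad_y f, because E[u u^T] = I.
    """
    rng = rng or np.random.default_rng(0)
    y = inner_solve(x, y0, grad_g_y)           # unperturbed inner solution
    gy = grad_f_y(x, y)                        # outer gradient w.r.t. y
    est = np.zeros_like(x)
    for _ in range(num_dirs):
        u = rng.standard_normal(x.shape)
        y_pert = inner_solve(x + mu * u, y0, grad_g_y)
        delta = (y_pert - y) / mu              # ~ (dy*/dx) u
        est += float(delta @ gy) * u           # ~ u * <(dy*/dx) u, grad_y f>
    return grad_f_x(x, y) + est / num_dirs
```

As a sanity check on a toy quadratic with $g(x,y)=\tfrac12\|y-x\|^2$ (so $y^*(x)=x$) and $f(x,y)=\tfrac12\|y\|^2$, the true hypergradient is $x$, and the sketch above approaches it as $\mu \to 0$ and the number of directions grows.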
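The "Dataset Splits" row quotes the shallow hyper-representation setup; under those splits the problem takes the usual HR form. A sketch, assuming an embedding $h_\lambda$ (linear or 2-layer, applied row-wise) and a ridge coefficient $\gamma$ on the inner problem, neither of which is confirmed by the excerpt above:

```latex
\min_{\lambda}\ \frac{1}{2 n_1}\bigl\| h_\lambda(X_1)\, w^*(\lambda) - Y_1 \bigr\|^2
\quad \text{s.t.} \quad
w^*(\lambda) \in \operatorname*{arg\,min}_{w}\ \frac{1}{2 n_2}\bigl\| h_\lambda(X_2)\, w - Y_2 \bigr\|^2 + \frac{\gamma}{2}\,\|w\|^2
```

Here the training pair $(X_2, Y_2)$ defines the inner least-squares fit of the head $w$, and the validation pair $(X_1, Y_1)$ defines the outer loss in the embedding parameters $\lambda$.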