Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Achieving $\mathcal{O}(\epsilon^{-1.5})$ Complexity in Hessian/Jacobian-free Stochastic Bilevel Optimization
Authors: Yifan Yang, Peiyao Xiao, Kaiyi Ji
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Theoretically, we show that FdeHBO requires $\mathcal{O}(\epsilon^{-1.5})$ iterations (each using $\mathcal{O}(1)$ samples and only first-order gradient information) to find an $\epsilon$-accurate stationary point. As far as we know, this is the first Hessian/Jacobian-free method with an $\mathcal{O}(\epsilon^{-1.5})$ sample complexity for nonconvex-strongly-convex stochastic bilevel optimization. In this section, we test the performance of the proposed FdeHBO and FMBO on two applications: hyper-representation and data hyper-cleaning, respectively. As shown in Figure 1, our FdeHBO converges much faster and more stably than PZOBO-S, F2SA and F3SA, while achieving a higher training accuracy. |
| Researcher Affiliation | Academia | Yifan Yang, Peiyao Xiao and Kaiyi Ji Department of Computer Science and Engineering University at Buffalo Buffalo, NY 14260 EMAIL |
| Pseudocode | Yes | Algorithm 1 Hessian/Jacobian-free Bilevel Optimizer via Projection-aided Finite-difference Estimation; Algorithm 2 Fully Single-loop Momentum-based Bilevel Optimizer (FMBO) |
| Open Source Code | No | The paper does not contain any explicit statement about making the source code available or provide a link to a code repository for the implemented methods. |
| Open Datasets | Yes | 4.1 Hyper-representation on MNIST Dataset; 4.2 Hyper-cleaning on MNIST Dataset |
| Dataset Splits | Yes | $S_\nu$ and $S_\tau$ denote the training data and validation data, whose sizes are set to 20000 and 5000, respectively |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for experiments (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., programming languages, libraries, frameworks). |
| Experiment Setup | Yes | We perform the hyper-representation with the 7-layer LeNet network [38]... whose sizes are set to 20000 and 5000, respectively, $\lambda = \{\lambda_i\}_{i \in S_\tau}$ and $C$ are the regularization parameters... More details of the experimental setups are specified in Appendix A.1. |