Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

On the Complexity of Finding Stationary Points in Nonconvex Simple Bilevel Optimization

Authors: Jincheng Cao, Ruichen Jiang, Erfan Yazdandoost Hamedani, Aryan Mokhtari

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	6 Numerical Experiments. While the primary focus of this work is theoretical, we include a set of numerical experiments to illustrate the behavior of the proposed algorithm and to support our theoretical findings. Since prior studies [3, 5] have already demonstrated the strong empirical performance of DBGD in large-scale neural network training tasks, we do not repeat such experiments here. Instead, we evaluate DBGD on deterministic optimization problems, which align more closely with the scope of this paper.
Researcher Affiliation	Collaboration	Jincheng Cao UT Austin EMAIL Ruichen Jiang UT Austin EMAIL Erfan Yazdandoost Hamedani The University of Arizona EMAIL Aryan Mokhtari UT Austin & Google Research EMAIL
Pseudocode	No	The paper describes the algorithm using mathematical formulas and text in Section 4, "Algorithmic Framework", for example: "$x_{k+1} = x_k - \eta_k d_k$, (11) where $\eta_k > 0$ is a step size and $d_k$ is a descent direction." and "$d_k = \nabla f(x_k) + \lambda_k \nabla g(x_k)$, (13) where $\lambda_k$ can be computed as follows: $\lambda_k = \max\{(\phi(x_k) - \nabla f(x_k)^\top \nabla g(x_k)) / \|\nabla g(x_k)\|^2, 0\}$ (14)", but does not present it in a formally labeled pseudocode or algorithm block.
Open Source Code	Yes	Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: The code and data are attached in the supplementary material.
Open Datasets	No	Toy Example. We study the following nonconvex simple bilevel problem, $\min_{x \in \mathbb{R}^2} (x_1 + \pi/20)^2 + (x_2 + 1)^2$ s.t. $x \in \operatorname{argmin}_{z \in \mathbb{R}^2} (z_2 - \sin(10 z_1))^2$ (24) ... Matrix Factorization. For Problem (25), we set $n = r = 10$ to generate $U$ and construct $M = U U^\top + \epsilon I_n$, where $\epsilon \sim \mathcal{N}(0, 0.01)$ and $I_n \in \mathbb{R}^{n \times n}$ denotes the identity matrix.
Dataset Splits	No	Toy Example. We study the following nonconvex simple bilevel problem, $\min_{x \in \mathbb{R}^2} (x_1 + \pi/20)^2 + (x_2 + 1)^2$ s.t. $x \in \operatorname{argmin}_{z \in \mathbb{R}^2} (z_2 - \sin(10 z_1))^2$ (24) ... Matrix Factorization. For Problem (25), we set $n = r = 10$ to generate $U$ and construct $M = U U^\top + \epsilon I_n$, where $\epsilon \sim \mathcal{N}(0, 0.01)$ and $I_n \in \mathbb{R}^{n \times n}$ denotes the identity matrix. The paper generates its own data for the numerical experiments, so the train/test/validation splits typical of ML datasets do not apply.
Hardware Specification	Yes	All simulations are implemented using MATLAB R2022a on a PC running macOS Sonoma with an Apple M1 Pro chip and 16 GB of memory.
Software Dependencies	Yes	All simulations are implemented using MATLAB R2022a on a PC running macOS Sonoma with an Apple M1 Pro chip and 16 GB of memory.
Experiment Setup	Yes	Toy Example. ...initialized at the point $x_0 = (-3, 1)$, using a base stepsize of $\eta = 10^{-2}$ and a total of $K = 10^3$ iterations. Since the penalty methods become unstable for large values of $\lambda$, we further scale the stepsize by a factor of $1/(1 + \lambda)$ in each independent run. ... Matrix Factorization. For Problem (25), we set $n = r = 10$ ... Both methods use a stepsize of $\eta = 10^{-5}$ and are run for $K = 10^6$ iterations. ... The hyperparameter α in both f1 and f2 is set to 1.
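
Because the paper gives no pseudocode block, the update quoted in the Pseudocode row (equations 11, 13 and 14) can be sketched in a few lines on the toy problem (24), using the starting point, stepsize and iteration count quoted in the Experiment Setup row. This is a minimal Python sketch, not the authors' MATLAB code. The function names are hypothetical, and the choice ϕ(x) = α‖∇g(x)‖² is an assumption, since the excerpts above do not define ϕ. The signs in the update and in problem (24) follow the reconstructed reading of the quoted formulas.

```python
import numpy as np


def f(x):
    """Upper-level objective of the toy problem (24)."""
    return (x[0] + np.pi / 20) ** 2 + (x[1] + 1) ** 2


def grad_f(x):
    """Gradient of the upper-level objective."""
    return np.array([2 * (x[0] + np.pi / 20), 2 * (x[1] + 1)])


def g(x):
    """Lower-level objective of the toy problem (24)."""
    return (x[1] - np.sin(10 * x[0])) ** 2


def grad_g(x):
    """Gradient of the lower-level objective (chain rule on the residual)."""
    residual = x[1] - np.sin(10 * x[0])
    return 2 * residual * np.array([-10 * np.cos(10 * x[0]), 1.0])


def dbgd_direction(x, alpha=1.0):
    """Return (d_k, lambda_k) following equations (13) and (14).

    ASSUMPTION: phi(x) = alpha * ||grad g(x)||^2. The excerpts quoted in
    the table do not define phi, so this choice is illustrative only.
    """
    gf, gg = grad_f(x), grad_g(x)
    gg_sq = float(gg @ gg)
    if gg_sq == 0.0:
        # grad g vanishes, so (14) is undefined; fall back to the
        # plain upper-level gradient with lambda = 0.
        return gf, 0.0
    phi = alpha * gg_sq
    lam = max((phi - float(gf @ gg)) / gg_sq, 0.0)
    return gf + lam * gg, lam


def run_dbgd(x0, eta=1e-2, num_iters=1000, alpha=1.0):
    """Iterate x_{k+1} = x_k - eta * d_k (equation 11) with a fixed stepsize."""
    x = np.asarray(x0, dtype=float)
    for _ in range(num_iters):
        d, _ = dbgd_direction(x, alpha)
        x = x - eta * d
    return x


if __name__ == "__main__":
    # Settings quoted in the Experiment Setup row for the toy example.
    x_final = run_dbgd([-3.0, 1.0], eta=1e-2, num_iters=1000)
    print("final iterate:", x_final)
    print("f(x):", f(x_final), "g(x):", g(x_final))
```

By construction, whenever ∇g(x) is nonzero the returned direction satisfies ∇g(x)ᵀd ≥ ϕ(x), which is what the max in (14) enforces. The values this sketch prints depend on the assumed ϕ and should not be read as the paper's results.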