First-Order Methods for Linearly Constrained Bilevel Optimization

Authors: Guy Kornowski, Swati Padmanabhan, Kai Wang, Zhe Zhang, Suvrit Sra

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Our numerical experiments verify these guarantees. |
| Researcher Affiliation | Academia | Weizmann Institute of Science (guy.kornowski@weizmann.ac.il); Massachusetts Institute of Technology (pswt@mit.edu); Georgia Institute of Technology (kwang692@gatech.edu); Purdue University (zhan5111@purdue.edu); Massachusetts Institute of Technology (suvrit@mit.edu) |
| Pseudocode | Yes | Algorithm 1: Inexact Gradient Oracle for Bilevel Program with Linear Equality Constraint |
| Open Source Code | Yes | The implementation can be found in https://github.com/guaguakai/constrained-bilevel-optimization. |
| Open Datasets | No | We generate instances of the following constrained bilevel optimization problem: $\min_x\; c^\top y + 0.01\lVert x\rVert^2 + 0.01\lVert y\rVert^2$ s.t. $y = \arg\min_{y:\, h(x,y)\le 0} \tfrac{1}{2} y^\top Q y + x^\top P y$ (6.1), where $h(x,y) = Ay - b$ is a $d_h$-dimensional linear constraint. The PSD matrix $Q \in \mathbb{R}^{d_y \times d_y}$, $c \in \mathbb{R}^{d_y}$, $P \in \mathbb{R}^{d_x \times d_y}$, and constraints $A \in \mathbb{R}^{d_h \times d_y}$, $b \in \mathbb{R}^{d_h}$ are randomly generated from normal distributions (cf. Appendix K). |
| Dataset Splits | No | The paper describes generating synthetic data instances but does not specify any training, validation, or test dataset splits, percentages, or absolute sample counts. |
| Hardware Specification | Yes | All experiments were run on a computing cluster with Dual Intel Xeon Gold 6226 CPUs @ 2.7 GHz and DDR4-2933 MHz DRAM. No GPU was used, and we used 1 core with 8GB RAM per instance of the experiment. |
| Software Dependencies | No | All algorithms are implemented in PyTorch [94] to compute gradients, and using CVXPY [95] to solve the LL problem and the penalty minimization problem. (No version numbers are provided for PyTorch or CVXPY.) |
| Experiment Setup | Yes | Both algorithms use Adam [90] to control the learning rate, and are averaged over 10 random seeds. We run Algorithm 3 using Algorithm 4 on the bilevel optimization in the toy example in Problem L.1 with $d_x = 100$, $d_y = 200$, $n_{\text{const}} = d_y/5$, and accuracy $\alpha = 0.1$. The cutoff time for running the algorithms is set to be 6 hours. |