Projection-Free Methods for Stochastic Simple Bilevel Optimization with Convex Lower-level Problem
Authors: Jincheng Cao, Ruichen Jiang, Nazanin Abolfazli, Erfan Yazdandoost Hamedani, Aryan Mokhtari
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we test our methods on two different stochastic bilevel optimization problems with real and synthetic datasets and compare them with other existing stochastic methods in [16] and [13]. In Figure 1(a)(b), we observe that SBCGF maintains a smaller lower-level gap than other methods and converges faster than the rest in terms of upper-level error. |
| Researcher Affiliation | Academia | Jincheng Cao, ECE Department, UT Austin, jinchengcao@utexas.edu; Ruichen Jiang, ECE Department, UT Austin, rjiang@utexas.edu; Nazanin Abolfazli, SIE Department, The University of Arizona, nazaninabolfazli@arizona.edu; Erfan Yazdandoost Hamedani, SIE Department, The University of Arizona, erfany@arizona.edu; Aryan Mokhtari, ECE Department, UT Austin, mokhtari@austin.utexas.edu |
| Pseudocode | Yes | Algorithm 1: SBCGI (a hedged projection-free update sketch appears below the table) |
| Open Source Code | No | The paper mentions using third-party tools like 'CVX [41, 42]' and 'MATLAB's root-finding solver', but there is no explicit statement or link indicating that the authors have made their own implementation code for the described methodology publicly available. |
| Open Datasets | Yes | We apply the Wikipedia Math Essential dataset [30], which consists of a data matrix $A \in \mathbb{R}^{n \times d}$ with $n = 1068$ samples and $d = 730$ features, and an output vector $b \in \mathbb{R}^{n}$. [30] Benedek Rozemberczki, Paul Scherer, Yixuan He, George Panagopoulos, Alexander Riedel, Maria Astefanoaei, Oliver Kiss, Ferenc Beres, Guzmán López, Nicolas Collignon, et al. PyTorch Geometric Temporal: Spatiotemporal signal processing with neural machine learning models. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pages 4564-4573, 2021. |
| Dataset Splits | Yes | To ensure the problem is over-parameterized, we assign 1/3 of the dataset as the training set $(A_{\mathrm{tr}}, b_{\mathrm{tr}})$, 1/3 as the validation set $(A_{\mathrm{val}}, b_{\mathrm{val}})$, and the remaining 1/3 as the test set $(A_{\mathrm{test}}, b_{\mathrm{test}})$. (A hedged splitting sketch appears below the table.) |
| Hardware Specification | No | The paper does not specify the exact hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'CVX [41, 42]' and 'MATLAB's root-finding solver' but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | We query the stochastic oracle $9 \times 10^{5}$ times with stepsizes $\gamma_t = 0.01/(t+1)$ and $\gamma = 10^{-5}$ for SBCGI (Algorithm 1) and SBCGF (Algorithm 2) with $K_t = 10^{4}/\sqrt{t+1}$, respectively. For aR-IP-SeG, we choose $\gamma_0 = 10^{-7}$ and $\rho_0 = 10^{3}$. For DBGD, we set $\alpha = \beta = 1$ and $\gamma_t = 10^{-6}$. (A hedged configuration sketch appears below the table.) |
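
The Pseudocode row above only reproduces the header of Algorithm 1 (SBCGI). As a rough illustration of the projection-free building block such methods rely on, the sketch below shows a single stochastic conditional-gradient (Frank-Wolfe) step: a linear minimization oracle call followed by a convex-combination update with the stepsize schedule $\gamma_t = 0.01/(t+1)$ quoted in the setup row. This is not the authors' SBCGI: their algorithm constrains the linear minimization using information from the lower-level problem, which is not reproduced here, and the l1-ball feasible set, least-squares objective, and batch size are illustrative assumptions only.

```python
import numpy as np

def lmo_l1_ball(grad, radius=1.0):
    """Linear minimization oracle over {s : ||s||_1 <= radius}: argmin_s <grad, s>.
    Puts all mass on the coordinate with the largest |gradient| entry."""
    s = np.zeros_like(grad)
    i = int(np.argmax(np.abs(grad)))
    s[i] = -radius * np.sign(grad[i])
    return s

def stochastic_cg_step(x, A_batch, b_batch, t, radius=1.0):
    """One projection-free step on an illustrative least-squares objective
    f(x) = 0.5 * ||A x - b||^2, using a minibatch gradient estimate."""
    grad = A_batch.T @ (A_batch @ x - b_batch) / len(b_batch)  # stochastic gradient
    s = lmo_l1_ball(grad, radius)                              # LMO call, no projection
    gamma_t = 0.01 / (t + 1)                                   # stepsize from the setup row
    return x + gamma_t * (s - x)                               # convex-combination update

# Toy usage on random data (dimensions chosen to match one third of the dataset).
rng = np.random.default_rng(0)
A, b = rng.standard_normal((356, 730)), rng.standard_normal(356)
x = np.zeros(730)
for t in range(100):
    batch = rng.choice(356, size=32, replace=False)
    x = stochastic_cg_step(x, A[batch], b[batch], t)
```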
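
The Open Datasets and Dataset Splits rows describe an equal three-way partition of the Wikipedia Math Essential matrix ($n = 1068$, $d = 730$). The snippet below is a minimal sketch of that partition; the random matrix stands in for the real data (loading it, for example through the PyTorch Geometric Temporal package, is not shown), and all variable names are hypothetical rather than taken from any released code.

```python
import numpy as np

# Hypothetical stand-in for the Wikipedia Math Essential data: n = 1068 samples,
# d = 730 features, and an output vector b (replace with the real matrices).
rng = np.random.default_rng(0)
n, d = 1068, 730
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

# Equal 1/3 / 1/3 / 1/3 split into training, validation, and test sets,
# matching the partition reported in the paper.
perm = rng.permutation(n)
third = n // 3
idx_tr, idx_val, idx_test = perm[:third], perm[third:2 * third], perm[2 * third:]

A_tr, b_tr = A[idx_tr], b[idx_tr]
A_val, b_val = A[idx_val], b[idx_val]
A_test, b_test = A[idx_test], b[idx_test]

print(A_tr.shape, A_val.shape, A_test.shape)  # (356, 730) (356, 730) (356, 730)
```

With 356 training samples and 730 features, the training problem is over-parameterized, consistent with the stated intent of the split.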
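
Finally, the hyperparameters quoted in the Experiment Setup row can be gathered into a single configuration. This is a minimal sketch: the negative exponents and the $K_t$ schedule follow the editorial reconstruction of the garbled extraction above and should be checked against the published PDF, and the dictionary keys are illustrative rather than taken from any released code.

```python
import math

ORACLE_BUDGET = 9 * 10**5                                    # stochastic-oracle queries per method

sbcgi_cfg   = {"stepsize": lambda t: 0.01 / (t + 1)}         # gamma_t for SBCGI (Algorithm 1)
sbcgf_cfg   = {"stepsize": 1e-5,                             # constant gamma for SBCGF (Algorithm 2)
               "K": lambda t: int(1e4 / math.sqrt(t + 1))}   # K_t schedule (reconstructed)
aripseg_cfg = {"gamma0": 1e-7, "rho0": 1e3}                  # aR-IP-SeG baseline
dbgd_cfg    = {"alpha": 1.0, "beta": 1.0, "stepsize": 1e-6}  # DBGD baseline
```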