An Alternating Optimization Method for Bilevel Problems under the Polyak-Łojasiewicz Condition
Authors: Quan Xiao, Songtao Lu, Tianyi Chen
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our stationary measure is a necessary condition of the global optimality. As shown in Figure 1, GALET approaches the global optimal set of Example 1 and our stationary measure also converges to 0, while the value-function based KKT score does not. ... We compare GALET with BOME [37], IAPTT-GM [43] and V-PBGD [57] in the hyper-cleaning task on the MNIST dataset. As shown in Figure 5, GALET converges faster than other methods and the convergence rate of GALET is O(1/K), which matches Theorem 2. Table 2 shows that the test accuracy of GALET is comparable to other methods. |
| Researcher Affiliation | Collaboration | Quan Xiao, Rensselaer Polytechnic Institute, Troy, NY, USA (xiaoq5@rpi.edu); Songtao Lu, IBM Research, Yorktown Heights, NY, USA (songtao@ibm.com); Tianyi Chen, Rensselaer Polytechnic Institute, Troy, NY, USA (chentianyi19@gmail.com) |
| Pseudocode | Yes | Algorithm 1 GALET for nonconvex-PL BLO |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the methodology described. |
| Open Datasets | Yes | We compare GALET with BOME [37], IAPTT-GM [43] and V-PBGD [57] in the hyper-cleaning task on the MNIST dataset. ... We compare our method with the existing methods on the data hyper-cleaning task using the MNIST and the Fashion MNIST dataset [21]. |
| Dataset Splits | Yes | We are given 5000 training data with corruption rate 0.5, 5000 clean validation data and 10000 clean testing data. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments. |
| Software Dependencies | No | The paper mentions computing the Hessian-vector product via an efficient method [53] and using 'auto differentiate', but does not specify any software names with version numbers. |
| Experiment Setup | Yes | Parameter choices. The dimension of the hidden layer of MLP model is set as 50. We select the stepsize from α ∈ {1, 10, 50, 100, 200, 500}, γ ∈ {0.1, 0.3, 0.5, 0.8} and β ∈ {0.001, 0.005, 0.01, 0.05, 0.1}, while the number of loops is chosen from T ∈ {5, 10, 20, 30, 50} and N ∈ {5, 10, 30, 50, 80}. The default choice of parameter is α = 0.3, K = 30, β = 1, N = 1, γ = 0.1, T = 1. |
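
To give a sense of the size of the search space quoted in the Experiment Setup row, the candidate values can be enumerated as a Cartesian product. This is only a minimal sketch: the quoted text does not say how the candidates were searched, so the exhaustive grid is an assumption, and the names `SEARCH_SPACE` and `iter_configs` are hypothetical and do not come from the paper.

```python
from itertools import product

# Candidate values as quoted in the Experiment Setup row.
# Assumption: every combination is a valid configuration (full grid).
SEARCH_SPACE = {
    "alpha": [1, 10, 50, 100, 200, 500],
    "gamma": [0.1, 0.3, 0.5, 0.8],
    "beta": [0.001, 0.005, 0.01, 0.05, 0.1],
    "T": [5, 10, 20, 30, 50],
    "N": [5, 10, 30, 50, 80],
}


def iter_configs(space):
    """Yield one dict per point in the Cartesian product of the candidate lists."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


configs = list(iter_configs(SEARCH_SPACE))
print(len(configs))  # 6 * 4 * 5 * 5 * 5 = 3000
```

An exhaustive sweep over these lists would cover 3000 configurations, each requiring a full training run, so in practice a coarser or sequential search over one parameter at a time may have been used.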