Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
High Probability and Risk-Averse Guarantees for a Stochastic Accelerated Primal-Dual Method
Authors: Yassine Laguel, Necdet Serhat Aybat, Mert Gürbüzbalaban
JMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we illustrate the robustness properties of SAPD when solving bilinear games and distributionally robust learning problems involving both synthetic and real data. First we consider the regularized bilinear game presented in (4.1), $\min_{x \in \mathbb{R}^d} \max_{y \in \mathbb{R}^d} \frac{\mu_x}{2}\lVert x\rVert^2 + x^\top K y - \frac{\mu_y}{2}\lVert y\rVert^2$, for $K \triangleq 10\,\bar{K}/\lVert\bar{K}\rVert$, $\bar{K} \triangleq (M + M^\top)/2$, where $M = (M_{i,j})$ is a $30 \times 30$ matrix with entries sampled from i.i.d. standard normal variables. We set the regularization parameters as $\mu_x = \mu_y = 1$. We explore two values of the momentum parameter $\theta$, namely $\bar{\theta}$ and $1 - (1 - \bar{\theta})^2$, where $\bar{\theta}$ is computed based on the threshold value from Theorem 17. We then determine the stepsizes $\tau, \sigma$ according to the CP parameterization (2.3) with $\rho = \theta$. Finally, SAPD is initialized at a random tuple $(x_0, y_0) = 50(\bar{x}_0, \bar{y}_0)$, where $\bar{x}_0, \bar{y}_0 \in \mathbb{R}^{30}$ have entries sampled from i.i.d. standard normal distributions. In Figure 4, we report the histogram of the squared distance $E_n \triangleq \lVert x_n - x^\ast\rVert^2 + \lVert y_n - y^\ast\rVert^2$ to the saddle point $z^\ast = 0$ after $n = 2000$ (top, middle panel) and $n = 5000$ iterations (top, right panel), based on 500 sample paths and for both choices of the momentum parameter. The expected distance $\mathbb{E}[E_n]$ over the iterations is also reported in the top, left panel along with error bars around it. The continuous vertical line in the convergence plots represents the sample average (estimating the expectation $\mathbb{E}[E_n]$), while the dashed vertical line represents the $p$-quantile of $E_n$ for $p = 0.90$, i.e., the 90th percentile of the error $E_n$. We observe that the performance is sensitive to the choice of parameters and there are bias/risk trade-offs in the choice of parameters; indeed, when the number of steps is smaller ($n = 2000$), the noise accumulation is not dominant and a smaller rate parameter $\rho = \theta$ allows faster decay of the initialization bias, resulting in better guarantees for the value at risk with $p = 0.90$, or equivalently for the 90th percentile. On the other hand, when the number of steps is larger ($n = 5000$), there is more risk associated with the accumulation of noise and a larger choice of $\rho = \theta$ close to 1 is preferable, as this results in smaller primal and dual stepsizes, which allows one to control the tail risk at the expense of a slower decay of the initialization bias. Next, we aim to solve the following distributionally robust logistic regression problem introduced in (Zhang et al., 2024): $\min_{x \in \mathbb{R}^d} \max_{y \in \mathcal{P}_r} \frac{\mu_x}{2}\lVert x\rVert^2 + \sum_{i=1}^{m} y_i \varphi_i(x) - \frac{\mu_y}{2}\lVert y\rVert^2$, where $\varphi_i(x) \triangleq \log(1 + \exp(-b_i a_i^\top x))$, and $\mathcal{P}_r \triangleq \{y \in \mathbb{R}_+^m : \mathbf{1}^\top y = 1,\ \lVert y - \mathbf{1}/m\rVert^2 \leq r/m^2\}$, with $r = 2\sqrt{m}$. We consider two datasets from the UCI Repository, Dry Bean and Arcene, and follow the preprocessing protocol outlined in (Zhang et al., 2024). For each dataset, we run SAPD with two values $\theta_1, \theta_2$ that are greater than the threshold value $\bar{\theta}$ given in (Zhang et al., 2024, Corollary 1). SAPD is initialized for both datasets at $x_0 = [2, \ldots, 2]^\top$ and $y_0 = \mathbf{1}/m$. In the middle and bottom panels of Figure 4, we display the average of the error $E_n$ over the course of the iterations, as well as the error histogram for SAPD over 500 runs, as we did in the previous experiment. Our numerical findings are similar to the bilinear case, i.e., to obtain the best risk guarantees one needs to choose the algorithm parameters carefully, which is in line with our theoretical results, where obtaining the accelerated iteration complexity in Corollary 12 requires choosing the parameters in an optimized fashion over the class of admissible CP parameters. However, our parameter choice in Corollary 12 optimizes the complexity bounds in the worst case, i.e., these bounds apply to any SCSC problem; moreover, these worst-case bounds involve some universal $O(1)$ constants that are not fully optimized in our complexity results. In practice, choosing $\theta$ specific to the problem at hand is beneficial. Indeed, a practical alternative to our $\theta$ choice in Corollary 12 would be to implement a grid search on $\theta$ and to set $\tau$ and $\sigma$ according to the Chambolle-Pock parameterization in (2.3) for each $\theta$ choice in the grid. |
| Researcher Affiliation | Academia | Yassine Laguel (EMAIL), Laboratoire Jean Alexandre Dieudonné, Université Côte d'Azur, Nice, France; Necdet Serhat Aybat (EMAIL), Department of Industrial and Manufacturing Engineering, Pennsylvania State University, University Park, PA, USA; Mert Gürbüzbalaban (EMAIL), Department of Management Science and Information Systems, Rutgers Business School, Rutgers University, Piscataway, NJ, USA. |
| Pseudocode | Yes | Algorithm 1 (SAPD). Require: parameters $\tau, \sigma, \theta$; starting point $(x_0, y_0)$; horizon $N$. 1: Initialize $x_{-1} \leftarrow x_0$, $y_{-1} \leftarrow y_0$, $q_0 \leftarrow 0$. 2: for $n = 0, \ldots, N-1$ do 3: $s_n \leftarrow \nabla_y \Phi(x_n, y_n; \omega_n^y) + \theta q_n$ (momentum averaging) 4: $y_{n+1} \leftarrow \mathrm{prox}_{\sigma g}(y_n + \sigma s_n)$ 5: $x_{n+1} \leftarrow \mathrm{prox}_{\tau f}(x_n - \tau \nabla_x \Phi(x_n, y_{n+1}; \omega_n^x))$ 6: $q_{n+1} \leftarrow \nabla_y \Phi(x_{n+1}, y_{n+1}; \omega_{n+1}^y) - \nabla_y \Phi(x_n, y_n; \omega_n^y)$ 7: return $z_N = (x_N, y_N)$ |
| Open Source Code | No | No explicit statement or link to the source code for the methodology described in this paper is provided. The text mentions "online-only supplementary material see the extended version of the paper Laguel et al. (2023)" but does not explicitly state that code is provided there or provide a direct link for this paper. |
| Open Datasets | Yes | Next, we aim to solve the following distributionally robust logistic regression problem introduced in (Zhang et al., 2024): $\min_{x \in \mathbb{R}^d} \max_{y \in \mathcal{P}_r} \frac{\mu_x}{2}\lVert x\rVert^2 + \sum_{i=1}^{m} y_i \varphi_i(x) - \frac{\mu_y}{2}\lVert y\rVert^2$, where $\varphi_i(x) \triangleq \log(1 + \exp(-b_i a_i^\top x))$, and $\mathcal{P}_r \triangleq \{y \in \mathbb{R}_+^m : \mathbf{1}^\top y = 1,\ \lVert y - \mathbf{1}/m\rVert^2 \leq r/m^2\}$, with $r = 2\sqrt{m}$. We consider two datasets from the UCI Repository, Dry Bean and Arcene, and follow the preprocessing protocol outlined in (Zhang et al., 2024). For each dataset, we run SAPD with two values $\theta_1, \theta_2$ that are greater than the threshold value $\bar{\theta}$ given in (Zhang et al., 2024, Corollary 1). SAPD is initialized for both datasets at $x_0 = [2, \ldots, 2]^\top$ and $y_0 = \mathbf{1}/m$. |
| Dataset Splits | No | The paper mentions using the Dry Bean and Arcene datasets from the UCI Repository, but it does not describe any train/validation/test splits for its experiments. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, processors, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper does not provide any specific software dependencies with version numbers. |
| Experiment Setup | Yes | In this section, we illustrate the robustness properties of SAPD when solving bilinear games and distributionally robust learning problems involving both synthetic and real data. First we consider the regularized bilinear game presented in (4.1), $\min_{x \in \mathbb{R}^d} \max_{y \in \mathbb{R}^d} \frac{\mu_x}{2}\lVert x\rVert^2 + x^\top K y - \frac{\mu_y}{2}\lVert y\rVert^2$, for $K \triangleq 10\,\bar{K}/\lVert\bar{K}\rVert$, $\bar{K} \triangleq (M + M^\top)/2$, where $M = (M_{i,j})$ is a $30 \times 30$ matrix with entries sampled from i.i.d. standard normal variables. We set the regularization parameters as $\mu_x = \mu_y = 1$. We explore two values of the momentum parameter $\theta$, namely $\bar{\theta}$ and $1 - (1 - \bar{\theta})^2$, where $\bar{\theta}$ is computed based on the threshold value from Theorem 17. We then determine the stepsizes $\tau, \sigma$ according to the CP parameterization (2.3) with $\rho = \theta$. Finally, SAPD is initialized at a random tuple $(x_0, y_0) = 50(\bar{x}_0, \bar{y}_0)$, where $\bar{x}_0, \bar{y}_0 \in \mathbb{R}^{30}$ have entries sampled from i.i.d. standard normal distributions. [...] SAPD is initialized for both datasets at $x_0 = [2, \ldots, 2]^\top$ and $y_0 = \mathbf{1}/m$. |
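The quoted Algorithm 1 and bilinear-game setup can be sketched as a short NumPy routine. This is an illustrative sketch, not the paper's implementation: the stepsizes `tau`, `sigma`, the momentum `theta`, and the additive-Gaussian gradient noise `noise_std` below are assumptions (the paper sets $\tau, \sigma$ via the CP parameterization (2.3) and $\theta$ via its theorems), and the quadratic regularizers are folded into $\Phi$ so the prox maps reduce to identities.

```python
import numpy as np

def sapd(grad_x, grad_y, prox_f, prox_g, x0, y0, tau, sigma, theta, N, rng):
    """Sketch of SAPD (Algorithm 1): stochastic accelerated primal-dual."""
    x, y = x0.copy(), y0.copy()
    q = np.zeros_like(y0)
    gy = grad_y(x, y, rng)              # stochastic y-gradient at (x_n, y_n)
    for _ in range(N):
        s = gy + theta * q              # momentum averaging (line 3)
        y_new = prox_g(y + sigma * s, sigma)
        x_new = prox_f(x - tau * grad_x(x, y_new, rng), tau)
        gy_new = grad_y(x_new, y_new, rng)
        q = gy_new - gy                 # difference of stochastic y-gradients (line 6)
        x, y, gy = x_new, y_new, gy_new
    return x, y

# Regularized bilinear game from the quoted setup: d = 30, mu_x = mu_y = 1,
# K = 10 * K_bar / ||K_bar|| with K_bar = (M + M^T)/2, M i.i.d. standard normal.
d, mu_x, mu_y, noise_std = 30, 1.0, 1.0, 0.1   # noise_std is an assumed noise level
rng = np.random.default_rng(0)
M = rng.standard_normal((d, d))
K_bar = (M + M.T) / 2
K = 10 * K_bar / np.linalg.norm(K_bar, 2)       # spectral norm

def grad_x(x, y, rng):  # d/dx of Phi(x, y) plus assumed additive Gaussian noise
    return mu_x * x + K @ y + noise_std * rng.standard_normal(d)

def grad_y(x, y, rng):  # d/dy of Phi(x, y) plus assumed additive Gaussian noise
    return K @ x - mu_y * y + noise_std * rng.standard_normal(d)

identity_prox = lambda v, step: v  # f = g = 0 here; regularizers live inside Phi

# Random initialization at 50 * (standard normal vector), as in the quoted setup
x0, y0 = 50 * rng.standard_normal(d), 50 * rng.standard_normal(d)
x, y = sapd(grad_x, grad_y, identity_prox, identity_prox,
            x0, y0, tau=0.01, sigma=0.01, theta=0.9, N=2000, rng=rng)
err = np.linalg.norm(x) ** 2 + np.linalg.norm(y) ** 2  # E_n: squared distance to z* = 0
```

With small stepsizes the iterates contract toward the saddle point $z^\ast = 0$ up to a noise floor; the parameter values here are chosen only so the sketch runs stably, not to reproduce the paper's tuned bias/risk trade-off.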
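The quoted experiments summarize 500 sample paths through the sample mean of the error $E_n$ (the continuous vertical line in Figure 4) and its $p = 0.90$ quantile, i.e., the value at risk (the dashed vertical line). A minimal sketch of that summary step, using synthetic stand-in errors rather than actual SAPD runs:

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-in for the terminal errors E_n of 500 independent sample paths;
# the real experiment uses the squared distance to the saddle point.
errors = rng.lognormal(mean=-2.0, sigma=0.5, size=500)

mean_err = errors.mean()             # estimates the expectation E[E_n]
var_090 = np.quantile(errors, 0.90)  # 90th percentile: value at risk at p = 0.90
```

For right-skewed error distributions the 0.90-quantile sits well above the mean, which is why the report's bias/risk discussion treats the two summaries separately.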