A Model-Based Method for Minimizing CVaR and Beyond
Authors: Si Yi Meng, Robert M. Gower
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We support this theoretical finding experimentally. We design several experiments to compare, and test the sensitivity of, SGM, SPL with only one regularization (that is, the updates in Lemma 1 where λ_{θ,t} = λ_{α,t} = λ_t), and our proposed SPL+ updates. First we study the sensitivity of the methods to choices of λ when minimizing the CVaR objective (5). We use three different synthetic distributions, similar to the setup of Holland & Haress (2021), where we experiment with various combinations of loss functions ℓ(·; z) and data distributions controlled by noise ζ (Table 2). |
| Researcher Affiliation | Academia | 1Department of Computer Science, Cornell University, Ithaca, NY, USA 2Center for Computational Mathematics, Flatiron Institute, New York, NY, USA. |
| Pseudocode | Yes | Algorithm 1 SGM: Stochastic subgradient method for CVaR minimization. Algorithm 2 SPL+: Stochastic prox-linear method for CVaR minimization with separate regularization. |
| Open Source Code | No | The paper does not provide any statements about open-source code availability or links to a code repository. |
| Open Datasets | Yes | we present the same experiment on four real datasets: Year Prediction MSD, E2006-tfidf, (binary) mushrooms and (binary) Covertype, all from the LIBSVM repository (Chang & Lin, 2011). |
| Dataset Splits | No | The paper mentions 'training split' and 'test set' but does not specify exact percentages or explicit rules for dataset splitting (e.g., '80/10/10 split', or how 'n' is defined for the training split in relation to the overall dataset size). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments (e.g., GPU models, CPU types, or memory specifications). |
| Software Dependencies | No | The paper mentions using 'LIBSVM repository' for datasets and running 'full-batch L-BFGS' for optimization, but it does not specify any software libraries or tools with version numbers. |
| Experiment Setup | Yes | For all problems we set the dimension to be d = 10. For regression problems, θ_gen ∼ U([0, 1]^d), and for classification (logistic regression) we use θ_gen ∼ U([0, 10]^d). We set β = 0.95 for all experiments, and thus have omitted β from all plot descriptions. For initialization, we set α_0 ∼ U(0, 1) and θ_0 ∼ N(0, I_d) for all algorithms we compare. They are run for T = 100,000 iterations using 5 different seeds that control the randomness of initialization and sampling during the course of optimization. We employ a decreasing step size λ_t = λ/√(t+1) for SGM and SPL, while λ_{t,α} = λℓ_0/√(t+1) and λ_{t,θ} = λ/(ℓ_0 √(t+1)) for SPL+. We study the sensitivity of the methods to λ, varied over a logarithmically-spaced grid 10^{-6}, 10^{-5}, …, 10^4, densified around λ = 1 using the extra grid 10^{-1.5}, 10^{-0.5}, 10^{0.5}, 10^{1.5}. |
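To make the reported setup concrete, here is a minimal Python sketch of the pieces described above: the Rockafellar–Uryasev form of the CVaR objective that SGM minimizes, a per-sample subgradient, the decreasing step-size schedule λ_t = λ/√(t+1), and the logarithmic λ grid densified around 1. Function names and the exact estimator layout are illustrative assumptions, not code from the paper.

```python
import numpy as np

def cvar_objective(losses, alpha, beta=0.95):
    # Rockafellar-Uryasev form: alpha + E[(loss - alpha)_+] / (1 - beta)
    return alpha + np.mean(np.maximum(losses - alpha, 0.0)) / (1.0 - beta)

def cvar_subgradient(loss_val, loss_grad, alpha, beta=0.95):
    # Subgradient of alpha + (loss - alpha)_+ / (1 - beta) at one sample z,
    # returned as (subgrad w.r.t. theta, subgrad w.r.t. alpha).
    if loss_val > alpha:
        return loss_grad / (1.0 - beta), 1.0 - 1.0 / (1.0 - beta)
    return np.zeros_like(loss_grad), 1.0

def step_size(lam, t):
    # Decreasing schedule for SGM/SPL: lambda_t = lambda / sqrt(t + 1)
    return lam / np.sqrt(t + 1)

# Log-spaced grid 10^-6, ..., 10^4 (11 points), densified around lambda = 1
# with the extra grid 10^-1.5, 10^-0.5, 10^0.5, 10^1.5.
lam_grid = np.sort(np.concatenate([np.logspace(-6, 4, 11),
                                   np.logspace(-1.5, 1.5, 4)]))
```

With β = 0.95, a sample whose loss exceeds α contributes a subgradient scaled by 1/(1 − β) = 20, which is exactly why the paper's separate regularization for α and θ (SPL+) can matter for step-size sensitivity.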