A Model-Based Method for Minimizing CVaR and Beyond

Authors: Si Yi Meng, Robert M. Gower

ICML 2023

Reproducibility Variable Result LLM Response
Research Type Experimental We support this theoretical finding experimentally. We design several experiments to compare and test the sensitivity of SGM, SPL with only one regularization (that is, the updates in Lemma 1 where λθ,t = λα,t = λt), and our proposed SPL+ updates. First we study the sensitivity of the methods to choices of λ when minimizing the CVaR objective (5). We use three different synthetic distributions, similar to the setup of Holland & Haress (2021), where we experiment with various combinations of loss functions ℓ(·; z) and data distributions controlled by noise ζ (Table 2).
Researcher Affiliation Academia ¹Department of Computer Science, Cornell University, Ithaca, NY, USA; ²Center for Computational Mathematics, Flatiron Institute, New York, NY, USA.
Pseudocode Yes Algorithm 1 SGM: Stochastic subgradient method for CVaR minimization. Algorithm 2 SPL+: Stochastic prox-linear method for CVaR minimization with separate regularization.
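The SGM update referenced above applies stochastic subgradients to the standard Rockafellar–Uryasev formulation of CVaR, min over (θ, α) of α + E[(ℓ(θ; z) − α)₊]/(1 − β). A minimal sketch of one such step is below; the function name, signature, and the least-squares loss in the usage note are illustrative assumptions, not the paper's own code.

```python
import numpy as np

def sgm_cvar_step(theta, alpha, z, loss, loss_grad, beta=0.95, lr=0.01):
    """One stochastic subgradient step on the Rockafellar-Uryasev
    CVaR objective: alpha + E[(loss(theta; z) - alpha)_+] / (1 - beta).

    Illustrative sketch only -- not the paper's implementation.
    """
    l = loss(theta, z)
    # The hinge (l - alpha)_+ is active only when the sampled loss exceeds alpha.
    active = 1.0 if l > alpha else 0.0
    g_theta = (active / (1.0 - beta)) * loss_grad(theta, z)
    g_alpha = 1.0 - active / (1.0 - beta)
    return theta - lr * g_theta, alpha - lr * g_alpha
```

For instance, with a squared loss ℓ(θ; (x, y)) = ½(θᵀx − y)², `loss_grad` would return (θᵀx − y)·x, and the step can be iterated over sampled data points (x, y).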
Open Source Code No The paper does not provide any statements about open-source code availability or links to a code repository.
Open Datasets Yes we present the same experiment on four real datasets: Year Prediction MSD, E2006-tfidf, (binary) mushrooms and (binary) Covertype, all from the LIBSVM repository (Chang & Lin, 2011).
Dataset Splits No The paper mentions 'training split' and 'test set' but does not specify exact percentages or explicit rules for dataset splitting (e.g., '80/10/10 split', or how 'n' is defined for the training split in relation to the overall dataset size).
Hardware Specification No The paper does not provide specific details about the hardware used for the experiments (e.g., GPU models, CPU types, or memory specifications).
Software Dependencies No The paper mentions using 'LIBSVM repository' for datasets and running 'full-batch L-BFGS' for optimization, but it does not specify any software libraries or tools with version numbers.
Experiment Setup Yes For all problems we set the dimension to be d = 10. For regression problems, θgen ∼ U([0, 1]^d), and for classification (logistic regression) we use θgen ∼ U([0, 10]^d). We set β = 0.95 for all experiments, and thus have omitted β from all plot descriptions. For initialization, we set α0 ∼ U(0, 1) and θ0 ∼ N(0, Id) for all algorithms we compare. They are run for T = 100,000 iterations using 5 different seeds that control the randomness of initialization and sampling during the course of optimization. We employ a decreasing step size λt = λ/√(t+1) for SGM and SPL, while λt,α = λℓ0/√(t+1) and λt,θ = λ/(ℓ0√(t+1)) for SPL+. We study the sensitivity of the methods to λ, varied over a logarithmically-spaced grid 10^−6, 10^−5, …, 10^4, densified around λ = 1 using the extra grid 10^−1.5, 10^−0.5, …, 10^1.5.
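The step-size schedule and the λ grid described in this setup can be reproduced with a few lines of NumPy; this is a sketch assuming the standard λ/√(t+1) decreasing schedule, and the helper name `step_size` is illustrative.

```python
import numpy as np

def step_size(lam, t):
    """Decreasing schedule lambda_t = lambda / sqrt(t + 1)."""
    return lam / np.sqrt(t + 1)

# Logarithmically-spaced grid 10^-6, 10^-5, ..., 10^4 ...
base_grid = 10.0 ** np.arange(-6, 5)
# ... densified around lambda = 1 with 10^-1.5, 10^-0.5, 10^0.5, 10^1.5.
extra_grid = 10.0 ** np.arange(-1.5, 2.0, 1.0)
lam_grid = np.sort(np.concatenate([base_grid, extra_grid]))
```

Each λ in `lam_grid` would then seed one run of T iterations per random seed.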