Boosted CVaR Classification

Authors: Runtian Zhai, Chen Dan, Arun Suggala, J. Zico Kolter, Pradeep Ravikumar

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically evaluate our proposed algorithm on four benchmark datasets and show that it achieves higher tail performance than deterministic model training methods.
Researcher Affiliation | Academia | Runtian Zhai, Chen Dan, Arun Sai Suggala, Zico Kolter, Pradeep Ravikumar, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA 15213, {rzhai,cdan,asuggala,zkolter,pradeepr}@cs.cmu.edu
Pseudocode | Yes | Algorithm 1 (Regularized) α-LPBoost for CVaR Classification; Algorithm 2 α-AdaLPBoost
Open Source Code | Yes | Code. Codes for this paper can be found at: https://github.com/RuntianZ/boosted_cvar.
Open Datasets | Yes | We conduct our experiments on four datasets: COMPAS [LMKA16], CelebA [LLWT15], CIFAR-10 and CIFAR-100 [KH+09].
Dataset Splits | Yes | On COMPAS we use the training set as the validation set because the dataset is very small. CelebA has its official train-validation split. On CIFAR-10 and CIFAR-100 we take out 10% of the training samples and use them for validation.
Hardware Specification | Yes | We train our models with CPU on COMPAS and with one NVIDIA GTX 1080ti GPU on other datasets.
Software Dependencies | Yes | We solve linear programs with the CVXPY package [DB16, AVDB18], which at its core invokes MOSEK [ApS21] and ECOS [DCB13] for optimization. [...] MOSEK ApS. MOSEK Optimizer API for Python. Version 9.2.44, 2021.
Experiment Setup | Yes | We use a three-layer feed-forward neural network with ReLU activations on COMPAS, a ResNet-18 [HZRS16] on CelebA, a WRN-28-1 [ZK16] on CIFAR-10 and a WRN-28-10 on CIFAR-100. [...] We first warmup the model with a few epochs of ERM, and then train T = 100 base models on COMPAS and CIFAR-10, and T = 50 base models on CelebA and CIFAR-100 from the warmup model with the sample weights given by the boosting algorithms. [...] For α-AdaLPBoost, we choose η = 1.0 on all datasets which is close to the theoretical optimal value η = √(8 log n / T).
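The Pseudocode and Software Dependencies rows refer to the α-CVaR objective and to linear programs solved with CVXPY. Below is a minimal sketch, not taken from the authors' code, of how the empirical α-CVaR of per-sample losses can be computed directly and, equivalently, as a linear program over the capped simplex {w : 0 ≤ w_i ≤ 1/(αn), Σ w_i = 1}; the loss values and the choice of n and α are hypothetical.

```python
import numpy as np
import cvxpy as cp

# Hypothetical setup: n per-sample losses and a tail fraction alpha
# chosen so that alpha * n is an integer (0.1 * 1000 = 100).
rng = np.random.default_rng(0)
n, alpha = 1000, 0.1
losses = rng.random(n)

# Direct computation: the empirical alpha-CVaR is the mean of the
# worst alpha-fraction of the per-sample losses.
k = int(alpha * n)
cvar_direct = np.sort(losses)[-k:].mean()

# Equivalent LP view: maximize the weighted loss over the capped simplex
# {w : 0 <= w_i <= 1/(alpha*n), sum_i w_i = 1}, solved here with ECOS.
w = cp.Variable(n)
problem = cp.Problem(
    cp.Maximize(losses @ w),
    [w >= 0, w <= 1.0 / (alpha * n), cp.sum(w) == 1],
)
cvar_lp = problem.solve(solver=cp.ECOS)

print(cvar_direct, cvar_lp)  # should agree up to solver tolerance
```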
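The Dataset Splits row states that 10% of the CIFAR training samples are held out for validation. The following is a small sketch of such a split using PyTorch and torchvision; the data directory, transform, and seed are hypothetical, and this is not necessarily how the authors' released code performs the split.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Hold out 10% of the CIFAR-10 training set for validation (hypothetical setup).
train_full = datasets.CIFAR10(root="./data", train=True, download=True,
                              transform=transforms.ToTensor())
n_val = len(train_full) // 10            # 10% of 50,000 = 5,000 samples
n_train = len(train_full) - n_val
train_set, val_set = random_split(
    train_full, [n_train, n_val],
    generator=torch.Generator().manual_seed(0),  # hypothetical seed
)
```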
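The Experiment Setup row describes a short ERM warmup followed by T base models trained with the sample weights produced by the boosting algorithms. The sketch below outlines that pipeline under stated assumptions: the data loader is assumed to also yield sample indices, the base models are assumed to be trained with a per-sample weighted cross-entropy loss, and the names in the commented outer loop (run_erm_warmup, evaluate_losses, boosting_update, and the hyperparameters) are hypothetical placeholders rather than the authors' implementation.

```python
import copy
import torch
import torch.nn.functional as F

def train_base_model(model, loader, sample_weights, optimizer, epochs):
    """Train one base model with per-sample loss weights (hypothetical helper).

    `loader` is assumed to yield (inputs, labels, indices) so that each
    example's weight can be looked up; the weighting convention is an assumption.
    """
    model.train()
    for _ in range(epochs):
        for x, y, idx in loader:
            optimizer.zero_grad()
            per_sample_loss = F.cross_entropy(model(x), y, reduction="none")
            (per_sample_loss * sample_weights[idx]).sum().backward()
            optimizer.step()
    return model

# Outer loop sketch (hypothetical names; T = 100 or 50 in the paper):
# warmup_model = run_erm_warmup(model, train_loader, warmup_epochs)
# weights = torch.full((n,), 1.0 / n)              # uniform initial sample weights
# base_models = []
# for t in range(T):
#     m = copy.deepcopy(warmup_model)
#     opt = torch.optim.SGD(m.parameters(), lr=0.01)   # hypothetical hyperparameters
#     train_base_model(m, train_loader, weights * n, opt, epochs_per_model)
#     base_models.append(m)
#     per_sample_loss = evaluate_losses(m, train_loader)
#     weights = boosting_update(weights, per_sample_loss)  # α-LPBoost / α-AdaLPBoost step
```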