The Role of Regularization in Classification of High-dimensional Noisy Gaussian Mixture

Authors: Francesca Mignacco, Florent Krzakala, Yue Lu, Pierfrancesco Urbani, Lenka Zdeborová

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide a rigorous analysis of the generalization error of regularized convex classifiers, including ridge, hinge and logistic regression, in the high-dimensional limit where the number n of samples and their dimension d go to infinity while their ratio is fixed to α = n/d. We discuss surprising effects of the regularization that in some cases allow one to reach the Bayes-optimal performance. We also illustrate the interpolation peak at low regularization, and analyze the role of the respective sizes of the two clusters. ... We show through numerical simulations that the formulas are extremely accurate even at moderately small dimensions.
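The quoted response elides the exact objective; for reference, the ridge-regularized empirical risk for these losses is usually written as below. The 1/√d scaling of the inputs and the λ/2 prefactor are assumptions based on common conventions in this literature, not a verbatim reproduction of the paper's equation.

```latex
% Regularized empirical risk minimization over weights w and bias b,
% with \ell the square, hinge, or logistic loss (assumed conventions).
\min_{\mathbf{w},\, b} \;\; \sum_{\mu=1}^{n}
  \ell\!\left( y^{\mu} \left( \frac{\mathbf{w}^{\top}\mathbf{x}^{\mu}}{\sqrt{d}} + b \right) \right)
  + \frac{\lambda}{2}\, \lVert \mathbf{w} \rVert_2^2
```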
Researcher Affiliation | Academia | (1) Université Paris-Saclay, CNRS, CEA, Institut de Physique Théorique, 91191 Gif-sur-Yvette, France; (2) Laboratoire de Physique de l'École Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université de Paris, F-75005 Paris, France; (3) John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, USA
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper refers to using a standard third-party package for simulations ("The points are results of simulations with a standard scikit-learn (Pedregosa et al., 2011) package."), but it does not state that the authors are releasing their own code for the methodology described in the paper.
Open Datasets | No | The paper studies a "mixture of two Gaussian clusters in d-dimensions", which is a synthetic data model. It defines the data generation process in equation (1). It does not use or provide concrete access information for a publicly available or open dataset in the traditional sense.
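To make the synthetic model concrete, here is a minimal sketch of the data-generation process as equation (1) is usually stated: labels are ±1 with cluster fraction ρ, and inputs are the label times a mean direction v/√d plus isotropic Gaussian noise of variance Δ. The function name generate_mixture and the choice of i.i.d. standard Gaussian entries for v are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def generate_mixture(n, d, rho=0.5, delta=1.0, v=None, rng=None):
    """Sketch of the two-cluster Gaussian mixture studied in the paper.

    Assumed conventions (paraphrasing equation (1)):
      * labels y = +1 with probability rho, y = -1 otherwise;
      * inputs x = y * v / sqrt(d) + sqrt(delta) * z, with z ~ N(0, I_d);
      * the mean direction v has i.i.d. N(0, 1) entries, so ||v||^2 ~ d.
    """
    rng = np.random.default_rng() if rng is None else rng
    if v is None:
        v = rng.standard_normal(d)
    y = np.where(rng.random(n) < rho, 1.0, -1.0)
    x = np.outer(y, v) / np.sqrt(d) + np.sqrt(delta) * rng.standard_normal((n, d))
    return x, y, v
```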
Dataset Splits | No | The paper mentions "k-fold cross validation" as a method to optimize a parameter (the bias) but does not provide specific details on training/validation/test dataset splits for the experiments reported, such as percentages, sample counts, or citations to predefined splits.
Hardware Specification | No | The paper mentions "We thank Google Cloud for providing us access to their platform through the Research Credits Application program." However, it does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running the experiments.
Software Dependencies | No | The paper mentions "simulations with a standard scikit-learn (Pedregosa et al., 2011) package" but does not specify the version number of scikit-learn or other key software components used in the experiments.
Experiment Setup | Yes | We focus on ridge regularized learning performed by the empirical risk minimization of the loss: ... where λ is the tunable strength of the regularization. ... For α > α*(ρ, Δ) the training data are not linearly separable and the minimum training loss is bounded away from zero. ... In the left part of Fig. 1 we depict (in green) the performance of the non-regularized logistic loss, a.k.a. maximum likelihood. ... The left panel of Fig. 1 is for the symmetric case ρ = 0.5, the right panel for the non-symmetric case ρ = 0.2. ... In Fig. 2 we depict the dependence of the generalization error on the regularization λ for the symmetric ρ = 0.5 case for the square, hinge and logistic losses.
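A minimal sketch of such a λ sweep, using the scikit-learn estimators the paper's simulations rely on and the generate_mixture helper sketched above. The value of α, the test-set size, and the mapping between the paper's λ and each estimator's own regularization knob (alpha for RidgeClassifier, C = 1/λ for LinearSVC and LogisticRegression, ignoring any sum-vs-mean rescaling of the loss) are illustrative assumptions rather than the paper's exact setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.svm import LinearSVC

# Fixed ratio alpha = n/d, symmetric clusters (rho = 0.5), noise variance delta.
d, alpha, rho, delta = 200, 2.0, 0.5, 1.0
n = int(alpha * d)
rng = np.random.default_rng(0)

# Training set and a larger fresh test set drawn from the same mixture
# (reusing the generate_mixture sketch given earlier).
x_train, y_train, v = generate_mixture(n, d, rho, delta, rng=rng)
x_test, y_test, _ = generate_mixture(10 * n, d, rho, delta, v=v, rng=rng)

for lam in [1e-4, 1e-2, 1e0, 1e2]:
    models = {
        "square": RidgeClassifier(alpha=lam),
        "hinge": LinearSVC(C=1.0 / lam, loss="hinge", max_iter=10_000),
        "logistic": LogisticRegression(C=1.0 / lam, max_iter=10_000),
    }
    # Generalization error estimated as the misclassification rate on fresh data.
    errs = {name: 1.0 - clf.fit(x_train, y_train).score(x_test, y_test)
            for name, clf in models.items()}
    print(f"lambda = {lam:g}: {errs}")
```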