Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures
Authors: Yuan Cao, Quanquan Gu, Mikhail Belkin
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we study this benign overfitting phenomenon of the maximum margin classifier for linear classification problems. Specifically, we consider data generated from sub-Gaussian mixtures, and provide a tight risk bound for the maximum margin linear classifier in the over-parameterized setting. Our results precisely characterize the condition under which benign overfitting can occur in linear classification problems, and improve on previous work. They also have direct implications for over-parameterized logistic regression. |
| Researcher Affiliation | Academia | Yuan Cao, Department of Statistics & Actuarial Science and Department of Mathematics, The University of Hong Kong, EMAIL; Quanquan Gu, Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, USA, EMAIL; Mikhail Belkin, Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA 92093, USA, EMAIL |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide a direct link or explicit statement in the main text about the availability of its source code. |
| Open Datasets | No | We consider a model where the feature vectors are generated from a mixture of two sub-Gaussian distributions with means µ and −µ and the same covariance matrix Σ. We consider n training data points (xi, yi) generated independently from the above procedure. |
| Dataset Splits | No | The paper defines training data generation but does not specify any dataset splits (e.g., train/validation/test percentages or counts). |
| Hardware Specification | No | The paper states 'All experiments can be run very efficiently on a standard PC.' but does not provide specific hardware details (e.g., CPU/GPU model, memory). |
| Software Dependencies | No | The paper does not list any specific software dependencies with version numbers. |
| Experiment Setup | No | As a theoretical paper, it defines a model and assumptions but does not describe an experimental setup with hyperparameters or training configurations for empirical evaluation. |
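The data model quoted above (features drawn from a mixture of two sub-Gaussian distributions with means ±µ and shared covariance Σ) is easy to instantiate. The sketch below is illustrative only, not the paper's code: it uses Gaussian noise with Σ = I, an arbitrary mean vector µ, and plain gradient descent on the logistic loss, whose iterates are known to converge in direction to the maximum margin classifier on separable data. In the over-parameterized regime (d ≫ n) the classifier interpolates the training set yet can still generalize — the benign overfitting phenomenon the paper analyzes.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 50, 500          # over-parameterized: d >> n
mu = np.zeros(d)
mu[0] = 5.0             # illustrative mean vector; ||mu|| controls the signal strength

def sample(m):
    """Draw m points from the sub-Gaussian mixture model: x = y * mu + noise."""
    y = rng.choice([-1.0, 1.0], size=m)
    x = y[:, None] * mu + rng.standard_normal((m, d))  # Gaussian noise, Sigma = I
    return x, y

X, y = sample(n)

# Gradient descent on the logistic loss; for linearly separable data the
# direction of w converges to the maximum margin (hard-margin SVM) solution.
w = np.zeros(d)
lr = 0.1
for _ in range(2000):
    margins = y * (X @ w)
    grad = -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
    w -= lr * grad

train_err = np.mean(np.sign(X @ w) != y)   # interpolation: expect 0 training error
Xt, yt = sample(2000)
test_err = np.mean(np.sign(Xt @ w) != yt)  # yet test error stays small
print(train_err, test_err)
```

With these (assumed) settings the classifier fits the training data perfectly while keeping a small test error, matching the regime the paper's risk bound characterizes: benign overfitting occurs when the signal ‖µ‖ is strong relative to the ambient dimension and sample size.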