Agnostic Learning of Halfspaces with Gradient Descent via Soft Margins

Authors: Spencer Frei, Yuan Cao, Quanquan Gu

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We analyze the properties of gradient descent on convex surrogates for the zero-one loss for the agnostic learning of halfspaces. We show that when a quantity we refer to as the soft margin is well-behaved (a condition satisfied by log-concave isotropic distributions, among others), minimizers of convex surrogates for the zero-one loss are approximate minimizers for the zero-one loss itself. As standard convex optimization arguments lead to efficient guarantees for minimizing convex surrogates of the zero-one loss, our methods allow for the first positive guarantees for the classification error of halfspaces learned by gradient descent using the binary cross-entropy or hinge loss in the presence of agnostic label noise.
Researcher Affiliation | Academia | Department of Statistics, UCLA; Department of Computer Science, UCLA.
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks; it focuses on theoretical derivations and proofs.
Open Source Code | No | The paper does not provide any links or statements about the availability of open-source code for the methodology described.
Open Datasets | No | The paper is theoretical and does not conduct experiments on specific datasets. It refers to a joint distribution D over (x, y) and to samples {(x_i, y_i)}_{i=1}^n drawn i.i.d. from D as abstract objects in its theoretical framework, but no concrete public dataset is mentioned.
Dataset Splits | No | The paper is theoretical and does not conduct experiments on specific datasets, so no training, validation, or test splits are discussed.
Hardware Specification | No | The paper is theoretical and does not conduct any computational experiments, so no hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and does not discuss specific software dependencies with version numbers required for reproducibility, as it presents no empirical experiments.
Experiment Setup | No | The paper is theoretical and does not describe an experimental setup, hyperparameters, or system-level training settings, as no empirical experiments are conducted.
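Since the paper contains no pseudocode or experiments, the following is only an illustrative sketch of the setting it analyzes: gradient descent on the binary cross-entropy (logistic) surrogate for learning a halfspace under label noise. The synthetic data model, 5% noise rate, step size, and iteration count here are assumptions made for the demo, not values taken from the paper.

```python
# Illustrative sketch (not from the paper): full-batch gradient descent on the
# logistic surrogate for the zero-one loss, with agnostic-style label noise.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic halfspace data; isotropic Gaussian marginals are log-concave,
# matching the class of distributions the paper's soft-margin condition covers.
d, n = 5, 2000
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)
X = rng.standard_normal((n, d))
y = np.sign(X @ w_star)
flip = rng.random(n) < 0.05        # assumed 5% label-noise rate for the demo
y[flip] *= -1

def logistic_loss_grad(w, X, y):
    """Gradient of the average binary cross-entropy over margins y * <w, x>."""
    margins = y * (X @ w)
    s = -y / (1.0 + np.exp(margins))  # d/dm of log(1 + exp(-m)), times y
    return (X * s[:, None]).mean(axis=0)

# Plain gradient descent on the convex surrogate.
w = np.zeros(d)
eta = 1.0
for _ in range(500):
    w -= eta * logistic_loss_grad(w, X, y)

# Zero-one (classification) error of the learned halfspace on the sample;
# it should land close to the injected noise rate.
zero_one_error = np.mean(np.sign(X @ w) != y)
print(round(zero_one_error, 3))
```

The point of the sketch is the paper's pipeline: minimize a convex surrogate with gradient descent, then read off a guarantee for the zero-one loss itself; the hinge loss could be substituted for the logistic loss in `logistic_loss_grad` with only the gradient formula changing.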