Agnostic Learning of Halfspaces with Gradient Descent via Soft Margins
Authors: Spencer Frei, Yuan Cao, Quanquan Gu
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We analyze the properties of gradient descent on convex surrogates for the zero-one loss for the agnostic learning of halfspaces. We show that when a quantity we refer to as the soft margin is well-behaved (a condition satisfied by log-concave isotropic distributions, among others), minimizers of convex surrogates for the zero-one loss are approximate minimizers for the zero-one loss itself. As standard convex optimization arguments lead to efficient guarantees for minimizing convex surrogates of the zero-one loss, our methods allow for the first positive guarantees for the classification error of halfspaces learned by gradient descent using the binary cross-entropy or hinge loss in the presence of agnostic label noise. (A minimal illustrative sketch of this surrogate-loss setup appears after the table.) |
| Researcher Affiliation | Academia | Department of Statistics, UCLA; Department of Computer Science, UCLA. |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. It focuses on theoretical derivations and proofs. |
| Open Source Code | No | The paper does not provide any specific links or statements about the availability of open-source code for the methodology described. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments on specific datasets. It refers to a 'joint distribution over (x, y)' and to samples {(x_i, y_i)}_{i=1}^n drawn i.i.d. from D as abstract objects in its theoretical framework, but no concrete public dataset is mentioned. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments on specific datasets, thus no training, validation, or test splits are discussed. |
| Hardware Specification | No | The paper is theoretical and does not conduct any computational experiments, thus no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not discuss specific software dependencies with version numbers required for reproducibility, as it does not present empirical experiments. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup, hyperparameters, or system-level training settings, as no empirical experiments are conducted. |
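
The paper is purely theoretical, so the sketch below is not an implementation of any experiment from it. It is a minimal illustration, under assumed settings, of the setup the paper analyzes: plain gradient descent on a convex surrogate of the zero-one loss (here the binary cross-entropy, i.e. logistic, loss) for learning a halfspace under agnostic label noise, with Gaussian inputs (a log-concave isotropic distribution, one class for which the paper's soft-margin condition holds). The dimension, sample size, noise rate, step size, and iteration count are all illustrative assumptions, not values from the paper.

```python
# Purely illustrative sketch (the paper runs no experiments): gradient
# descent on a convex surrogate of the zero-one loss, the logistic
# (binary cross-entropy) loss, for learning a halfspace sign(<w, x>)
# under agnostic label noise. All constants below are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# x ~ N(0, I_d) is log-concave isotropic; labels come from a
# ground-truth halfspace, then a fraction are flipped to mimic
# agnostic label noise.
d, n, noise_rate = 20, 5000, 0.1
w_star = np.ones(d) / np.sqrt(d)           # ground-truth halfspace direction
X = rng.standard_normal((n, d))
y = np.sign(X @ w_star)
flip = rng.random(n) < noise_rate          # adversarial-noise stand-in
y[flip] *= -1.0

def logistic_grad(w, X, y):
    """Gradient of the empirical logistic loss (1/n) sum_i log(1 + exp(-y_i <w, x_i>))."""
    margins = np.clip(y * (X @ w), -50, 50)    # clip to avoid overflow in exp
    coeffs = -y / (1.0 + np.exp(margins))      # d/dm log(1 + e^{-m}) = -1 / (1 + e^m)
    return (X * coeffs[:, None]).mean(axis=0)

# Plain gradient descent on the convex surrogate.
w = np.zeros(d)
eta, T = 1.0, 500
for _ in range(T):
    w -= eta * logistic_grad(w, X, y)

# The quantity the paper's guarantees concern: the zero-one (classification) error.
err = np.mean(np.sign(X @ w) != y)
print(f"empirical zero-one error: {err:.3f} (label-flip rate: {noise_rate})")
```

On this synthetic setup the learned halfspace should align closely with w_star, so the empirical zero-one error lands near the label-flip rate, matching the flavor of the paper's guarantees relating surrogate-loss minimizers to approximate zero-one-loss minimizers.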