Agnostic Learnability of Halfspaces via Logistic Loss
Authors: Ziwei Ji, Kwangjun Ahn, Pranjal Awasthi, Satyen Kale, Stefani Karp
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We investigate approximation guarantees provided by logistic regression for the fundamental problem of agnostic learning of homogeneous halfspaces. In this paper, we close this gap by constructing a well-behaved distribution such that the global minimizer of the logistic risk over this distribution only achieves Ω(OPT) misclassification risk, matching the upper bound in (Frei et al., 2021b). On the other hand, we also show that if we impose a radial-Lipschitzness condition in addition to well-behaved-ness on the distribution, logistic regression on a ball of bounded radius reaches O(OPT) misclassification risk. Our techniques also show for any well-behaved distribution, regardless of radial Lipschitzness, we can overcome the Ω(OPT) lower bound for logistic loss simply at the cost of one additional convex optimization step involving the hinge loss and attain O(OPT) misclassification risk. This two-step convex optimization algorithm is simpler than previous methods obtaining this guarantee, all of which require solving O(log(1/OPT)) minimization problems. |
| Researcher Affiliation | Collaboration | Ziwei Ji 1 Kwangjun Ahn 2 Pranjal Awasthi 3 Satyen Kale 3 Stefani Karp 3 4 ... 1Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA 2Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA 3Google Research, New York City, NY, USA 4School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA. |
| Pseudocode | No | The paper describes algorithms (e.g., two-phase algorithm, projected gradient descent) but does not present them in a structured pseudocode block or algorithm environment. |
| Open Source Code | No | No statement or link regarding the public availability of source code for the described methodology was found. |
| Open Datasets | No | The paper constructs a theoretical distribution Q for analysis (Section 2) and discusses properties of abstract distributions (Px) rather than using or providing access to a publicly available dataset for empirical training. |
| Dataset Splits | No | This is a theoretical paper focused on proofs and algorithmic analysis; it does not involve empirical experiments requiring dataset splits for validation. |
| Hardware Specification | No | This is a theoretical paper and does not describe any specific hardware used for experiments. |
| Software Dependencies | No | This is a theoretical paper and does not list specific software dependencies with version numbers for experimental reproduction. |
| Experiment Setup | No | This is a theoretical paper and does not provide specific experimental setup details such as hyperparameters or training configurations. |
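Although the paper presents no pseudocode, the two-phase procedure it describes (projected gradient descent on the logistic risk over a ball of bounded radius, followed by one additional convex optimization step involving the hinge loss) can be sketched in code. The following is a minimal, hedged illustration of that two-step structure on synthetic data, not the paper's exact algorithm: all function names, step sizes, iteration counts, and the radius are assumptions chosen for the sketch.

```python
import numpy as np

def project_ball(w, radius):
    """Project w onto the Euclidean ball of the given radius."""
    norm = np.linalg.norm(w)
    return w if norm <= radius else w * (radius / norm)

def logistic_phase(X, y, radius=1.0, lr=0.1, steps=500):
    """Phase 1 (sketch): projected gradient descent on the empirical
    logistic risk over a ball of bounded radius."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        margins = y * (X @ w)
        # Gradient of mean log(1 + exp(-y_i <w, x_i>)):
        # -mean_i y_i x_i / (1 + exp(y_i <w, x_i>))
        grad = -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
        w = project_ball(w - lr * grad, radius)
    return w

def hinge_phase(X, y, w0, radius=1.0, lr=0.1, steps=500):
    """Phase 2 (sketch): one additional convex optimization step, here
    projected subgradient descent on the hinge loss, warm-started from
    the logistic-phase solution."""
    w = w0.copy()
    n = len(y)
    for _ in range(steps):
        margins = y * (X @ w)
        active = margins < 1.0  # points with nonzero hinge loss
        grad = -(X[active] * y[active, None]).sum(axis=0) / n
        w = project_ball(w - lr * grad, radius)
    return w
```

A usage example on a noiseless synthetic halfspace: draw Gaussian features, label them by a fixed homogeneous halfspace, run both phases, and measure the misclassification rate of the final predictor `sign(X @ w)`.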