Sample-Optimal PAC Learning of Halfspaces with Malicious Noise
Authors: Jie Shen
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study efficient PAC learning of homogeneous halfspaces in ℝ^d in the presence of the malicious noise of Valiant (1985). This is a challenging noise model, and only recently has a near-optimal noise tolerance bound been established under the mild condition that the unlabeled data distribution is isotropic log-concave. However, it remains unsettled how to obtain the optimal sample complexity simultaneously. In this work, we present a new analysis of the algorithm of Awasthi et al. (2017) and show that it essentially achieves the near-optimal sample complexity bound of Õ(d), improving the best known result of Õ(d²). Our main ingredient is a novel incorporation of a matrix Chernoff-type inequality to bound the spectrum of an empirical covariance matrix for well-behaved distributions, in conjunction with a careful exploration of the localization schemes of Awasthi et al. (2017). We further extend the algorithm and analysis to the more general and stronger nasty noise model of Bshouty et al. (2002), showing that it is still possible to achieve near-optimal noise tolerance and sample complexity in polynomial time. |
| Researcher Affiliation | Academia | Stevens Institute of Technology, Hoboken, New Jersey, USA. Correspondence to: Jie Shen <jie.shen@stevens.edu>. |
| Pseudocode | Yes | Algorithm 1 Efficient and Sample-Optimal Algorithm Tolerating Malicious Noise; Algorithm 3 Efficient and Sample-Optimal Algorithm Tolerating Nasty Noise |
| Open Source Code | No | The paper does not provide any statement or link indicating the release of open-source code for the described methodology. |
| Open Datasets | No | The paper focuses on theoretical analysis under 'isotropic log-concave distributions' and does not mention using or providing access information for a specific publicly available dataset for training. |
| Dataset Splits | No | The paper is theoretical and does not discuss empirical dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not report on empirical experiments, therefore no hardware specifications are provided. |
| Software Dependencies | No | The paper is theoretical and does not report on empirical experiments, therefore no specific software dependencies with version numbers are listed. |
| Experiment Setup | No | The paper is theoretical and does not report on empirical experiments, therefore no specific experimental setup details or hyperparameters are provided. |
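The paper's key ingredient, per its abstract, is a matrix Chernoff-type inequality showing that the spectrum of an empirical covariance matrix concentrates once roughly Õ(d) samples are drawn from a well-behaved distribution. The sketch below is a minimal numerical illustration of that phenomenon only, not the paper's algorithm: it assumes isotropic Gaussian data (a special case of isotropic log-concave) and checks that the empirical covariance eigenvalues cluster near 1.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50
n = 20 * d  # n = O(d) samples, up to log factors

# Isotropic Gaussian samples: a special case of an isotropic
# log-concave distribution, so the true covariance is the identity.
X = rng.standard_normal((n, d))

# Empirical covariance. Matrix Chernoff-type bounds control how far
# its eigenvalues can stray from 1 in this n = Õ(d) regime.
Sigma_hat = X.T @ X / n
eigvals = np.linalg.eigvalsh(Sigma_hat)

print(f"spectrum in [{eigvals.min():.2f}, {eigvals.max():.2f}]")
```

With n only a modest multiple of d, the printed interval already hugs 1; shrinking n toward d widens it, which is the sample-size dependence the paper's analysis sharpens from Õ(d²) to Õ(d).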