Efficient active learning of sparse halfspaces with arbitrary bounded noise
Authors: Chicheng Zhang, Jie Shen, Pranjal Awasthi
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study active learning of homogeneous $s$-sparse halfspaces in $\mathbb{R}^d$ under the setting where the unlabeled data distribution is isotropic log-concave and each label is flipped with probability at most $\eta$ for a parameter $\eta \in [0, \frac{1}{2})$, known as the bounded noise. We develop an attribute-efficient learning algorithm that runs in polynomial time and achieves a label complexity of $\tilde{O}\big(\frac{s}{(1-2\eta)^4}\,\mathrm{polylog}(d, \frac{1}{\epsilon})\big)$, provided that the underlying Bayes classifier is an $s$-sparse halfspace. This is the first efficient algorithm with label complexity polynomial in $\frac{1}{1-2\eta}$ in this setting, which is label-efficient even for $\eta$ arbitrarily close to $\frac{1}{2}$. Our active learning algorithm and its theoretical guarantees also immediately translate to new state-of-the-art label and sample complexity results for full-dimensional active and passive halfspace learning under arbitrary bounded noise. Theorem 2 (Main result): Suppose Algorithm 1 is run under a distribution $D$ such that Assumptions 1 and 2 are satisfied. Then with probability $1-\delta$, it returns a halfspace $\hat{u}$ such that $\mathrm{err}(h_{\hat{u}}, D) \le \mathrm{err}(h_{u}, D) + \epsilon$. Moreover, the algorithm tolerates any noise rate $\eta \in [0, 1/2)$ and asks for a total of $\tilde{O}\big(\frac{s}{(1-2\eta)^4}\,\mathrm{polylog}(d, \frac{1}{\epsilon})\big)$ labeled examples. (The main bound is restated below the table.) |
| Researcher Affiliation | Collaboration | Chicheng Zhang (University of Arizona, chichengz@cs.arizona.edu); Jie Shen (Stevens Institute of Technology, jie.shen@stevens.edu); Pranjal Awasthi (Google Research and Rutgers University, pranjalawasthi@google.com) |
| Pseudocode | Yes | Algorithm 1 (Main algorithm), Algorithm 2 (REFINE), Algorithm 3 (INITIALIZE) |
| Open Source Code | No | The paper does not mention releasing open-source code or provide any links to a code repository. |
| Open Datasets | No | The paper operates under theoretical assumptions about data distributions (e.g., "isotropic log-concave distribution") and does not refer to specific, named, publicly available datasets for training or evaluation. |
| Dataset Splits | No | The paper is theoretical and does not describe experimental dataset splits for training, validation, or testing, nor does it refer to standard predefined splits for any dataset. |
| Hardware Specification | No | The paper focuses on theoretical algorithm design and complexity analysis. It does not describe any specific hardware (e.g., GPU/CPU models, memory) used to run experiments. |
| Software Dependencies | No | The paper describes algorithms and theoretical guarantees. It does not mention any specific software dependencies or their version numbers that would be needed to reproduce experiments. |
| Experiment Setup | No | The paper specifies algorithmic parameters (e.g., step sizes, bandwidth, number of iterations) in the pseudocode and theorems, but these are derived theoretically in terms of other parameters (d, s, epsilon, delta, k). It does not provide concrete numerical hyperparameter values or system-level training settings characteristic of an empirical experimental setup. |
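
For reference, the main guarantee (Theorem 2) quoted in the Research Type row can be stated compactly as follows; here $u$ is the Bayes-optimal $s$-sparse halfspace, $\hat{u}$ is the halfspace returned by Algorithm 1, $\eta$ is the bounded-noise rate, and the polylog arguments follow the paper's abstract:

```latex
\[
\Pr\big[\,\mathrm{err}(h_{\hat{u}}, D) \le \mathrm{err}(h_{u}, D) + \epsilon\,\big] \ge 1 - \delta,
\qquad
\text{label complexity} \;=\; \tilde{O}\!\left(\frac{s}{(1-2\eta)^{4}}\,\mathrm{polylog}\!\left(d, \tfrac{1}{\epsilon}\right)\right).
\]
```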