Efficient active learning of sparse halfspaces with arbitrary bounded noise

Authors: Chicheng Zhang, Jie Shen, Pranjal Awasthi

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We study active learning of homogeneous s-sparse halfspaces in R^d under the setting where the unlabeled data distribution is isotropic log-concave and each label is flipped with probability at most η for a parameter η ∈ [0, 1/2), known as the bounded noise model. This is the first efficient algorithm with label complexity polynomial in 1/(1 − 2η) in this setting, which is label-efficient even for η arbitrarily close to 1/2. Our active learning algorithm and its theoretical guarantees also immediately translate to new state-of-the-art label and sample complexity results for full-dimensional active and passive halfspace learning under arbitrary bounded noise. We develop an attribute-efficient learning algorithm that runs in polynomial time and achieves a label complexity of Õ(s/(1 − 2η)^4 · polylog(d, 1/ε)), provided that the underlying Bayes classifier is an s-sparse halfspace. Theorem 2 (Main result): Suppose Algorithm 1 is run under a distribution D such that Assumptions 1 and 2 are satisfied. Then with probability 1 − δ, it returns a halfspace û such that err(h_û, D) − err(h_u, D) ≤ ε. Moreover, our algorithm tolerates any noise rate η ∈ [0, 1/2) and asks for a total of Õ(s/(1 − 2η)^4 · polylog(d, 1/ε, 1/δ)) labeled examples. (A minimal simulation sketch of this setting appears after the table.)
Researcher Affiliation | Collaboration | Chicheng Zhang (University of Arizona, chichengz@cs.arizona.edu); Jie Shen (Stevens Institute of Technology, jie.shen@stevens.edu); Pranjal Awasthi (Google Research and Rutgers University, pranjalawasthi@google.com)
Pseudocode | Yes | Algorithm 1 (Main algorithm), Algorithm 2 (REFINE), Algorithm 3 (INITIALIZE)
Open Source Code | No | The paper does not mention releasing open-source code or provide any links to a code repository.
Open Datasets | No | The paper operates under theoretical assumptions about data distributions (e.g., an isotropic log-concave unlabeled distribution) and does not refer to specific, named, publicly available datasets for training or evaluation.
Dataset Splits | No | The paper is theoretical and does not describe experimental dataset splits for training, validation, or testing, nor does it refer to standard predefined splits for any dataset.
Hardware Specification | No | The paper focuses on theoretical algorithm design and complexity analysis. It does not describe any specific hardware (e.g., GPU/CPU models, memory) used to run experiments.
Software Dependencies | No | The paper describes algorithms and theoretical guarantees. It does not mention any specific software dependencies or their version numbers that would be needed to reproduce experiments.
Experiment Setup | No | The paper details algorithmic parameters (e.g., step sizes, bandwidth, number of iterations) within the pseudocode and theorems, but these are derived theoretically in terms of other parameters (d, s, ε, δ, k). It does not provide concrete numerical hyperparameter values or system-level training settings characteristic of an empirical experimental setup.
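
Illustrative note: the Research Type row quotes the paper's problem setting and Theorem 2. The following is a minimal, hedged simulation sketch of that setting, written purely as a reading aid. It is not the paper's Algorithm 1; all variable names and parameter values are illustrative assumptions, and a standard Gaussian stands in as one convenient instance of an isotropic log-concave distribution.

```python
import numpy as np

# Sketch of the bounded-noise, s-sparse halfspace setting quoted above.
# Assumptions: parameter values are arbitrary; the standard Gaussian is used
# in place of a generic isotropic log-concave unlabeled distribution.
rng = np.random.default_rng(0)
d, s, eta, n = 1000, 10, 0.45, 5000   # dimension, sparsity, noise rate < 1/2, pool size

# Ground-truth s-sparse, unit-norm halfspace u (the Bayes classifier h_u).
u = np.zeros(d)
u[rng.choice(d, size=s, replace=False)] = rng.standard_normal(s)
u /= np.linalg.norm(u)

# Unlabeled pool drawn from an isotropic Gaussian.
X = rng.standard_normal((n, d))

# Clean labels from the homogeneous halfspace h_u(x) = sign(<u, x>).
y_clean = np.sign(X @ u)

# Bounded noise: each label is flipped independently with probability eta < 1/2,
# so the observed label still agrees with h_u with probability 1 - eta > 1/2
# and h_u remains the Bayes classifier.
y = np.where(rng.random(n) < eta, -y_clean, y_clean)

def empirical_excess_error(w):
    """Sample estimate of err(h_w, D) - err(h_u, D), the quantity Theorem 2 bounds by eps."""
    return np.mean(np.sign(X @ w) != y) - np.mean(y_clean != y)

print(empirical_excess_error(u))                        # ~0 for the Bayes classifier
print(empirical_excess_error(rng.standard_normal(d)))   # positive for a random halfspace
```

In these terms, Theorem 2 guarantees that the paper's Algorithm 1 drives this excess error below ε with probability 1 − δ while requesting only Õ(s/(1 − 2η)^4 · polylog(d, 1/ε, 1/δ)) labels from the pool.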