Beyond Disagreement-Based Agnostic Active Learning
Authors: Chicheng Zhang, Kamalika Chaudhuri
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we provide such an algorithm. Our solution is based on two key contributions, which may be of independent interest. The first is a general connection between confidence-rated predictors and active learning. ... Our second key contribution is a novel confidence-rated predictor with guaranteed error that applies to any general classification problem. We show that our predictor is optimal in the realizable case, in the sense that it has the lowest abstention rate out of all predictors guaranteeing a certain error. Moreover, we show how to extend our predictor to the agnostic setting. Combining the label query algorithm with our novel confidence-rated predictor, we get a general active learning algorithm consistent in the agnostic setting. We provide a characterization of the label complexity of our algorithm, and show that this is better than the bounds known for disagreement-based active learning in general. Finally, we show that for linear classification with respect to the uniform distribution and log-concave distributions, our bounds reduce to those of [3, 4]. |
| Researcher Affiliation | Academia | Chicheng Zhang University of California, San Diego 9500 Gilman Drive, La Jolla, CA 92093 chichengzhang@ucsd.edu Kamalika Chaudhuri University of California, San Diego 9500 Gilman Drive, La Jolla, CA 92093 kamalika@cs.ucsd.edu |
| Pseudocode | Yes | Algorithm 1 Active Learning Algorithm: Outline; Algorithm 2 An Adaptive Algorithm for Label Query Given Target Excess Error; Algorithm 3 Confidence-rated Predictor |
| Open Source Code | No | The paper does not provide any links to open-source code or explicitly state that code for their methodology is available. |
| Open Datasets | No | The paper does not refer to specific, publicly available datasets. It discusses theoretical properties related to data distributions but not actual datasets used in experiments. |
| Dataset Splits | No | The paper is theoretical and does not describe empirical experiments with dataset splits. Therefore, no validation split information is provided. |
| Hardware Specification | No | The paper is theoretical and does not discuss hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not specify any software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an empirical experimental setup with hyperparameters or training configurations. |
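The abstract quoted above centers on confidence-rated predictors, which abstain from predicting on points where confidence is low and thereby identify the points whose labels an active learner should query. The following is a minimal sketch of that idea, not the paper's Algorithm 3: the margin-threshold rule and all names here are illustrative assumptions.

```python
# Minimal sketch of a confidence-rated predictor (illustrative only;
# not the paper's Algorithm 3). The margin-threshold abstention rule
# and the function name are assumptions for this example.

def confidence_rated_predict(score, threshold=0.5):
    """Predict +1/-1 when |score| exceeds the threshold, else abstain (0).

    `score` is a real-valued classifier output. Points with small
    |score| lie near the decision boundary, so the predictor abstains;
    in an active learner, those are the points whose labels get queried.
    """
    if score >= threshold:
        return 1
    if score <= -threshold:
        return -1
    return 0  # abstain: confidence too low


# Example: classifier scores for four unlabeled points.
scores = [0.9, -0.7, 0.1, -0.2]
predictions = [confidence_rated_predict(s) for s in scores]
# The two near-boundary points (scores 0.1 and -0.2) are abstained on.
```

The connection the paper formalizes is that a predictor with a guaranteed error rate on its non-abstained predictions lets the active learner spend its label budget only on the abstained region.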