Convergence Rates of Active Learning for Maximum Likelihood Estimation

Authors: Kamalika Chaudhuri, Sham M. Kakade, Praneeth Netrapalli, Sujay Sanghavi

NeurIPS 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | Previous theoretical work has rigorously characterized the label complexity of active learning, but most of this work has focused on the PAC or the agnostic PAC model. In this paper, we shift our attention to a more general setting: maximum likelihood estimation. Provided certain conditions hold on the model class, we give a two-stage active learning algorithm for this problem. The conditions we require are fairly general, and cover the widely popular class of Generalized Linear Models, which, in turn, include models for binary and multi-class classification, regression, and conditional random fields. We provide an upper bound on the label requirement of our algorithm, and a lower bound that matches it up to lower order terms. Our analysis shows that, unlike binary classification in the realizable case, just a single extra round of interaction is sufficient to achieve near-optimal performance in maximum likelihood estimation.
Researcher Affiliation | Collaboration | Dept. of CS, University of California at San Diego (kamalika@cs.ucsd.edu); Dept. of CS and of Statistics, University of Washington (sham@cs.washington.edu); Microsoft Research New England (praneeth@microsoft.com); Dept. of ECE, The University of Texas at Austin (sanghavi@mail.utexas.edu).
Pseudocode | Yes | Algorithm 1 (Active Set Select).
Input: samples x_i, for i = 1, ..., n (the unlabeled pool U).
1: Draw m_1 samples u.a.r. from U, and query their labels to get S_1.
2: Use S_1 to solve the MLE problem: \hat{\theta}_1 = \arg\min_{\theta \in \Theta} \sum_{(x,y) \in S_1} -\log p(y \mid x, \theta).
3: Solve the following SDP (refer Lemma 3): a^* = \arg\min_a \mathrm{Tr}\big( (\sum_i a_i I(x_i, \hat{\theta}_1))^{-1} \big) subject to 0 \le a_i \le 1, \sum_i a_i = m_2.
4: Draw m_2 examples using probability \Gamma = \lambda \Gamma_1 + (1 - \lambda) U, where the distribution \Gamma_1 = a^* / m_2. Query their labels to get S_2.
5: Use S_2 to solve the MLE problem: \hat{\theta}_2 = \arg\min_{\theta \in \Theta} \sum_{(x,y) \in S_2} -\log p(y \mid x, \theta).
(A runnable sketch of this two-stage procedure, specialized to logistic regression, follows the table.)
Open Source Code | No | The paper does not provide any statements or links indicating that source code for the described methodology is available.
Open Datasets | No | The paper is theoretical and does not use or describe any specific datasets for training or experimentation, so there is no information about public availability.
Dataset Splits | No | The paper is theoretical and does not describe experimental setups with training/validation/test dataset splits.
Hardware Specification | No | The paper is theoretical and does not discuss hardware specifications for running experiments.
Software Dependencies | No | The paper is theoretical and does not discuss software dependencies or version numbers needed for experimental reproducibility.
Experiment Setup | No | The paper is theoretical and focuses on algorithm design and proofs; it does not include details of an experimental setup such as hyperparameters or system-level training settings.
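Since no code accompanies the paper, the following is a minimal, hypothetical Python sketch of Algorithm 1 specialized to logistic regression (one of the GLMs covered by the paper's conditions), under stated assumptions rather than a definitive implementation. The names oracle, fit_mle, select_weights, and active_set_select, the mixing weight lam, and the use of cvxpy's matrix_frac atom to encode the trace objective are illustration choices, not the authors' implementation; labels are assumed to be in {-1, +1}.

# Hypothetical sketch of Algorithm 1 (Active Set Select) for logistic
# regression; NOT the authors' code (none was released with the paper).
import numpy as np
import cvxpy as cp
from scipy.optimize import minimize


def fit_mle(X, y):
    """Logistic-regression MLE: minimize sum_i log(1 + exp(-y_i x_i^T theta)),
    with labels y in {-1, +1}."""
    d = X.shape[1]
    nll = lambda theta: np.sum(np.logaddexp(0.0, -y * (X @ theta)))
    return minimize(nll, np.zeros(d), method="BFGS").x


def select_weights(X, theta, m2):
    """Step 3 (the SDP). For logistic regression the Fisher information of a
    point x is I(x, theta) = p(1-p) x x^T with p = sigmoid(x^T theta), so
    sum_i a_i I(x_i, theta) = X^T diag(a * w) X with w_i = p_i (1 - p_i)."""
    n, d = X.shape
    p = 1.0 / (1.0 + np.exp(-(X @ theta)))
    w = p * (1.0 - p)
    a = cp.Variable(n, nonneg=True)
    M = X.T @ cp.diag(cp.multiply(a, w)) @ X           # affine in a
    objective = cp.Minimize(cp.matrix_frac(np.eye(d), M))  # Tr(M^{-1})
    constraints = [a <= 1, cp.sum(a) == m2]
    cp.Problem(objective, constraints).solve()
    return np.clip(a.value, 0.0, None)


def active_set_select(X, oracle, m1, m2, lam=0.9, seed=0):
    """Two-stage procedure: a uniform stage to get a rough MLE, then sampling
    from the mixture Gamma = lam * Gamma_1 + (1 - lam) * Uniform."""
    n, _ = X.shape
    rng = np.random.default_rng(seed)
    # Stage 1: m1 labels drawn u.a.r. from the pool -> rough MLE theta_1.
    idx1 = rng.choice(n, size=m1, replace=True)
    theta1 = fit_mle(X[idx1], oracle(idx1))
    # Stage 2: optimized distribution Gamma_1 = a* / m2, mixed with uniform.
    a_star = select_weights(X, theta1, m2)
    gamma1 = a_star / a_star.sum()
    gamma = lam * gamma1 + (1.0 - lam) / n
    gamma = gamma / gamma.sum()                        # guard against round-off
    idx2 = rng.choice(n, size=m2, replace=True, p=gamma)
    return fit_mle(X[idx2], oracle(idx2))

Here oracle is a hypothetical stand-in for the labeling oracle: a callable taking pool indices and returning their +/-1 labels. Note that matrix_frac(np.eye(d), M) encodes the plain objective Tr(M^{-1}) shown in the reconstructed step 3; if the paper's Lemma 3 objective additionally weights the trace by the pool Fisher information, the identity matrix would be replaced by an estimate of I_U(\hat{\theta}_1).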