Gone Fishing: Neural Active Learning with Fisher Embeddings
Authors: Jordan Ash, Surbhi Goel, Akshay Krishnamurthy, Sham Kakade
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that BAIT outperforms the previous state of the art on both classification and regression problems, and is flexible enough to be used with a variety of model architectures. |
| Researcher Affiliation | Collaboration | Jordan T. Ash (Microsoft Research NYC, ash.jordan@microsoft.com); Surbhi Goel (Microsoft Research NYC, goel.surbhi@microsoft.com); Akshay Krishnamurthy (Microsoft Research NYC, akshaykr@microsoft.com); Sham Kakade (Microsoft Research NYC & University of Washington, sham.kakade@microsoft.com) |
| Pseudocode | Yes | Algorithm 1 BAIT.<br>Require: neural network f(x; θ), unlabeled pool of examples U, initial number of examples B_0, number of iterations T, number of examples in a batch B.<br>1: Initialize S by drawing B_0 labeled points from U and fit the model on S: θ_1 = argmin_θ E_S[ℓ(x, y; θ)]<br>2: for t = 1, 2, ..., T do<br>3: Compute I(θ_t^L) = (1/\|U\|) Σ_{x ∈ U} I(x; θ_t^L)<br>4: Initialize M_0 = λI + (1/\|S\|) Σ_{x ∈ S} I(x; θ_t^L)<br>5: for i = 1, 2, ..., 2B do {forward greedy optimization}<br>6: x̃ = argmin_{x ∈ U} tr((M_i + I(x; θ_t^L))^{-1} I(θ_t^L))<br>7: M_{i+1} ← M_i + I(x̃; θ_t^L); S ← S ∪ {x̃}<br>8: end for<br>9: for i = 2B, 2B−1, ..., B do {backward greedy optimization}<br>10: x̃ = argmin_{x ∈ S} tr((M_i − I(x; θ_t^L))^{-1} I(θ_t^L))<br>11: M_{i−1} ← M_i − I(x̃; θ_t^L); S ← S \ {x̃}<br>12: end for<br>13: Train model on S: θ_{t+1} = argmin_θ E_S[ℓ(x, y; θ)]<br>14: end for<br>15: return final model θ_{T+1} |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | Yes | We consider three datasets. Using an MLP, we perform active learning on both MNIST data and OpenML dataset 155. We also use the SVHN dataset [37] of color digit images with both an MLP and an 18-layer ResNet. Last we explore the CIFAR-10 object dataset [38] with a ResNet. |
| Dataset Splits | Yes | Each learner is initialized with 100 randomly sampled labeled points, and each experiment is repeated five times with different random seeds. Shadowed regions in plots denote standard error. More empirical details can be found in Appendix Section C. |
| Hardware Specification | No | The checklist question "Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)?" is answered [Yes], citing the appendix. However, the appendix is not included in the provided text, so no hardware specifics are available here. |
| Software Dependencies | No | The paper mentions training with the "Adam variant of SGD" but does not specify version numbers for any software or libraries. |
| Experiment Setup | Yes | All ResNets are trained with a learning rate of 0.01, and all other models (including linear models shown earlier) are trained with a learning rate of 0.0001. We fit parameters using the Adam variant of SGD, and use standard data augmentation for all CIFAR-10 experiments. Like other deep active learning work, we avoid warm-starting and retrain model parameters from a random initialization after each query round [5]. Each learner is initialized with 100 randomly sampled labeled points, and each experiment is repeated five times with different random seeds. Shadowed regions in plots denote standard error. More empirical details can be found in Appendix Section C. |
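The forward–backward greedy selection in Algorithm 1 can be sketched in a few lines of NumPy. This is a simplified illustration, not the authors' implementation: it assumes per-example Fisher matrices are precomputed as dense `d × d` arrays (the paper works with low-rank last-layer Fisher approximations for efficiency), starts from an empty labeled set, and the function name `bait_select` is hypothetical.

```python
import numpy as np

def bait_select(fishers_pool, batch_size, lam=1e-2):
    """Hypothetical sketch of BAIT's forward-backward greedy selection.

    fishers_pool: array of shape (N, d, d), per-example Fisher
    approximations I(x; theta^L) over the unlabeled pool U.
    Returns the indices of the selected batch of size batch_size.
    """
    N, d, _ = fishers_pool.shape
    I_U = fishers_pool.mean(axis=0)   # pool Fisher I(theta^L), step 3
    M = lam * np.eye(d)               # M_0 = lambda*I (empty labeled set), step 4
    selected = []

    def objective(M_candidate):
        # tr(M^{-1} I_U): lower means the batch better covers the pool Fisher
        return np.trace(np.linalg.solve(M_candidate, I_U))

    # forward pass (steps 5-8): greedily add 2B points
    for _ in range(2 * batch_size):
        scores = [objective(M + fishers_pool[j]) if j not in selected else np.inf
                  for j in range(N)]
        best = int(np.argmin(scores))
        selected.append(best)
        M = M + fishers_pool[best]

    # backward pass (steps 9-12): greedily remove B points
    for _ in range(batch_size):
        scores = [objective(M - fishers_pool[j]) for j in selected]
        worst = selected[int(np.argmin(scores))]
        selected.remove(worst)
        M = M - fishers_pool[worst]

    return selected
```

The over-select-then-prune structure (add 2B, remove B) is what distinguishes BAIT from a purely forward greedy rule: early forward picks can become redundant once later ones are added, and the backward sweep discards them.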