Active Learning with Logged Data
Authors: Songbai Yan, Kamalika Chaudhuri, Tara Javidi
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now empirically validate our theoretical results by comparing our algorithm with a few alternatives on several datasets and logging policies. In particular, we confirm that the test error of our classifier drops faster than several alternatives as the expected number of label queries increases. |
| Researcher Affiliation | Academia | University of California, San Diego. |
| Pseudocode | Yes | Algorithm 1 Active learning with logged data |
| Open Source Code | No | The paper mentions "Our implementation of above algorithms follows Vowpal Wabbit (vw). Details can be found in Appendix." and in the acknowledgements cites "Vowpal Wabbit. https://github.com/JohnLangford/vowpal_wabbit/". This is a third-party tool, not the authors' own source code for the described methodology. |
| Open Datasets | Yes | Experiments are conducted on synthetic data and 11 datasets from UCI datasets (Lichman, 2013) and LIBSVM datasets (Chang & Lin, 2011). |
| Dataset Splits | Yes | Specifically, we first randomly select 80% of the whole dataset as training data and the remaining 20% is test data. We randomly select 50% of the training set as logged data, and the remaining 50% is online data. (A split sketch appears after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or memory used for experiments. |
| Software Dependencies | No | The paper mentions using Vowpal Wabbit, but does not provide specific version numbers for it or any other software dependencies. |
| Experiment Setup | Yes | The parameter set p consists of two parameters. Model capacity C (see also item 4 in Appendix F.1): in our theoretical analysis there is a term C := O(log(\|H\|/δ)) in the bounds, which is known to be loose in practice (Hsu, 2010). Therefore, in experiments, we treat C as a parameter to tune; we try C in {0.01 · 2^k \| k = 0, 2, 4, ..., 18}. Learning rate η (see also item 3 in Appendix F.1): we use online gradient descent with step size determined by η; we try η in {0.0001 · 2^k \| k = 0, 2, 4, ..., 18}. (A grid sketch appears after the table.) |
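
The 80/20 train/test split and the 50/50 logged/online split quoted in the Dataset Splits row are simple to reproduce. The sketch below is ours, not the authors' code; the use of NumPy arrays, the fixed seed, and the name `split_dataset` are assumptions for illustration.

```python
import numpy as np

def split_dataset(X, y, seed=0):
    """Sketch of the splits quoted above: 80% train / 20% test,
    then the training set is halved into logged and online data.
    Assumes X and y are NumPy arrays of matching length."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(y))

    n_train = int(0.8 * len(y))            # 80% training, 20% test
    train_idx, test_idx = perm[:n_train], perm[n_train:]

    n_logged = n_train // 2                # 50% of training is logged data
    logged_idx, online_idx = train_idx[:n_logged], train_idx[n_logged:]

    return (
        (X[logged_idx], y[logged_idx]),    # logged data (labeled under the logging policy)
        (X[online_idx], y[online_idx]),    # online data (available for active label queries)
        (X[test_idx], y[test_idx]),        # held-out test data
    )
```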
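
Similarly, the two tuning grids quoted in the Experiment Setup row can be written down directly; the variable names below are hypothetical and not from the paper.

```python
# Hypothetical reconstruction of the tuning grids quoted in the table above.
capacity_grid = [0.01 * 2**k for k in range(0, 20, 2)]        # C in {0.01 * 2^k : k = 0, 2, ..., 18}
learning_rate_grid = [0.0001 * 2**k for k in range(0, 20, 2)] # eta in {0.0001 * 2^k : k = 0, 2, ..., 18}
```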