Active Learning for Cost-Sensitive Classification
Authors: Akshay Krishnamurthy, Alekh Agarwal, Tzu-Kuo Huang, Hal Daumé III, John Langford
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments with COAL show significant improvements in labeling effort and test cost over passive and active baselines. (Section 6, Experiments) |
| Researcher Affiliation | Collaboration | 1University of Massachusetts, Amherst, MA 2Microsoft Research, New York, NY 3Uber Advanced Technology Center, Pittsburgh, PA 4University of Maryland, College Park, MD. |
| Pseudocode | Yes | Algorithm 1 Cost Overlapped Active Learning (COAL) Algorithm 2 MAXCOST Algorithm 3 BINARYSEARCH(BSEARCH) |
| Open Source Code | No | The paper uses a third-party tool ('Vowpal Wabbit') for implementation but does not state that its own code for COAL is open-source or provide a link. |
| Open Datasets | Yes | ImageNet 20 and 40 are sub-trees of the ImageNet hierarchy... The third dataset, RCV1-v2 (Lewis et al., 2004), is a multilabel text-categorization dataset... |
| Dataset Splits | No | The paper mentions training and testing but does not explicitly provide details about training/test/validation dataset splits, such as percentages, sample counts, or predefined splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (like GPU/CPU models or memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'Vowpal Wabbit' but does not specify its version or the versions of any other software dependencies used. |
| Experiment Setup | Yes | There are two tuning parameters in our implementation. First, instead of Δ_i, we set the radius of the version space to ν₀ν_i/(i−1) (i.e. β = 0 and the log factor ν_i = log(2(i−1)²\|G\|K/δ) scales with log i) and instead tune the constant ν₀. This alternate mellowness parameter controls how aggressive the query strategy is. The second parameter is the learning rate used by online linear regression. While not always the best, we recommend a mellowness setting of 0.01 as it achieves reasonable performance on all three datasets. |
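To make the quoted setup concrete, here is a minimal sketch of how the mellowness-scaled version-space radius described above could be computed per epoch. The schedule ν₀·ν_i/(i−1), the log-factor form, and all names (`version_space_radius`, `nu0`, `delta`) are assumptions reconstructed from the quoted description, not the authors' released code.

```python
import math

def version_space_radius(i, n_regressors, K, nu0=0.01, delta=0.05):
    """Hypothetical version-space radius at epoch i (i >= 2).

    Assumes the schedule nu0 * nu_i / (i - 1), where the log factor
    nu_i = log(2 * (i - 1)**2 * |G| * K / delta) grows like log i.
    nu0 is the tunable "mellowness" constant; the paper recommends 0.01.
    """
    nu_i = math.log(2 * (i - 1) ** 2 * n_regressors * K / delta)
    return nu0 * nu_i / (i - 1)

# As more epochs pass, the radius shrinks, so the query strategy
# becomes progressively less aggressive.
radii = [version_space_radius(i, n_regressors=1000, K=20) for i in (2, 10, 100)]
```

A larger `nu0` widens the version space at every epoch, which translates into more label queries; this is the sense in which mellowness "controls how aggressive the query strategy is."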