Active Learning for Cost-Sensitive Classification
Authors: Akshay Krishnamurthy, Alekh Agarwal, Tzu-Kuo Huang, Hal Daumé III, John Langford
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments with COAL show significant improvements in labeling effort and test cost over passive and active baselines. (Section 6, Experiments) |
| Researcher Affiliation | Collaboration | 1University of Massachusetts, Amherst, MA 2Microsoft Research, New York, NY 3Uber Advanced Technology Center, Pittsburgh, PA 4University of Maryland, College Park, MD. |
| Pseudocode | Yes | Algorithm 1 Cost Overlapped Active Learning (COAL) Algorithm 2 MAXCOST Algorithm 3 BINARYSEARCH(BSEARCH) |
| Open Source Code | No | The paper uses a third-party tool ('Vowpal Wabbit') for implementation but does not state that its own code for COAL is open-source or provide a link. |
| Open Datasets | Yes | ImageNet 20 and 40 are sub-trees of the ImageNet hierarchy... The third dataset, RCV1-v2 (Lewis et al., 2004), is a multilabel text-categorization dataset... |
| Dataset Splits | No | The paper mentions training and testing but does not explicitly provide details about training/test/validation dataset splits, such as percentages, sample counts, or predefined splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (like GPU/CPU models or memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'Vowpal Wabbit' but does not specify its version or the versions of any other software dependencies used. |
| Experiment Setup | Yes | There are two tuning parameters in our implementation. First, instead of Δ_i, we set the radius of the version space to ν₀ν_i/(i−1) (i.e. β = 0 and the log factor ν_i = log(2(i−1)²\|G\|K/δ) scales with log i) and instead tune the constant ν₀. This alternate mellowness parameter controls how aggressive the query strategy is. The second parameter is the learning rate used by online linear regression. While not always the best, we recommend a mellowness setting of 0.01 as it achieves reasonable performance on all three datasets. |
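To make the quoted setup concrete, here is a minimal sketch of how the mellowness-scaled version-space radius described above could be computed per epoch. The schedule ν₀·ν_i/(i−1), the log-factor form, and all names (`version_space_radius`, `nu0`, `delta`) are assumptions reconstructed from the quoted description, not the authors' released code.

```python
import math

def version_space_radius(i, n_regressors, K, nu0=0.01, delta=0.05):
    """Hypothetical version-space radius at epoch i (i >= 2).

    Assumes the schedule nu0 * nu_i / (i - 1), where the log factor
    nu_i = log(2 * (i - 1)**2 * |G| * K / delta) grows like log i.
    nu0 is the tunable "mellowness" constant; the paper recommends 0.01.
    """
    nu_i = math.log(2 * (i - 1) ** 2 * n_regressors * K / delta)
    return nu0 * nu_i / (i - 1)

# As more epochs pass, the radius shrinks, so the query strategy
# becomes progressively less aggressive.
radii = [version_space_radius(i, n_regressors=1000, K=20) for i in (2, 10, 100)]
```

A larger `nu0` widens the version space at every epoch, which translates into more label queries; this is the sense in which mellowness "controls how aggressive the query strategy is."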