Cost-Effective Active Learning from Diverse Labelers

Authors: Sheng-Jun Huang, Jia-Lve Chen, Xin Mu, Zhi-Hua Zhou

IJCAI 2017

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on both UCI and real crowdsourcing data sets demonstrate the superiority of our proposed approach in selecting cost-effective queries. |
| Researcher Affiliation | Academia | Sheng-Jun Huang (1,3), Jia-Lve Chen (2,3), Xin Mu (2,3), and Zhi-Hua Zhou (2,3). (1) College of Computer Science & Technology, Nanjing University of Aeronautics & Astronautics; (2) National Key Laboratory for Novel Software Technology, Nanjing University; (3) Collaborative Innovation Center of Novel Software Technology and Industrialization. huangsj@nuaa.edu.cn; {chenjl, mux, zhouzh}@lamda.nju.edu.cn |
| Pseudocode | Yes | Algorithm 1: The CEAL Algorithm |
| Open Source Code | No | No explicit statement or link regarding open-source code for the methodology was found. |
| Open Datasets | Yes | We first perform the experimental study on 12 data sets from the University of California-Irvine (UCI) repository [Bache and Lichman, 2013]: austra, german, krvskp, spambase, splice, titato, vehicle, and ringnorm. |
| Dataset Splits | Yes | For each data set, 5% of the examples are sampled to initialize the labeled set L, 30% are held out as the test set for evaluating the classification model at each iteration, and the remaining 65% form the pool of unlabeled data for active selection (see the first sketch after this table). |
| Hardware Specification | No | No specific hardware details (GPU/CPU models, memory, etc.) were mentioned for running the experiments. |
| Software Dependencies | No | We also evaluate the performance on test data with a logistic regression model implemented with LIBLINEAR [Fan et al., 2008] using default parameters (see the second sketch after this table). |
| Experiment Setup | No | We also evaluate the performance on test data with a logistic regression model implemented with LIBLINEAR [Fan et al., 2008] using default parameters. |
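The Dataset Splits row above describes a concrete 5% / 30% / 65% protocol for each data set. Below is a minimal sketch of that split, assuming plain uniform random sampling (the paper does not say whether the split is stratified); the helper name `split_for_active_learning` is hypothetical.

```python
import numpy as np

def split_for_active_learning(n_examples, init_frac=0.05, test_frac=0.30, seed=0):
    """Split example indices into an initial labeled set L, a held-out
    test set, and an unlabeled pool, following the 5% / 30% / 65%
    protocol quoted in the Dataset Splits row."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_examples)
    n_init = int(init_frac * n_examples)
    n_test = int(test_frac * n_examples)
    init_idx = idx[:n_init]                  # 5%: initial labeled set L
    test_idx = idx[n_init:n_init + n_test]   # 30%: held-out test set
    pool_idx = idx[n_init + n_test:]         # remaining 65%: unlabeled pool
    return init_idx, test_idx, pool_idx
```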
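The Software Dependencies and Experiment Setup rows both point to a logistic regression model trained with LIBLINEAR under default parameters. The sketch below is a rough stand-in, not the authors' setup: it uses scikit-learn's `liblinear` solver (which wraps the LIBLINEAR library), whose defaults (L2 penalty, C=1.0) may not exactly match the LIBLINEAR command-line defaults.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def evaluate_model(X, y, labeled_idx, test_idx):
    """Train logistic regression on the current labeled set and report
    accuracy on the held-out test set, as done at each iteration."""
    # scikit-learn's "liblinear" solver wraps the LIBLINEAR library
    # cited in the paper; its default settings approximate, but may not
    # exactly match, the LIBLINEAR command-line defaults.
    clf = LogisticRegression(solver="liblinear")
    clf.fit(X[labeled_idx], y[labeled_idx])
    return accuracy_score(y[test_idx], clf.predict(X[test_idx]))
```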