New Balanced Active Learning Model and Optimization Algorithm

Authors: Xiaoqian Wang, Yijun Huang, Ji Liu, Heng Huang

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we conduct experiments to evaluate our method. We select a subset of samples from the training data to query labels and then construct classification models according to these labeled data. The goal is to pick out the most representative samples such that the constructed classification model maintains high discriminative power.
Researcher Affiliation | Academia | 1 Department of Electrical and Computer Engineering, University of Pittsburgh, PA 15261, USA; 2 Department of Computer Science, University of Rochester, NY 14627, USA
Pseudocode | Yes | Algorithm 1: The General Framework of FISTA
Open Source Code | No | The paper provides a link to LIBSVM, a third-party toolbox, but does not state that the authors' own code is publicly available.
Open Datasets | Yes | Aggregation [Gionis et al., 2007], Binalpha, Compound [Zahn, 1971], R15 [Veenman et al., 2002], Breast Cancer, and Seeds. The last two datasets are downloaded from the UCI repository [Lichman, 2013].
Dataset Splits | Yes | In the active learning experiments, we randomly pick out half of the data for training while the other half for testing. ... We use two-fold cross validation and record the average classification accuracy among the two repetitions.
Hardware Specification | No | The paper does not provide any specific hardware details used for running its experiments.
Software Dependencies | No | The paper mentions using the 'libsvm toolbox' but does not provide specific version numbers for it or any other software dependencies.
Experiment Setup | Yes | For methods involving hyper-parameters, i.e., λ for QUIRE in Eq. (5) of [Huang et al., 2010], γ for RRSS in Eq. (6) of [Nie et al., 2013], and λ in Eq. (3) of our method, we tune the hyper-parameters in the range of {10^-3, 10^-2, ..., 10^3}. For K-means clustering, we set the number of clusters as the ground truth. We use 100 random initializations for K-means and retain the best result among these 100 repetitions with respect to the K-means objective function value. For our method, we use the K-means clustering result as the group allocation of training data.
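The "Research Type" row above describes a pool-based active-learning protocol: a subset of training samples is selected for label queries, and a classifier is then fit on those labeled points. A minimal, generic sketch of such a loop is given below, assuming scikit-learn's SVC; the `select_batch` criterion shown is plain margin-based uncertainty sampling, used only as a placeholder rather than the paper's balanced representativeness criterion, and all function names are illustrative.

```python
import numpy as np
from sklearn.svm import SVC

def select_batch(clf, X_pool, pool_idx, batch_size):
    """Placeholder query criterion: pick the pool points the current
    classifier is least confident about (NOT the paper's balanced criterion)."""
    margins = np.abs(clf.decision_function(X_pool[pool_idx]))
    if margins.ndim > 1:                      # multi-class: use the largest one-vs-rest score
        margins = margins.max(axis=1)
    return pool_idx[np.argsort(margins)[:batch_size]]

def active_learning_loop(X_train, y_train, X_test, y_test,
                         n_init=10, batch_size=5, n_rounds=10):
    # Assumes the initial random sample covers at least two classes.
    rng = np.random.default_rng(0)
    labeled = list(rng.choice(len(X_train), size=n_init, replace=False))
    accuracies = []
    for _ in range(n_rounds):
        clf = SVC(kernel="linear").fit(X_train[labeled], y_train[labeled])
        accuracies.append(clf.score(X_test, y_test))
        pool = np.setdiff1d(np.arange(len(X_train)), labeled)
        if len(pool) == 0:
            break
        new = select_batch(clf, X_train, pool, batch_size)
        labeled.extend(new.tolist())          # "query" labels for the newly selected points
    return accuracies
```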
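The pseudocode the paper reports is "Algorithm 1: The General Framework of FISTA". The following is a textbook-style FISTA sketch for composite minimization min_x f(x) + g(x), assuming an L-Lipschitz gradient for f and an available proximal operator for g; it follows the standard Beck-Teboulle updates and is not a transcription of the paper's Algorithm 1.

```python
import numpy as np

def fista(grad_f, prox_g, x0, lipschitz, n_iter=200):
    """Generic FISTA for min_x f(x) + g(x).

    grad_f(x)    -- gradient of the smooth term f
    prox_g(v, t) -- proximal operator of g with step size t
    lipschitz    -- Lipschitz constant L of grad_f (step size is 1/L)
    """
    x_prev = x0.copy()
    y = x0.copy()
    t = 1.0
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        x = prox_g(y - step * grad_f(y), step)            # proximal gradient step
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0  # momentum coefficient
        y = x + ((t - 1.0) / t_next) * (x - x_prev)        # Nesterov-style extrapolation
        x_prev, t = x, t_next
    return x_prev

# Example (lasso): f(x) = 0.5 * ||Ax - b||^2, g(x) = lam * ||x||_1
#   grad_f = lambda x: A.T @ (A @ x - b)
#   prox_g = lambda v, s: np.sign(v) * np.maximum(np.abs(v) - lam * s, 0.0)
#   L      = np.linalg.norm(A, 2) ** 2
```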
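The "Dataset Splits" row quotes a random 50/50 train/test split evaluated with two-fold cross validation, with accuracy averaged over the two repetitions. A minimal sketch of just that evaluation protocol is below; scikit-learn's libsvm-backed SVC stands in for the LIBSVM toolbox the paper cites, and the active-learning query step within each training half is omitted.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.svm import SVC

def two_fold_accuracy(X, y, seed=0):
    """Split the data into two random halves; each half serves once as the
    training set and once as the test set, and accuracy is averaged."""
    accs = []
    for train_idx, test_idx in KFold(n_splits=2, shuffle=True,
                                     random_state=seed).split(X):
        clf = SVC(kernel="linear").fit(X[train_idx], y[train_idx])
        accs.append(clf.score(X[test_idx], y[test_idx]))
    return float(np.mean(accs))
```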
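The "Experiment Setup" row reports hyper-parameters tuned over {10^-3, ..., 10^3}, K-means with the cluster count fixed to the ground-truth number of classes, and 100 random initializations with the best objective value retained. A hedged sketch of that configuration, assuming scikit-learn's KMeans (whose n_init argument keeps the best of n restarts by inertia):

```python
import numpy as np
from sklearn.cluster import KMeans

# Hyper-parameter grid {10^-3, 10^-2, ..., 10^3} used to tune lambda / gamma.
param_grid = np.logspace(-3, 3, num=7)

def group_allocation(X_train, n_true_classes, seed=0):
    """K-means with the cluster count fixed to the ground-truth class count and
    100 random restarts; the run with the lowest objective (inertia) is kept."""
    km = KMeans(n_clusters=n_true_classes, n_init=100, random_state=seed)
    return km.fit_predict(X_train)   # cluster labels used as the group allocation
```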