On Gleaning Knowledge from Multiple Domains for Active Learning

Authors: Zengmao Wang, Bo Du, Lefei Zhang, Liangpei Zhang, Ruimin Hu, Dacheng Tao

IJCAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The proposed method is verified on newsgroup and handwritten digit recognition tasks, where it outperforms the state-of-the-art methods. We tested the proposed method on 20 tasks in newsgroup and handwritten digit recognition.
Researcher Affiliation | Collaboration | 1) School of Computer, Wuhan University; 2) State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing; 3) National Engineering Research Center for Multimedia Software, School of Computer, Wuhan University; 4) UBTech Sydney AI Institute, The School of Information Technologies, The University of Sydney
Pseudocode | No | The paper provides mathematical formulations and descriptions of its algorithm, but it does not include a dedicated section or figure explicitly labeled "Pseudocode" or "Algorithm" in a structured, code-like format.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | The 20 Newsgroups data set consists of a collection of approximately 20,000 newsgroup documents, partitioned into 20 different categories. The USPS and MNIST handwritten digit data sets [Long et al., 2014] represent the various fonts of each digit from 1 to 10 using 256-dimension features normalized to the range [0, 1]. (A hedged loading sketch follows the table.)
Dataset Splits | Yes | The positive samples in each task were randomly divided into three parts: 50% for testing, one sample as the initial labeled data, and the remaining (nearly 50%) as the unlabeled pool for active learning. The negative samples in each task were also randomly divided into three parts: 20% for testing, 60% as the initial labeled data, and the remaining 20% as the unlabeled pool for active learning. (A sketch of this split follows the table.)
Hardware Specification | No | The paper does not specify any particular hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies | No | For the classifier, without loss of generality, a support vector machine (SVM) with a Gaussian kernel was adopted via the LibSVM tool [Chang and Lin, 2011]. While LibSVM is mentioned, a specific version number is not provided.
Experiment Setup | Yes | There are two important parameters in the SVM classifier: the kernel width parameter g and the penalty parameter C. For convenience, we set them to the empirical values C = 100 and g = 0.05. For a fair comparison, we adopted the same kernel parameter in all the methods. For the methods with a tradeoff parameter, we fixed it at 10, as in [Huang and Chen, 2016]. At each iteration, five samples were selected for labeling, and the loop was stopped after 20 iterations. (A sketch of this loop follows the table.)
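
A minimal loading sketch for the 20 Newsgroups corpus referenced in the Open Datasets row. The paper does not describe its loading or featurization pipeline, so the scikit-learn fetcher and the TF-IDF features here are assumptions, not the authors' setup.

    # Sketch: load 20 Newsgroups (~20,000 documents across 20 categories).
    # Assumption: scikit-learn's fetcher and TF-IDF features; the paper
    # does not specify its preprocessing.
    from sklearn.datasets import fetch_20newsgroups
    from sklearn.feature_extraction.text import TfidfVectorizer

    newsgroups = fetch_20newsgroups(subset="all")
    X = TfidfVectorizer(max_features=5000).fit_transform(newsgroups.data)
    y = newsgroups.target
    print(X.shape, len(set(y)))  # about (18846, 5000); 20 categories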
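
A sketch of the per-task partition described in the Dataset Splits row, assuming uniform random shuffling (the paper does not state the sampling mechanism); split_task and its index arguments are hypothetical names.

    import numpy as np

    def split_task(pos_idx, neg_idx, rng):
        # Positive class: 50% test, 1 initially labeled sample, rest unlabeled.
        pos = rng.permutation(pos_idx)
        half = len(pos) // 2
        pos_test, pos_lab, pos_unlab = pos[:half], pos[half:half + 1], pos[half + 1:]
        # Negative class: 20% test, 60% initially labeled, 20% unlabeled.
        neg = rng.permutation(neg_idx)
        n20, n60 = int(0.2 * len(neg)), int(0.6 * len(neg))
        neg_test, neg_lab, neg_unlab = neg[:n20], neg[n20:n20 + n60], neg[n20 + n60:]
        labeled = np.concatenate([pos_lab, neg_lab])
        unlabeled = np.concatenate([pos_unlab, neg_unlab])
        test = np.concatenate([pos_test, neg_test])
        return labeled, unlabeled, test

    rng = np.random.default_rng(0)
    labeled, unlabeled, test = split_task(np.arange(100), np.arange(100, 600), rng)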
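
A sketch of the experimental loop in the Experiment Setup row, using scikit-learn's SVC (RBF kernel, gamma = 0.05, C = 100) in place of LibSVM. The query rule below is plain margin-based uncertainty sampling, a stand-in for the paper's multi-domain selection criterion, which is not reproduced here.

    import numpy as np
    from sklearn.svm import SVC

    def run_active_learning(X, y, labeled, unlabeled, n_iters=20, batch=5):
        # Settings from the paper: C = 100, g = 0.05, five queries per
        # iteration, 20 iterations. The selection rule (distance to the
        # decision boundary) is an assumption, not the paper's criterion.
        labeled, unlabeled = list(labeled), list(unlabeled)
        clf = SVC(kernel="rbf", gamma=0.05, C=100)
        for _ in range(n_iters):
            clf.fit(X[labeled], y[labeled])
            margins = np.abs(clf.decision_function(X[unlabeled]))
            picks = np.argsort(margins)[:batch]  # closest to the boundary
            for i in sorted(picks, reverse=True):
                labeled.append(unlabeled.pop(i))
        return clf, labeled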