Cost-Effective Active Learning for Hierarchical Multi-Label Classification

Authors: Yi-Fan Yan, Sheng-Jun Huang

IJCAI 2018

Reproducibility variables, results, and supporting LLM responses:

Research Type: Experimental
  "Experimental results validate the effectiveness of both the proposed criterion and the selection method. Our empirical study on multiple datasets and with different cost settings demonstrates the advantage of the proposed approach."

Researcher Affiliation: Academia
  "Yi-Fan Yan and Sheng-Jun Huang, College of Computer Science & Technology, Nanjing University of Aeronautics & Astronautics; Collaborative Innovation Center of Novel Software Technology and Industrialization; {yanyifan7,huangsj}@nuaa.edu.cn"

Pseudocode: Yes
  "Algorithm 1 The HALC Algorithm"

Open Source Code: No
  The paper does not provide any links to source code or an explicit statement about code availability.

Open Datasets: Yes
  "We perform the experiments on four datasets. The statistical information of these datasets are summarized in Table 1... we convert the DAG structure of Yeast-go [Barutcuoglu et al., 2006] and Scop-go [Clare, 2003] to hierarchical tree structure by removing subtrees with multiple parents."

Dataset Splits: No
  "On each data set, we randomly divide it into two parts with 70% as training set and 30% as test set. In the training set, we randomly sample 5% as initial labeled data and the rest as unlabeled pool for active learning." The paper specifies training and test sets but does not mention a separate validation split.

Hardware Specification: No
  The paper does not report any hardware details (e.g., GPU/CPU models, memory) used to run the experiments.

Software Dependencies: No
  "We use one-vs-all linear SVM as baseline classification model for each label. LibSVM is used to implement the classifier in our experiments [Chang and Lin, 2011]." The paper names LibSVM but provides no version number, and lists no other software with specific versions.

Experiment Setup: Yes
  "Labels at deeper level cost higher, we manually set the cost of labels at 1:5:10:15 for 4-level hierarchy, and 1:5:10 for 3-level hierarchy. Labels on the same level have the same cost. To examine the robustness of the proposed method to different cost ratios, we further perform experiments in 3 different cost settings, i.e., 1:5:10:15, 1:3:6:9 and 1:2:3:4."
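The split protocol quoted under Dataset Splits (70% train / 30% test, with 5% of the training set as the initial labeled pool) can be sketched as below. The function name, array handling, and random seed are illustrative assumptions, not details from the paper.

```python
import numpy as np

def active_learning_split(n_samples, train_frac=0.7, init_labeled_frac=0.05, seed=0):
    """Randomly partition sample indices into an initial labeled set,
    an unlabeled pool, and a held-out test set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(train_frac * n_samples)          # 70% for training
    train_idx, test_idx = idx[:n_train], idx[n_train:]
    n_init = int(init_labeled_frac * n_train)      # 5% of training as initial labels
    return train_idx[:n_init], train_idx[n_init:], test_idx

labeled, unlabeled, test = active_learning_split(1000)
print(len(labeled), len(unlabeled), len(test))  # 35 665 300
```

Note there is no validation split, consistent with the report's observation that the paper specifies only training and test sets.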
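The level-dependent annotation costs in the Experiment Setup row (e.g., 1:5:10:15 for a 4-level hierarchy) amount to a simple per-level lookup. The dictionary and function names below are assumptions for illustration; the paper only states the ratios.

```python
# Annotation-cost ratios per hierarchy level, as listed in the setup row.
# Level 0 is the shallowest (cheapest) level.
COST_SETTINGS = {
    "4-level": {"1:5:10:15": [1, 5, 10, 15],
                "1:3:6:9":   [1, 3, 6, 9],
                "1:2:3:4":   [1, 2, 3, 4]},
    "3-level": {"1:5:10":    [1, 5, 10]},
}

def label_cost(level, ratios):
    """Cost of querying a label at the given hierarchy depth.
    All labels on the same level share the same cost."""
    return ratios[level]

ratios = COST_SETTINGS["4-level"]["1:5:10:15"]
print([label_cost(level, ratios) for level in range(4)])  # [1, 5, 10, 15]
```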
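The baseline model in the Software Dependencies row is a one-vs-all linear SVM per label, implemented in the paper with LibSVM. As a rough stand-in, the same setup can be sketched with scikit-learn's LinearSVC wrapped in OneVsRestClassifier; the synthetic data and multilabel construction here are assumptions, not the paper's datasets.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Synthetic stand-in data: 40 samples, 5 features, 3 binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
Y = (X[:, :3] > 0).astype(int)  # multilabel indicator matrix

# One linear SVM trained independently per label (one-vs-all).
clf = OneVsRestClassifier(LinearSVC()).fit(X, Y)
print(clf.predict(X[:2]).shape)  # one binary prediction per label: (2, 3)
```

In a hierarchical multi-label setting, the per-label predictions would additionally be constrained so that a positive child label implies a positive parent, which the sketch above does not enforce.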