Learning from Weak-Label Data: A Deep Forest Expedition

Authors: Qian-Wei Wang, Liang Yang, Yu-Feng Li

AAAI 2020, pp. 6251-6258 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that the proposed LCForest method compares favorably against the existing state-of-the-art multi-label and weak-label learning methods. In this section, we first introduce the experimental setup and then present the evaluation of our proposal compared to several state-of-the-art algorithms on a number of real-world tasks.
Researcher Affiliation | Academia | Qian-Wei Wang, Liang Yang, Yu-Feng Li; National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210023, China; {wangqw, yangl, liyf}@lamda.nju.edu.cn
Pseudocode | Yes | Algorithm 1 (Label complement in each cascade layer) and Algorithm 2 (Train LCForest); a hedged sketch of the cascade layer appears after the table.
Open Source Code | No | The paper does not explicitly provide information about open-source code availability, nor does it provide a link to a code repository.
Open Datasets | Yes | The yeast data set (Elisseeff and Weston 2001) is a gene function classification data set...; the TMC data set (Srivastava and Zane-Ulman 2005)...; the Scene data set (Boutell et al. 2004) is a labeled image data set...; the Medical data set (Read et al. 2011) contains 978 instances and 1449 features.
Dataset Splits | Yes | Specifically, in each layer, we split the training set into 5 folds for 5-fold cross-validation. We compared all methods using the same setting. In the rest of this section, we evaluated the performance by performing 5-fold cross-validation.
Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | As shown in Alg. 2, hyper-parameters employed by LCForest are set as follows: T = 10, K = 5, θ = 0.4. For the configuration of random forests, we used one random forest and one completely random forest to encourage diversity, and each forest contains 200 decision trees. For the configuration of the TIcE method, the max-bepp parameter is k = 5, the maximum number of splits is M = 500, and the minimum number of total examples in a subset is minT = 5. (Similar configurations are reported for the comparison methods; see the driver sketch after the table.)
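
To make the Pseudocode, Dataset Splits, and Experiment Setup rows concrete, below is a minimal Python sketch of one label-complement cascade layer in the spirit of Algorithm 1, using the reported settings (200 trees per forest, 5-fold in-layer cross-validation, θ = 0.4). It assumes scikit-learn; the name `cascade_layer`, the use of `ExtraTreesClassifier` with `max_features=1` as a stand-in for a completely random forest, and the per-label binary decomposition are our assumptions, not the authors' released implementation.

```python
import numpy as np
from sklearn.base import clone
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.model_selection import KFold

def cascade_layer(X, Y, theta=0.4, n_trees=200, n_folds=5, seed=0):
    """One label-complement cascade layer (illustrative sketch of Alg. 1).

    X: (n, d) feature matrix; Y: (n, q) observed 0/1 weak-label matrix.
    Returns features augmented with class vectors and a complemented Y.
    The sketch assumes every label column has both classes in each fold.
    """
    # Per the paper's setup: one random forest and one completely random
    # forest (approximated here by extremely randomized trees), 200 trees each.
    forests = [
        RandomForestClassifier(n_estimators=n_trees, random_state=seed),
        ExtraTreesClassifier(n_estimators=n_trees, max_features=1,
                             random_state=seed),
    ]
    n, q = Y.shape
    class_vecs = []
    for forest in forests:
        probs = np.zeros((n, q))
        kf = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
        for train_idx, val_idx in kf.split(X):
            for j in range(q):  # one binary task per label
                clf = clone(forest).fit(X[train_idx], Y[train_idx, j])
                probs[val_idx, j] = clf.predict_proba(X[val_idx])[:, 1]
        class_vecs.append(probs)
    avg = np.mean(class_vecs, axis=0)
    # Label complement: promote a missing (0) entry to 1 when the layer's
    # cross-validated confidence clears the paper's threshold theta = 0.4.
    Y_comp = np.where((Y == 0) & (avg >= theta), 1, Y)
    # Class vectors are concatenated to the input features for the next layer.
    X_aug = np.hstack([X] + class_vecs)
    return X_aug, Y_comp
```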
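
The Experiment Setup row fixes T = 10, K = 5, and θ = 0.4; a possible driver loop wiring these together with the layer sketch above, in the spirit of Algorithm 2, follows. The stopping rule (halt once a layer complements no new labels) is our simplification; the paper grows the cascade adaptively.

```python
import numpy as np

def train_lcforest(X, Y, T=10, K=5, theta=0.4):
    """Grow at most T cascade layers (our reading of Algorithm 2),
    each running the K-fold label-complement step sketched above
    with confidence threshold theta."""
    for t in range(T):
        X, Y_new = cascade_layer(X, Y, theta=theta, n_folds=K, seed=t)
        # Simplified stand-in stopping rule: halt once a layer
        # complements no new labels (the paper's rule is adaptive;
        # this check is our assumption).
        if np.array_equal(Y_new, Y):
            break
        Y = Y_new
    return X, Y
```

Here K = 5 matches the in-layer 5-fold split reported under Dataset Splits.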