Fast Multi-Instance Multi-Label Learning

Authors: Sheng-Jun Huang, Wei Gao, Zhi-Hua Zhou

AAAI 2014

Variable: Research Type
Result: Experimental
LLM Response: Experiments show that the performance of MIMLfast is highly competitive to state-of-the-art techniques, whereas its time cost is much less; particularly, on a data set with 30K bags and 270K instances, where none of existing approaches can return results in 24 hours, MIMLfast takes only 12 minutes.

Variable: Researcher Affiliation
Result: Academia
LLM Response: Sheng-Jun Huang, Wei Gao, Zhi-Hua Zhou; National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; {huangsj, gaow, zhouzh}@lamda.nju.edu.cn

Variable: Pseudocode
Result: Yes
LLM Response: The pseudo code of MIMLfast is presented in Algorithm 1.

Variable: Open Source Code
Result: No
LLM Response: The paper does not contain an explicit statement or a link indicating that the source code for the MIMLfast methodology is publicly available.

Variable: Open Datasets
Result: Yes
LLM Response: Experiments are performed on 6 moderate-sized data sets, including Letter Frost (Briggs, Fern, and Raich 2012), Letter Carroll (Briggs, Fern, and Raich 2012), MSRC v2 (Winn, Criminisi, and Minka 2005), Reuters (Sebastiani 2002), Bird Song (Briggs, Fern, and Raich 2012) and Scene (Zhou and Zhang 2007), and 2 large data sets, including Corel5K (Duygulu et al. 2002) and MSRA (Li, Wang, and Hua 2009).

Variable: Dataset Splits
Result: Yes
LLM Response: For each data set, 2/3 of the data are randomly sampled for training, and the remaining examples are taken as the test set. We repeat the random data partition thirty times, and report the average results over the thirty repetitions. In our experiments, we sample a small validation set from the training data, and stop the training once the ranking loss does not decrease on the validation set. The parameters are selected by 3-fold cross validation on the training data with regard to ranking loss.

Variable: Hardware Specification
Result: Yes
LLM Response: All the experiments are performed on a machine with 16 2.60 GHz CPUs and 32GB main memory.

Variable: Software Dependencies
Result: No
LLM Response: The paper discusses the use of various algorithms and techniques like SGD, but it does not specify the versions of any programming languages, libraries, or software packages used in the implementation of the experiments.

Variable: Experiment Setup
Result: Yes
LLM Response: The parameters are selected by 3-fold cross validation on the training data with regard to ranking loss. The candidate values for the parameters are as below: m ∈ {50, 100, 200}, C ∈ {1, 5, 10}, K ∈ {1, 5, 10, 15}, γ0 ∈ {0.0001, 0.0005, 0.001, 0.005} and η ∈ {10^-5, 10^-6}.
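The evaluation protocol quoted above (random 2/3 train split, repeated partitions, and parameter selection over the listed candidate values) can be sketched as follows. This is a minimal illustration, not the authors' code: MIMLfast itself is not public, so the scoring function passed to `select_parameters` is a hypothetical stand-in for 3-fold cross-validation with respect to ranking loss.

```python
# Hypothetical sketch of the paper's evaluation protocol. The parameter
# grid below copies the candidate values quoted in the Experiment Setup
# row; everything else (function names, the toy scoring function) is an
# assumption for illustration only.
import itertools
import random

# Candidate parameter values as listed in the paper.
PARAM_GRID = {
    "m":      [50, 100, 200],             # dimension of the shared space
    "C":      [1, 5, 10],                 # constraint parameter
    "K":      [1, 5, 10, 15],             # number of sub-concepts
    "gamma0": [0.0001, 0.0005, 0.001, 0.005],  # initial step size
    "eta":    [1e-5, 1e-6],               # step-size decay
}

def train_test_split(n_bags, train_frac=2/3, rng=None):
    """Randomly assign 2/3 of the bags to training, the rest to test."""
    rng = rng or random.Random()
    idx = list(range(n_bags))
    rng.shuffle(idx)
    cut = int(round(train_frac * n_bags))
    return idx[:cut], idx[cut:]

def grid_settings(grid):
    """Enumerate every combination of candidate parameter values."""
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

def select_parameters(grid, cv_ranking_loss):
    """Pick the setting whose (3-fold CV) ranking loss is lowest."""
    return min(grid_settings(grid), key=cv_ranking_loss)

if __name__ == "__main__":
    train_idx, test_idx = train_test_split(30, rng=random.Random(0))
    # Toy scoring function standing in for real cross-validation.
    best = select_parameters(PARAM_GRID, lambda p: abs(p["m"] - 100))
    print(len(train_idx), len(test_idx), best["m"])
```

In a real run, `cv_ranking_loss` would train MIMLfast on each of 3 folds of the training data and return the average ranking loss; the whole procedure would then be repeated over thirty random train/test partitions, as the Dataset Splits row describes.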