Learning with Marginalized Corrupted Features and Labels Together
Authors: Yingming Li, Ming Yang, Zenglin Xu, Zhongfei Zhang
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive evaluations on three benchmark data sets demonstrate that RMCV achieves superior performance in comparison with state-of-the-art methods. |
| Researcher Affiliation | Academia | Yingming Li, Ming Yang, Zenglin Xu, and Zhongfei (Mark) Zhang; School of Computer Science and Engineering, Big Data Research Center, University of Electronic Science and Technology of China; Department of Computer Science, State University of New York at Binghamton, NY, USA |
| Pseudocode | Yes | Algorithm 1: RMCV Algorithm |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper. |
| Open Datasets | Yes | All data sets are obtained from http://mulan.sourceforge.net/datasets-mlc.html. We have used three multi-label datasets, namely Bibtex, Bookmarks, and Enron, for experimentation purposes. Their statistics are described in Table 2. |
| Dataset Splits | Yes | To find the optimal number of the stacked layers, we perform model selection on a hold-out validation set, adding layers until the F1 score cannot be improved. Since there is no fixed split in the Bookmarks data set in Mulan, we use a fixed training set of 80% of the data, and evaluate the performance of our predictions on the fixed test set of 20% of the data. |
| Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory, or cluster specifications) used for running experiments were mentioned in the paper. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | We follow the setup of (Chen, Zheng, and Weinberger 2013) and weigh each example in a tf-idf-like fashion to give more weight to the losses from rare tags during training. The best performance tends to be achieved by RMCV with blankout corruption at high corruption levels, i.e., when q is at about 0.8. |
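The blankout corruption referenced in the setup row zeroes each feature independently with probability q; under marginalized corrupted features, the expectation over corruptions has the closed form E[x̃] = (1 − q)·x, so no explicit sampling is needed at training time. Below is a minimal sketch of this idea, assuming a NumPy environment; the function names are illustrative and not taken from the authors' code.

```python
import numpy as np

def blankout_corrupt(X, q, rng):
    """Blankout (dropout-style) corruption: each feature is
    independently set to zero with probability q."""
    mask = rng.random(X.shape) >= q  # keep a feature with prob 1 - q
    return X * mask

def marginalized_blankout_mean(X, q):
    """Expected corrupted features under blankout: E[x_tilde] = (1 - q) * x.
    Marginalizing over the corruption distribution replaces sampling."""
    return (1.0 - q) * X

rng = np.random.default_rng(0)
X = np.ones((1000, 5))
q = 0.8  # the high corruption level reported to work best for RMCV

Xc = blankout_corrupt(X, q, rng)
# The empirical mean of explicitly corrupted copies approaches
# the closed-form marginalized expectation (1 - q) * x = 0.2.
print(np.allclose(Xc.mean(axis=0),
                  marginalized_blankout_mean(X, q).mean(axis=0),
                  atol=0.05))
```

This illustrates why marginalization is attractive: averaging over many sampled corruptions converges to a quantity available in closed form, which the marginalized-corruption framework plugs directly into the loss.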