Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Semi-Supervised Multi-Modal Learning with Incomplete Modalities

Authors: Yang Yang, De-Chuan Zhan, Xiang-Rong Sheng, Yuan Jiang

IJCAI 2018

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	experiments on 15 real world multi-modal datasets validate the effectiveness of our method.
Researcher Affiliation	Academia	Yang Yang, De-Chuan Zhan, Xiang-Rong Sheng, Yuan Jiang National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China EMAIL
Pseudocode	No	The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code	No	The paper does not provide any explicit statements about the availability of open-source code for the described methodology, nor does it include any links to code repositories.
Open Datasets	Yes	Data Sets: In this paper, we conduct experiments on 7 two modalities datasets and 8 multiple modalities datasets. In detail, two modal datasets include: Movie dataset is extracted from IMDb, which has 617 movies of 17 genres, and there are two data matrices describing the same movies, i.e., keywords matrix and actors matrix. The main goal is to find the genre of the movies; Citeseer dataset [Sen et al., 2008] is originally made of 4 modalities, i.e., content, inbound, outbound, cites, on the same documents. We follow [Bisson and Grimal, 2012] to choose the content and cites modalities in our experiment. Web KB dataset [Sen et al., 2008] contains webpages collected from 4 universities: Cornell, Texas, Wisconsin and Washington, which have 5 categories, i.e., student, project, course, stuff and faculty. Multiple modal datasets include: News Group dataset [Bisson and Grimal, 2012] is of 6 groups extracted from the 20 Newsgroup datasets, i.e., M2, M5, M10, NG1, NG2, NG3. Every group contains 10 subsets, and we choose the first subset for all 6 groups in our experiment, i.e., News-M2, News-M5, News-M10, News-NG1, News-NG2 and News-NG3, respectively. 3-Source Text data (3Sources) (http://mlg.ucd.ie/datasets/3sources.html) is collected from three online news sources: BBC, Reuters, and Guardian.
Dataset Splits	Yes	For all datasets, we randomly select 70% for training and the remains are for test. For both the training set and test set. As in [Li et al., 2014], in each split, we randomly select 10% to 90% examples, with 20% as interval, as homogeneous examples with complete modality, and the remains are incomplete instances, i.e., in Web KB datasets, they are described by either the content or the citation modality, but not both. For all the examples, we randomly choose 30% as the labeled data, and the left 70% as unlabeled ones. In the training phase, the parameters λ1 and λ2 are selected by 5-fold cross validation from {10^-5, 10^-4, ..., 10^4, 10^5} with further splittings on the training datasets only, i.e., there is no overlap between the test set and the validation set for parameter picking up.
Hardware Specification	No	The paper does not provide any specific details about the hardware used to run the experiments.
Software Dependencies	No	The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup	Yes	In the training phase, the parameters λ1 and λ2 are selected by 5-fold cross validation from {10^-5, 10^-4, ..., 10^4, 10^5} with further splittings on the training datasets only, i.e., there is no overlap between the test set and the validation set for parameter picking up. Empirically, when the variations between the objective value of Eq. 9 is less than 10^-6 in iteration, we treat SLIM converged.
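
The split protocol and setup quoted in the table might be sketched roughly as follows. This is an illustrative sketch and not the authors' code: the function names (`split_indices`, `has_converged`) are hypothetical, the 30% labeled fraction is assumed to be drawn from the training portion, the hyperparameter grid is assumed to cover powers of ten from 10^-5 to 10^5, and the convergence tolerance is assumed to be 10^-6.

```python
import numpy as np

# Assumed reading of the quoted setup: complete-modality ratios from 10% to 90%
# in steps of 20%, and a grid of powers of ten for lambda_1 and lambda_2.
COMPLETE_RATIOS = [0.1, 0.3, 0.5, 0.7, 0.9]
LAMBDA_GRID = [10.0 ** k for k in range(-5, 6)]
TOL = 1e-6  # assumed convergence tolerance on the objective value


def split_indices(n, complete_ratio, seed=0):
    """Return index sets for one random split of n examples (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_train = int(round(0.7 * n))  # 70% for training, the rest for test
    parts = {"train": perm[:n_train], "test": perm[n_train:]}

    out = {}
    for name, idx in parts.items():
        # Within each part, a fraction keeps all modalities; the rest are incomplete.
        shuffled = rng.permutation(idx)
        n_complete = int(round(complete_ratio * len(shuffled)))
        out[name] = {
            "complete": shuffled[:n_complete],
            "incomplete": shuffled[n_complete:],
        }

    # Assumption: 30% of the training examples are labeled, the rest unlabeled.
    train = rng.permutation(parts["train"])
    n_labeled = int(round(0.3 * len(train)))
    out["train"]["labeled"] = train[:n_labeled]
    out["train"]["unlabeled"] = train[n_labeled:]
    return out


def has_converged(prev_obj, curr_obj, tol=TOL):
    """Stop when the objective changes by less than tol between iterations."""
    return abs(prev_obj - curr_obj) < tol
```

Looping `split_indices` over `COMPLETE_RATIOS` would give one split per completeness level; selecting λ1 and λ2 by 5-fold cross validation would then use further splits of the training indices only, so the test indices never enter model selection.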