Scalable Generative Models for Multi-label Learning with Missing Labels
Authors: Vikas Jain, Nirbhay Modhe, Piyush Rai
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We report both quantitative and qualitative results for our framework on several benchmark data sets, comparing it with a number of state-of-the-art methods. |
| Researcher Affiliation | Academia | Department of Computer Science and Engineering, IIT Kanpur, Kanpur 208016, UP, India. |
| Pseudocode | No | The paper describes the inference procedures (Gibbs Sampling, EM) in detail using equations and prose, but does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that source code for the methodology is openly available. |
| Open Datasets | Yes | The statistics of data sets we use in our experiments are summarized in Table 1: Bibtex (N=4880, Ntest=2515, D=1836, L=159); Mediamill (N=30993, Ntest=12914, D=120, L=101); Eurlex-4K (N=15539, Ntest=3809, D=5000, L=3993); Movielens (N=4000, Ntest=2040, D=29, L=3952); RCV (N=623847, Ntest=155962, D=47236, L=2456); Wikipedia (N=14146, Ntest=6616, D=101938, L=30938). |
| Dataset Splits | No | Table 1 provides N (number of training examples) and Ntest (number of test examples) for each dataset, indicating a train-test split. The paper states, 'We select the other two hyperparameters λw and K (number of latent factors) using cross-validation.' However, it does not specify the explicit details of the validation splits (e.g., exact percentages, specific counts for a validation set, or the number of folds for cross-validation) needed for reproduction. |
| Hardware Specification | Yes | ...we only ran our model in the online setting on a moderate 4 core processor with 8GB RAM. |
| Software Dependencies | No | The paper discusses various models and algorithms but does not specify any software dependencies (e.g., programming languages, libraries, frameworks) with version numbers. |
| Experiment Setup | Yes | For our model, we set the hyperparameters λu and λv to 0.001, which works well on all the data sets we experimented with. We select the other two hyperparameters λw and K (number of latent factors) using cross-validation. For the conjugate gradient (CG) method used by the M step of our inference algorithm, we run 5 iterations, which was found to be sufficient. For online EM, for each data set, we use mini-batch sizes of 1024 and 4096 and report the one which gives better results. (Hedged sketches of this tuning protocol and of the online-EM mini-batch loop appear below the table.) |
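
The Experiment Setup and Dataset Splits rows quote a concrete tuning protocol: λu = λv = 0.001 fixed, 5 CG iterations in the M step, and λw and K chosen by cross-validation. Since no source code is released, the sketch below is only a plausible reconstruction of that selection loop; the grid values, the 5-fold choice, and the `train_and_score` stand-in are assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the hyperparameter protocol reported in the paper:
# lambda_u = lambda_v = 1e-3 are fixed; lambda_w and K are selected by
# cross-validation. `train_and_score` is a placeholder for the (unreleased)
# EM training routine.
import itertools
import numpy as np
from sklearn.model_selection import KFold

FIXED = {"lambda_u": 1e-3, "lambda_v": 1e-3, "cg_iters": 5}  # values from the paper
GRID = {"lambda_w": [1e-3, 1e-2, 1e-1, 1.0],                 # grid values are assumptions
        "K": [50, 100, 200]}

def train_and_score(X_tr, Y_tr, X_va, Y_va, params):
    """Placeholder: train the model with `params`, return a validation score
    (e.g., AUC). Replace with the actual EM training and evaluation."""
    rng = np.random.default_rng(0)
    return rng.random()  # dummy score so the sketch runs end to end

def select_by_cv(X, Y, n_splits=5):  # fold count is an assumption
    best_score, best_params = -np.inf, None
    for lam_w, k in itertools.product(GRID["lambda_w"], GRID["K"]):
        params = dict(FIXED, lambda_w=lam_w, K=k)
        kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
        scores = [train_and_score(X[tr], Y[tr], X[va], Y[va], params)
                  for tr, va in kf.split(X)]
        if np.mean(scores) > best_score:
            best_score, best_params = np.mean(scores), params
    return best_params

X, Y = np.random.rand(100, 20), np.random.randint(0, 2, (100, 5))
print(select_by_cv(X, Y))
```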
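The online EM result quoted above (mini-batches of 1024 or 4096, reporting the better of the two) can be illustrated with a generic stochastic-EM skeleton in the style of Cappé and Moulines. The sufficient-statistic and M-step functions below are placeholders, as is the 1/t step-size schedule; the paper's actual update equations are not reproduced here.

```python
# Generic online (stochastic) EM skeleton with mini-batches. The E-step
# statistics and M-step update are stand-ins, not the paper's updates.
import numpy as np

def e_step_stats(batch, params):
    """Placeholder E-step: return mini-batch sufficient statistics."""
    return batch.mean(axis=0)  # stand-in statistic

def m_step(stats):
    """Placeholder M-step: map running statistics to new parameters."""
    return {"mean": stats}

def online_em(X, batch_size=1024, n_epochs=1):
    stats, params, t = None, {"mean": np.zeros(X.shape[1])}, 0
    for _ in range(n_epochs):
        for start in range(0, X.shape[0], batch_size):
            batch = X[start:start + batch_size]
            t += 1
            rho = 1.0 / t  # Robbins-Monro step size (assumed schedule)
            s_hat = e_step_stats(batch, params)
            stats = s_hat if stats is None else (1 - rho) * stats + rho * s_hat
            params = m_step(stats)
    return params

X = np.random.rand(10000, 8)
print(online_em(X, batch_size=1024))  # paper tries 1024 and 4096, keeps the better
```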