Distributional Semantics Meets Multi-Label Learning

Authors: Vivek Gupta, Rahul Wadbude, Nagarajan Natarajan, Harish Karnick, Prateek Jain, Piyush Rai (pp. 3747-3754)

AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We demonstrate the effectiveness of our approach through an extensive set of experiments on a variety of benchmark datasets, and show that the proposed models perform favorably as compared to state-of-the-art methods for large-scale multi-label learning.
Researcher Affiliation Collaboration (1) School of Computing, University of Utah; (2) Computer Science Department, IIT Kanpur; (3) Microsoft Research Lab, Bangalore
Pseudocode Yes Our algorithm for predicting the labels of a new instance is identical to that of SLEEC and is presented for convenience in Algorithm 1. ... Algorithm 2 Learning embeddings via SPPMI factorization (EXMLDS1). ... Algorithm 3 Learning joint label and instance embeddings via SPPMI factorization (EXMLDS3). ... Algorithm 4 Prediction Algorithm with Label Correlations (EXMLDS3 prediction). ... Algorithm 5 Learning joint instance embeddings and regression via gradient descent (EXMLDS4). (Hedged sketches of the SPPMI-embedding and nearest-neighbour-prediction steps appear after this table.)
Open Source Code No Source code will be made available to public later.
Open Datasets Yes We conduct experiments on commonly used benchmark datasets from the extreme multi-label classification repository provided by the authors of (Prabhu and Varma 2014; Bhatia et al. 2015); these datasets are pre-processed, and have prescribed train-test splits. ... Datasets and Benchmark: https://bit.ly/2IDtQbS
Dataset Splits Yes We conduct experiments on commonly used benchmark datasets from the extreme multi-label classification repository provided by the authors of (Prabhu and Varma 2014; Bhatia et al. 2015) 2; these datasets are pre-processed, and have prescribed train-test splits. ... For small datasets, we fix negative sample size to 15 and number of iterations to 35 during neural network training, tuned based on a separate validation set. For large datasets, we fix negative sample size to 2 and number of iterations to 5, tuned on a validation set.
Hardware Specification No The paper mentions 'a Linux machine with 40 cores and 128 GB RAM' but does not specify the exact CPU model or other detailed hardware components required for replication.
Software Dependencies No The paper states 'Learning Algorithms 2 and 3 are implemented partly in Python and partly in MATLAB' but does not provide specific version numbers for these software packages or any other dependencies.
Experiment Setup Yes For small datasets, we fix negative sample size to 15 and number of iterations to 35 during neural network training, tuned based on a separate validation set. For large datasets, we fix negative sample size to 2 and number of iterations to 5, tuned on a validation set. ... We use the same embedding dimensionality, preserve the same number of nearest neighbors for learning embeddings as well as at prediction time, and the same number of data partitions used in SLEEC (Bhatia et al. 2015) for our methods EXMLDS1 and EXMLDS2. ... embedding size as 50, number of learners for each cluster as 15, number of nearest neighbors as 10, number of embedding and partitioning iterations both 100, gamma as 1, label normalization as true, number of threads as 32.
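
The Pseudocode row above cites Algorithm 2, learning embeddings via SPPMI factorization (EXMLDS1). Below is a minimal sketch of the general SPPMI-factorization idea only, assuming a dense 0/1 label matrix Y of shape (n instances x L labels) and NumPy; the function name sppmi_embeddings and its defaults are hypothetical, and this is not the authors' released implementation.

    import numpy as np

    def sppmi_embeddings(Y, dim=50, neg_samples=15):
        """Hypothetical sketch: embed labels by factorizing the shifted
        positive PMI (SPPMI) matrix of label-label co-occurrences."""
        Y = np.asarray(Y, dtype=float)
        C = Y.T @ Y                                    # (L x L) co-occurrence counts
        total = C.sum()
        row = C.sum(axis=1, keepdims=True)             # (L x 1) marginal counts
        col = C.sum(axis=0, keepdims=True)             # (1 x L) marginal counts
        with np.errstate(divide="ignore", invalid="ignore"):
            pmi = np.where(C > 0, np.log((C * total) / (row @ col)), 0.0)
        sppmi = np.maximum(pmi - np.log(neg_samples), 0.0)   # shift by log(#negatives), clip at zero
        U, S, _ = np.linalg.svd(sppmi, full_matrices=False)  # rank-dim factorization via truncated SVD
        return U[:, :dim] * np.sqrt(S[:dim])                  # (L x dim) label embeddings

The shift by log(neg_samples) corresponds to the negative-sample sizes quoted in the Dataset Splits and Experiment Setup rows (15 for small datasets, 2 for large ones).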
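
The same Pseudocode row notes that prediction (Algorithm 1) is identical to SLEEC: a test point is mapped into the embedding space and labels are scored from its nearest training embeddings. The following is a minimal k-nearest-neighbour scoring sketch under that assumption; the names, distance, and aggregation choices are illustrative rather than the paper's exact procedure.

    import numpy as np

    def knn_label_scores(z_test, Z_train, Y_train, k=10):
        """Hypothetical sketch of SLEEC-style prediction: score labels by
        summing the label vectors of the k nearest training embeddings."""
        d = np.linalg.norm(Z_train - z_test, axis=1)   # Euclidean distance to every training embedding
        nn = np.argsort(d)[:k]                          # indices of the k nearest neighbours
        return np.asarray(Y_train)[nn].sum(axis=0)      # per-label scores; top entries are the predicted labels

The default k=10 mirrors the "number of nearest neighbors as 10" setting quoted in the Experiment Setup row.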