Distributional Semantics Meets Multi-Label Learning
Authors: Vivek Gupta, Rahul Wadbude, Nagarajan Natarajan, Harish Karnick, Prateek Jain, Piyush Rai
AAAI 2019, pp. 3747-3754 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our approach through an extensive set of experiments on a variety of benchmark datasets, and show that the proposed models perform favorably as compared to state-of-the-art methods for large-scale multi-label learning. |
| Researcher Affiliation | Collaboration | 1School of Computing, University of Utah, 2Computer Science Department, IIT Kanpur 3Microsoft Research Lab, Bangalore |
| Pseudocode | Yes | Our algorithm for predicting the labels of a new instance is identical to that of SLEEC and is presented for convenience in Algorithm 1. ... Algorithm 2 Learning embeddings via SPPMI factorization (EXMLDS1). ... Algorithm 3 Learning joint label and instance embeddings via SPPMI factorization (EXMLDS3). ... Algorithm 4 Prediction Algorithm with Label Correlations (EXMLDS3 prediction). ... Algorithm 5 Learning joint instance embeddings and regression via gradient descent (EXMLDS4). |
| Open Source Code | No | Source code will be made available to public later. |
| Open Datasets | Yes | We conduct experiments on commonly used benchmark datasets from the extreme multi-label classification repository provided by the authors of (Prabhu and Varma 2014; Bhatia et al. 2015)²; these datasets are pre-processed, and have prescribed train-test splits. ... ² Datasets and Benchmark: https://bit.ly/2IDtQbS |
| Dataset Splits | Yes | We conduct experiments on commonly used benchmark datasets from the extreme multi-label classification repository provided by the authors of (Prabhu and Varma 2014; Bhatia et al. 2015)²; these datasets are pre-processed, and have prescribed train-test splits. ... For small datasets, we fix negative sample size to 15 and number of iterations to 35 during neural network training, tuned based on a separate validation set. For large datasets, we fix negative sample size to 2 and number of iterations to 5, tuned on a validation set. |
| Hardware Specification | No | The paper mentions 'a Linux machine with 40 cores and 128 GB RAM' but does not specify the exact CPU model or other detailed hardware components required for replication. |
| Software Dependencies | No | The paper states 'Learning Algorithms 2 and 3 are implemented partly in Python and partly in MATLAB' but does not provide specific version numbers for these software packages or any other dependencies. |
| Experiment Setup | Yes | For small datasets, we fix negative sample size to 15 and number of iterations to 35 during neural network training, tuned based on a separate validation set. For large datasets, we fix negative sample size to 2 and number of iterations to 5, tuned on a validation set. ... We use the same embedding dimensionality, preserve the same number of nearest neighbors for learning embeddings as well as at prediction time, and the same number of data partitions used in SLEEC (Bhatia et al. 2015) for our methods EXMLDS1 and EXMLDS2. ... embedding size as 50, number of learners per cluster as 15, number of nearest neighbors as 10, number of embedding and partitioning iterations both as 100, gamma as 1, label normalization as true, number of threads as 32. |
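
Since the source code is not public (see the Open Source Code row), a minimal sketch of the SPPMI factorization step behind Algorithm 2 (EXMLDS1) may help readers attempting replication. The function name, the use of NumPy, and the dense SVD are assumptions on my part; only the SPPMI construction itself (pointwise mutual information shifted by the log of the negative-sample count, then clipped at zero) and the hyperparameter values come from the paper.

```python
import numpy as np

def sppmi_label_embeddings(Y, dim=50, neg_k=15):
    """Sketch of SPPMI-based label embedding (in the spirit of EXMLDS1).

    Y: (n_instances, n_labels) binary label matrix (numpy array).
    Returns (n_labels, dim) label embeddings via SPPMI + truncated SVD.
    """
    C = (Y.T @ Y).astype(float)                 # label-label co-occurrence counts
    np.fill_diagonal(C, 0.0)                    # assumption: drop self co-occurrence
    total = C.sum()
    row = C.sum(axis=1, keepdims=True)
    col = C.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(C * total / (row * col))   # pointwise mutual information
    pmi[~np.isfinite(pmi)] = 0.0                # zero out log(0) / 0-marginal cells
    sppmi = np.maximum(pmi - np.log(neg_k), 0.0)  # shift by log(#negative samples)
    U, S, _ = np.linalg.svd(sppmi)
    return U[:, :dim] * np.sqrt(S[:dim])        # symmetric split of singular values
```

Per the Experiment Setup row, `neg_k` would be 15 for the small datasets and 2 for the large ones, and `dim` would be 50 to match the SLEEC configuration.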
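
Likewise, the prediction step (Algorithm 1, which the paper states is identical to SLEEC's) reduces to a kNN vote in the learned embedding space. The sketch below assumes a single linear regressor `V` mapping features into that space and plain Euclidean distances; the actual setup trains 15 learners per cluster and would average their scores across the ensemble.

```python
import numpy as np

def predict_topk(x, V, Z_train, Y_train, n_neighbors=10, topk=5):
    """Sketch of SLEEC-style kNN prediction for one test instance.

    x: (d,) test feature vector; V: (d, dim) assumed linear regressor into the
    embedding space; Z_train: (n, dim) training instance embeddings;
    Y_train: (n, L) binary training label matrix.
    """
    z = x @ V                                    # embed the test point
    dists = np.linalg.norm(Z_train - z, axis=1)  # distances to training embeddings
    nn = np.argsort(dists)[:n_neighbors]         # nearest training instances
    scores = Y_train[nn].sum(axis=0)             # vote with neighbors' label vectors
    return np.argsort(-scores)[:topk]            # indices of top-k predicted labels
```

The default `n_neighbors=10` matches the "number of nearest neighbors as 10" setting quoted in the Experiment Setup row.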