Sparse Local Embeddings for Extreme Multi-label Classification

Authors: Kush Bhatia, Himanshu Jain, Purushottam Kar, Manik Varma, Prateek Jain

NeurIPS 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conducted extensive experiments on several real-world as well as benchmark data sets and compared our method against state-of-the-art methods for extreme multi-label classification. Experiments reveal that SLEEC can make significantly more accurate predictions than the state-of-the-art methods, including both embedding-based (by as much as 35%) as well as tree-based (by as much as 6%) methods.
Researcher Affiliation | Collaboration | Microsoft Research, India; Indian Institute of Technology Delhi, India; Indian Institute of Technology Kanpur, India. {t-kushb,prajain,manik}@microsoft.com; himanshu.j689@gmail.com; purushot@cse.iitk.ac.in
Pseudocode | Yes | Algorithm 1 (SLEEC: Train Algorithm), Algorithm 2 (SLEEC: Test Algorithm), Sub-routine 3 (SLEEC: SVP), Sub-routine 4 (SLEEC: ADMM). A prediction-time sketch based on Algorithm 2 appears after the table.
Open Source Code | No | The paper states 'The implementation for LEML and FastXML was provided by the authors. We implemented the remaining algorithms and ensured that the published results could be reproduced and were verified by the authors wherever possible.' but provides no link or explicit statement that the authors' own code is publicly available.
Open Datasets | Yes | Experiments were carried out on multi-label data sets including Ads1M [15] (1M labels), Amazon [23] (670K labels), WikiLSHTC (320K labels), DeliciousLarge [24] (200K labels) and Wiki10 [25] (30K labels). All the data sets are publicly available except Ads1M, which is proprietary and is included here to test the scaling capabilities of SLEEC. Unfortunately, most of the existing embedding techniques do not scale to such large data sets. We therefore also present comparisons on publicly available small data sets such as BibTeX [26], MediaMill [27], Delicious [28] and EURLex [29]. (A reader sketch for the common sparse dataset format follows the table.)
Dataset Splits | No | The paper mentions 'limited validation on a validation set' but does not specify the train/validation/test splits (e.g., percentages, sample counts, or references to predefined splits).
Hardware Specification | No | The paper reports training time 'on a single core' but gives no specific hardware details (CPU model, GPU model, memory, etc.) for the experiments.
Software Dependencies | No | The paper notes that the authors implemented several of the baseline algorithms themselves, but it does not list any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks).
Experiment Setup | Yes | Most of SLEEC's hyper-parameters were kept fixed, including the number of clusters in a learner (N_train/6000), the embedding dimension (100 for the small data sets and 50 for the large), the number of learners in the ensemble (15), and the parameters used for optimizing (3). The remaining two hyper-parameters, the k in kNN and the number of neighbours considered during SVP, were both set by limited validation on a validation set. (A configuration sketch follows the table.)
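
For the pseudocode row: below is a minimal sketch of what SLEEC's test procedure (Algorithm 2) does for a single learner: assign the test point to its nearest cluster, map it into the learned embedding space with that cluster's regressors, and predict labels by k-nearest-neighbour voting over the training embeddings. The variable names (centroids, V, Z, Y) are illustrative assumptions, not the authors' code, and the full method averages such predictions over an ensemble of learners.

```python
import numpy as np

def sleec_predict(x, centroids, V, Z, Y, k=10, top=5):
    """Sketch of single-learner SLEEC prediction for one test point x, shape (d,).

    centroids: (C, d) cluster centres from training.
    V:         list of per-cluster embedding regressors, each (d, d_hat).
    Z:         list of per-cluster training-point embeddings, each (n_c, d_hat).
    Y:         list of per-cluster binary label matrices, each (n_c, L).
    """
    c = np.argmin(np.linalg.norm(centroids - x, axis=1))  # nearest cluster centre
    z = x @ V[c]                                          # embed the test point
    dists = np.linalg.norm(Z[c] - z, axis=1)              # distances in embedding space
    nn = np.argsort(dists)[:k]                            # k nearest training points
    scores = Y[c][nn].sum(axis=0)                         # vote with neighbours' labels
    return np.argsort(-scores)[:top]                      # indices of top-ranked labels
```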
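For the open-datasets row: the benchmark data sets named above are distributed through the Extreme Classification Repository in a sparse text format. The reader below is a sketch under the assumption of the repository's usual layout (a header line `num_points num_features num_labels`, then one line per point of the form `l1,l2,... f1:v1 f2:v2 ...`); check each download's README before relying on it.

```python
from scipy.sparse import lil_matrix

def read_xml_dataset(path):
    """Sketch reader for the sparse repository text format (assumed layout)."""
    with open(path) as f:
        n, d, L = map(int, f.readline().split())   # header: points, features, labels
        X = lil_matrix((n, d))
        Y = lil_matrix((n, L), dtype=bool)
        for i, line in enumerate(f):
            parts = line.split()
            if parts and ":" not in parts[0]:      # leading label list (may be absent)
                for lbl in parts.pop(0).split(","):
                    Y[i, int(lbl)] = True
            for tok in parts:                      # remaining tokens are feature:value pairs
                j, v = tok.split(":")
                X[i, int(j)] = float(v)
    return X.tocsr(), Y.tocsr()
```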
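For the experiment-setup row: the fixed hyper-parameters quoted above can be summarised in a small helper. The dictionary keys are illustrative, not the authors' variable names, and the two tuned values are left unset because the paper sets them by limited validation on a validation set.

```python
import math

def sleec_hyperparams(n_train, large_dataset=False):
    """Fixed SLEEC hyper-parameters as reported in the paper's setup."""
    return {
        "num_clusters": max(1, math.ceil(n_train / 6000)),  # N_train / 6000 clusters per learner
        "embedding_dim": 50 if large_dataset else 100,      # 100 for small sets, 50 for large
        "num_learners": 15,                                 # ensemble size
        "knn_k": None,            # k in kNN: tuned on a validation set
        "svp_neighbours": None,   # neighbours considered during SVP: tuned on a validation set
    }
```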