Extreme Multi-label Classification from Aggregated Labels

Authors: Yanyao Shen, Hsiang-Fu Yu, Sujay Sanghavi, Inderjit Dhillon

ICML 2020

Reproducibility assessment. Each entry below lists the variable, the result, and the supporting LLM response.
Research Type: Experimental. LLM response: "Experiments on both aggregated label XMC and MIML tasks show the advantages over existing approaches. In this section, we empirically verify the effectiveness of EAGLE from multiple standpoints. First, we run simulations to verify and explain the benefit of label assignment as analyzed in Theorem 1. Next, we run synthetic experiments on standard XMC datasets to understand the advantages of EAGLE under multiple aggregation rules. Lastly, for the natural extension of EAGLE in the non-extreme setting (as mentioned in Section 6), we study multiple MIML tasks and show the benefit of EAGLE over standard MIML solutions."
Researcher Affiliation: Collaboration. LLM response: "1 Department of ECE, The University of Texas at Austin, TX, USA; 2 Amazon, CA, USA; 3 Department of Computer Science, The University of Texas at Austin, TX, USA."
Pseudocode: Yes. LLM response: "Algorithm 1 GROUP_ROBUST_LABEL_REPR (GRLR) and Algorithm 2 EAGLE."
Open Source Code: No. LLM response: "The paper does not provide an explicit statement or link to the open-source code for the methodology described."
Open Datasets: Yes. LLM response: "We first verify our idea on 4 standard extreme classification tasks (1 small, 2 mid-size and 1 large), whose detailed statistics are shown in Table 1" (footnote 3: http://manikvarma.org/downloads/XC/XMLRepository.html). "Also, we run a set of synthetic experiments on the standard MNIST & Fashion-MNIST image datasets."
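The benchmarks linked above are distributed in the Extreme Classification Repository's sparse text format: a header line with the point, feature, and label counts, then one line per point with comma-separated label ids followed by feature:value pairs. Below is a minimal Python loading sketch under that assumption; the function name and return types are our own choices, not from the paper.

```python
import numpy as np
from scipy.sparse import csr_matrix

def load_xmc_dataset(path):
    """Load a dataset in the Extreme Classification Repository text format.

    Expected layout:
      line 1: "<num_points> <num_features> <num_labels>"
      each following line: "l1,l2,... f1:v1 f2:v2 ..."
      (the label field may be empty for points without labels)
    Returns a sparse feature matrix X and a sparse binary label matrix Y.
    """
    with open(path) as f:
        n_points, n_features, n_labels = map(int, f.readline().split())
        x_rows, x_cols, x_vals = [], [], []
        y_rows, y_cols = [], []
        for i in range(n_points):
            line = f.readline().rstrip("\n")
            # Labels precede the first space; an empty label field is allowed.
            label_part, _, feat_part = line.partition(" ")
            if label_part:
                for lbl in label_part.split(","):
                    y_rows.append(i)
                    y_cols.append(int(lbl))
            for tok in feat_part.split():
                j, v = tok.split(":")
                x_rows.append(i)
                x_cols.append(int(j))
                x_vals.append(float(v))
    X = csr_matrix((x_vals, (x_rows, x_cols)), shape=(n_points, n_features))
    Y = csr_matrix((np.ones(len(y_rows)), (y_rows, y_cols)),
                   shape=(n_points, n_labels))
    return X, Y
```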
Dataset Splits: Yes. LLM response: "We randomly collect 20k reviews with 4 sentences for training, 10k reviews with single sentence for validation and 10k reviews with single sentence for testing."
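A minimal sketch of that split, assuming the reviews have already been bucketed into two hypothetical pools (4-sentence reviews and single-sentence reviews); the fixed seed is our addition for reproducibility, not something the paper specifies.

```python
import random

def split_reviews(four_sentence_reviews, single_sentence_reviews, seed=0):
    """Draw the split described above: 20k 4-sentence reviews for training,
    and 10k single-sentence reviews each for validation and testing.
    The pool arguments and the fixed seed are assumptions, not from the paper.
    """
    rng = random.Random(seed)
    train = rng.sample(four_sentence_reviews, 20_000)
    held_out = rng.sample(single_sentence_reviews, 20_000)
    return train, held_out[:10_000], held_out[10_000:]
```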
Hardware Specification: No. LLM response: "The paper does not provide specific details about the hardware used for running experiments, such as exact GPU or CPU models."
Software Dependencies: No. LLM response: "The paper mentions using InferSent but does not provide its version number or any other specific software dependencies with their respective versions."
Experiment Setup: No. LLM response: "The paper states that hyper-parameter search was conducted for the learning rate and epoch number, and that a two-layer feed-forward neural network was used 'identical to the setting in (Feng & Zhou, 2017)', but it does not explicitly list the chosen values of these or other hyperparameters."
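For orientation only, a minimal PyTorch sketch of a two-layer feed-forward multi-label classifier of the kind referenced above; every dimension and hyperparameter below is a placeholder, since the paper does not report them.

```python
import torch
import torch.nn as nn

# Placeholder dimensions: the paper does not report layer widths,
# so these values are illustrative only.
INPUT_DIM, HIDDEN_DIM, NUM_LABELS = 300, 512, 100

# A two-layer feed-forward network in the spirit of (Feng & Zhou, 2017).
model = nn.Sequential(
    nn.Linear(INPUT_DIM, HIDDEN_DIM),
    nn.ReLU(),
    nn.Linear(HIDDEN_DIM, NUM_LABELS),
)

# The paper mentions searching over learning rate and epoch count but does
# not list the chosen values; lr=1e-3 here is a stand-in.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()  # standard multi-label training objective
```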