OAK: Enriching Document Representations using Auxiliary Knowledge for Extreme Classification

Authors: Shikhar Mohan, Deepak Saini, Anshul Mittal, Sayak Ray Chowdhury, Bhawna Paliwal, Jian Jiao, Manish Gupta, Manik Varma

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we evaluate the proposed OAK method for the Auxiliary Data enhanced XC task in three ways. Firstly, through comparisons with other leading methods which employ different ways to leverage auxiliary data we demonstrate the superiority of OAK's design choices. Secondly, via ablations we detail how each component of our architecture is crucial to OAK's performance. Thirdly, we analyse our method's performance on tail data – rare documents and rare labels." "Table 3. Results on public benchmark datasets. OAK offers 5% higher P@1 on standard XC benchmark datasets."
Researcher Affiliation | Industry | ¹Microsoft, India; ²Microsoft, USA; ³Microsoft Research, India.
Pseudocode | Yes | Algorithm 1: Augmentation Module Training (an illustrative sketch appears below the table).
Open Source Code | No | "The code will be released publicly upon acceptance of this paper."
Open Datasets | Yes | "The Wikipedia datasets are created from publicly available Wikipedia dumps." (https://dumps.wikimedia.org/enwiki/20220520/; see the download sketch below the table)
Dataset Splits | No | Table 2 summarises dataset statistics with columns: Dataset, # Train Docs, # Labels (L), # Test Docs, Avg. Docs/Label, Avg. Labels/Doc, AK Types, # AKPs (M), Avg. AKPs/Doc. A validation split is not mentioned.
Hardware Specification | Yes | "We train this model for 300 epochs on 2x NVIDIA A100-80GB GPUs for all datasets."
Software Dependencies | No | The paper mentions using a 'DistilBERT-base encoder', 'AdamW', and 'SparseAdam' (with a PyTorch link), but does not provide specific version numbers for these software components or libraries.
Experiment Setup | Yes | "We train this model for 300 epochs on 2x NVIDIA A100-80GB GPUs for all datasets, with a batch size of 1024 and a linear LR scheduler with warmup." (see the configuration sketch below the table)
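
The pseudocode row refers to the paper's Algorithm 1 (Augmentation Module Training), which is not reproduced in this report. The sketch below is purely illustrative: it assumes an attention-based module that fuses a document embedding with embeddings of its linked auxiliary-knowledge points (AKPs). The class name, dimensions, and fusion scheme are assumptions and should not be read as the paper's actual algorithm.

# Illustrative sketch only: NOT the paper's Algorithm 1. It shows one generic
# way an augmentation module could fuse a document embedding with the
# embeddings of its linked auxiliary-knowledge points (AKPs) via attention.
import torch
import torch.nn.functional as F

class AugmentationModule(torch.nn.Module):
    def __init__(self, dim: int = 768):
        super().__init__()
        self.attn = torch.nn.MultiheadAttention(dim, num_heads=1, batch_first=True)
        self.combine = torch.nn.Linear(2 * dim, dim)

    def forward(self, doc_emb: torch.Tensor, akp_embs: torch.Tensor) -> torch.Tensor:
        # doc_emb: (B, D) document embeddings; akp_embs: (B, K, D) linked AKP embeddings.
        attended, _ = self.attn(doc_emb.unsqueeze(1), akp_embs, akp_embs)
        fused = torch.cat([doc_emb, attended.squeeze(1)], dim=-1)
        # Return an L2-normalised enriched document representation.
        return F.normalize(self.combine(fused), dim=-1)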
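
For the open-datasets row, the paper points to a specific Wikipedia dump directory. A minimal sketch for fetching it is given below; the exact file name follows Wikimedia's standard naming convention and is an assumption, and dumps this old may have been rotated off the main server, in which case a mirror would be needed.

# Minimal sketch for fetching the Wikipedia dump cited in the paper's footnote.
# The directory URL is from the paper; the file name is an assumed standard name.
import urllib.request

DUMP_DIR = "https://dumps.wikimedia.org/enwiki/20220520/"
DUMP_FILE = "enwiki-20220520-pages-articles.xml.bz2"  # assumed file name

urllib.request.urlretrieve(DUMP_DIR + DUMP_FILE, DUMP_FILE)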
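
The hardware, software-dependency, and experiment-setup rows together describe a DistilBERT-base encoder trained for 300 epochs with a batch size of 1024, AdamW and SparseAdam optimizers, and a linear LR schedule with warmup. The sketch below shows one way such a configuration might look in PyTorch; the checkpoint name, learning rates, warmup fraction, steps per epoch, and the split of parameters between the two optimizers are assumptions, since they are not stated in the excerpts above.

# Hedged sketch of the reported training configuration (not the authors' code).
# Stated: DistilBERT-base encoder, AdamW, SparseAdam, 300 epochs, batch size
# 1024, linear LR schedule with warmup. Everything else is an assumption.
import torch
from torch.optim import AdamW, SparseAdam
from transformers import AutoModel, get_linear_schedule_with_warmup

NUM_EPOCHS = 300          # stated in the paper excerpt
BATCH_SIZE = 1024         # stated in the paper excerpt
STEPS_PER_EPOCH = 1000    # placeholder; depends on dataset size
WARMUP_FRACTION = 0.1     # assumption

encoder = AutoModel.from_pretrained("distilbert-base-uncased")  # assumed checkpoint
# Sparse embedding table standing in for free (label / auxiliary-knowledge)
# parameters that SparseAdam would update; size is a placeholder.
free_embeddings = torch.nn.Embedding(100_000, 768, sparse=True)

dense_opt = AdamW(encoder.parameters(), lr=1e-4)                 # lr is an assumption
sparse_opt = SparseAdam(free_embeddings.parameters(), lr=1e-3)   # lr is an assumption

total_steps = NUM_EPOCHS * STEPS_PER_EPOCH
scheduler = get_linear_schedule_with_warmup(
    dense_opt,
    num_warmup_steps=int(WARMUP_FRACTION * total_steps),
    num_training_steps=total_steps,
)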