Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Navigating Extremes: Dynamic Sparsity in Large Output Spaces

Authors: Nasibullah Nasibullah, Erik Schultheis, Mike Lasby, Yani Ioannou, Rohit Babbar

NeurIPS 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 3, "Experiments and discussion"
Researcher Affiliation | Academia | 1. Department of Computer Science, Aalto University, Helsinki, Finland; 2. Schulich School of Engineering, University of Calgary, Calgary, AB, Canada; 3. Department of Computer Science, University of Bath, Bath, UK
Pseudocode | No | The paper describes its methods in prose but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/xmc-aalto/NeurIPS24-dst
Open Datasets | Yes | The datasets are publicly available at the Extreme Classification Repository: http://manikvarma.org/downloads/XC/XMLRepository.html
Dataset Splits | No | Table 1 reports the total number of training instances (N) and test instances (N'), but the paper provides no explicit details of a validation split.
Hardware Specification | Yes | "While we want to demonstrate the memory efficiency of our algorithms, in order to enable meaningful comparison with existing methods, we run all our experiments on an NVIDIA A100 GPU, and measure the memory consumption using torch.cuda.max_memory_allocated."
Software Dependencies | No | The paper mentions software such as PyTorch, CUDA kernels, and `torch.amp`, but does not provide specific version numbers for these or other ancillary components.
Experiment Setup | Yes | "We present the hyperparameter settings used during training in Table 8. For the encoder and classifier, we employ two separate optimizers: AdamW for both components, except in the case of LF-AmazonTitles-131K, where Adam and SGD are utilized. All experiments are conducted using half-precision float16 types, except for Amazon-3M and LF-AmazonTitles-131K, which use the bfloat16 type. We apply a cosine scheduler with warmup, as specified in the table. The weight decay values are set separately: 0.01 for the encoder and 1.0e-4 for the final classification layer. We use the squared hinge loss function for all datasets except for LF-AmazonTitles-131K, where we use binary cross-entropy (BCE) loss with positive labels." Table 9 lists DST and other related hyperparameter settings for different datasets.
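The training recipe quoted above combines a linear-warmup-then-cosine learning-rate schedule with a squared hinge loss. A minimal pure-Python sketch of both pieces (function names, the base learning rate, and step counts are illustrative assumptions, not values from the paper):

```python
import math

def cosine_with_warmup(step, total_steps, warmup_steps, base_lr):
    """Learning rate at `step`: linear warmup to base_lr over
    warmup_steps, then cosine decay toward zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

def squared_hinge(score, label):
    """Squared hinge loss for one classifier logit.
    `label` is +1 for a positive label, -1 for a negative one."""
    margin = max(0.0, 1.0 - label * score)
    return margin * margin
```

For example, with `total_steps=100` and `warmup_steps=10`, the rate rises from 0 to `base_lr` during the first 10 steps and then decays smoothly to 0; the squared hinge penalizes any logit whose margin from its label is less than 1, and is zero once the margin is satisfied.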