Learning Context-dependent Label Permutations for Multi-label Classification

Authors: Jinseok Nam, Young-Bum Kim, Eneldo Loza Mencía, Sunghyun Park, Ruhi Sarikaya, Johannes Fürnkranz

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on three public multi-label classification benchmarks show that our proposed dynamic label ordering approach based on reinforcement learning outperforms recurrent neural networks with fixed label orderings across both bipartition and ranking measures on all three datasets. We analyze both techniques empirically on datasets with different characteristics and in comparison to static baseline sequence ordering strategies.
Researcher Affiliation | Collaboration | 1 Amazon, Seattle, Washington, USA; 2 Knowledge Engineering, TU Darmstadt, Darmstadt, Hessen, Germany.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement about releasing code, or a link to source code, for the described methodology.
Open Datasets | Yes | We carried out our experiments on three multi-label datasets from the Extreme Classification (XML) Repository (http://manikvarma.org/downloads/XC/XMLRepository.html, accessed 2019-01-12), and their statistics are given in Table 1.
Dataset Splits | Yes | We set aside 10% of the training data as the validation sets.
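The 90/10 hold-out described above can be sketched in plain Python. The paper states only that 10% of the training data was set aside; the random shuffle and the fixed seed below are assumptions for illustration, not the authors' procedure.

```python
import random

def train_val_split(examples, val_fraction=0.1, seed=0):
    """Hold out a fraction of the training data as a validation set.

    The shuffle and seed are illustrative assumptions; the paper only
    reports that 10% of the training data was used for validation.
    """
    rng = random.Random(seed)
    indices = list(range(len(examples)))
    rng.shuffle(indices)
    n_val = int(len(examples) * val_fraction)
    val_idx = set(indices[:n_val])
    train = [x for i, x in enumerate(examples) if i not in val_idx]
    val = [x for i, x in enumerate(examples) if i in val_idx]
    return train, val

train, val = train_val_split(list(range(1000)))
# 900 training examples, 100 validation examples
```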
Hardware Specification | Yes | All RNN-based MLC models were trained on NVIDIA Tesla V100 GPUs.
Software Dependencies | No | The paper mentions software components such as Adam, layer normalization, variational dropout, and gated recurrent units (GRUs), but it does not provide version numbers for any libraries or frameworks used, such as Python, PyTorch, or TensorFlow.
Experiment Setup | Yes | The dimensionality of the label embeddings and of the hidden activations of our proposed approach was 512 and 2048, respectively, on all datasets. For AC, the number of samples K in Eq. (8) was set to 1; the discount factor γ ∈ {0.1, 0.3, 0.6, 0.9, 0.99} and the entropy regularization parameter β ∈ {0.01, 0.0001} were chosen based on validation-set performance for each dataset. For RNN training, layer normalization (Ba et al., 2016) and variational dropout (Gal & Ghahramani, 2016) were applied. In addition to variational dropout for the RNNs, we also applied plain dropout on the input features with probability 0.2 or 0.5 when overfitting was observed. As the optimization algorithm, we used Adam (Kingma & Ba, 2015) with a learning rate of 0.0001 and minibatches of size 128.
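The reported setup can be collected into a single configuration sketch. All numeric values come from the quote above; the dictionary layout and key names are my own, and the final (γ, β) pair was tuned per dataset on the validation set rather than fixed.

```python
# Hyperparameters reported in the paper; key names are illustrative only.
config = {
    "label_embedding_dim": 512,
    "hidden_dim": 2048,
    "ac_num_samples_K": 1,               # K in Eq. (8)
    "discount_factor_grid": [0.1, 0.3, 0.6, 0.9, 0.99],  # gamma candidates
    "entropy_reg_grid": [0.01, 0.0001],  # beta candidates
    "layer_norm": True,                  # Ba et al., 2016
    "variational_dropout": True,         # Gal & Ghahramani, 2016
    "input_dropout_choices": (0.2, 0.5), # applied only when overfitting observed
    "optimizer": "Adam",                 # Kingma & Ba, 2015
    "learning_rate": 1e-4,
    "batch_size": 128,
}

# Per-dataset tuning selects one (gamma, beta) pair from the cross product
# of the two grids, based on validation-set performance:
grid = [(g, b) for g in config["discount_factor_grid"]
               for b in config["entropy_reg_grid"]]
# 5 gamma values x 2 beta values = 10 candidate combinations
```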