Hierarchical Multi-Label Classification Networks
Authors: Jônatas Wehrmann, Ricardo Cerri, Rodrigo C. Barros
ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate its performance in 21 datasets from four distinct domains, and we compare it against the current HMC state-of-the-art approaches. Results show that HMCN substantially outperforms all baselines with statistical significance, arising as the novel state-of-the-art for HMC. |
| Researcher Affiliation | Academia | School of Technology, Pontifícia Universidade Católica do Rio Grande do Sul; Universidade Federal de São Carlos. |
| Pseudocode | No | The paper describes the architecture and mathematical formulations, but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | All algorithms are executed over 21 freely-available datasets related to either protein function prediction (Vens et al., 2008), annotation of medical or microalgae images (Dimitrovski et al., 2011), or text classification (Lewis et al., 2004). |
| Dataset Splits | Yes | Table 1 presents the characteristics of the employed datasets. |
| Hardware Specification | No | The paper mentions: 'We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the GPUs that were used for running the experiments.' However, it does not specify the exact GPU model or any other hardware details. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer' but does not specify any software libraries (e.g., TensorFlow, PyTorch) or their version numbers. |
| Experiment Setup | Yes | For training our networks, we use the Adam optimizer with a learning rate of 1 × 10⁻³ and remaining parameters as suggested in (Kingma & Ba, 2014). For the HMCN-F version, the fully-connected layers comprise 384 ReLU neurons, followed by batch normalization, residual connections, and dropout of 60%. Dropout is important given that these models could easily overfit the small training sets. |
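The paper releases no code and names no framework, so the following is a minimal PyTorch sketch of one fully-connected block matching the quoted HMCN-F setup (384 ReLU units, batch normalization, a residual connection, 60% dropout, Adam at 1 × 10⁻³). The `GlobalBlock` class, the layer ordering, and the placeholder dimensions are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class GlobalBlock(nn.Module):
    """Hypothetical fully-connected block: 384 ReLU units, batch norm,
    residual connection, and 60% dropout, per the quoted setup.
    The exact ordering of these operations is an assumption."""

    def __init__(self, in_dim: int, hidden_dim: int = 384, dropout: float = 0.6):
        super().__init__()
        self.fc = nn.Linear(in_dim, hidden_dim)
        self.bn = nn.BatchNorm1d(hidden_dim)
        self.drop = nn.Dropout(dropout)
        # Project the input when its size differs, so the residual sum is valid.
        self.proj = nn.Linear(in_dim, hidden_dim) if in_dim != hidden_dim else nn.Identity()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.drop(self.bn(torch.relu(self.fc(x))))
        return h + self.proj(x)  # residual connection


# Placeholder dimensions (not from the paper) for a two-block global path.
feature_dim, num_labels = 100, 50
model = nn.Sequential(
    GlobalBlock(feature_dim),
    GlobalBlock(384),
    nn.Linear(384, num_labels),
)
# Adam with learning rate 1e-3; remaining parameters at the defaults
# suggested in Kingma & Ba (2014), as stated in the paper.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```

This only illustrates the quoted hyperparameters for the global information flow; the full HMCN-F additionally combines per-level local outputs with the global output, which the paper describes mathematically but does not specify in code.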