Hyperbolic Embedding Inference for Structured Multi-Label Prediction
Authors: Bo Xiong, Michael Cochez, Mojtaba Nayyeri, Steffen Staab
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on 12 datasets show 1) significant improvements in mean average precision; 2) lower number of constraint violations; 3) an order of magnitude fewer dimensions than baselines. |
| Researcher Affiliation | Collaboration | Bo Xiong (University of Stuttgart, Stuttgart, Germany); Michael Cochez (Vrije Universiteit Amsterdam; Discovery Lab, Elsevier, Amsterdam, The Netherlands); Mojtaba Nayyeri (University of Stuttgart, Stuttgart, Germany); Steffen Staab (University of Stuttgart, Stuttgart, Germany; University of Southampton) |
| Pseudocode | No | The paper describes algorithms and functions in text and mathematical formulas but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is openly available at https://github.com/xiongbo010/HMI |
| Open Datasets | Yes | We consider 12 datasets that have been used for evaluating multi-label prediction methods [11, 8, 10]. These consist of 8 functional genomic datasets [28], 3 image annotation datasets [29, 30], and 1 text classification dataset [31]. |
| Dataset Splits | Yes | Similar to MBM [11] and its baselines, we sample 30% of the implication and exclusion constraints for training the model. We employ an early-stopping strategy with patience 20 to save training time. (A sketch of this protocol appears after the table.) |
| Hardware Specification | Yes | We train the models on NVIDIA A100 with 40GB memory. |
| Software Dependencies | No | We implement HMI, HLR and HMC-HLR using PyTorch [34] and train the models on NVIDIA A100 with 40GB memory. We train HMI, HLR and HMI+HLR using the Riemannian Adam [35] optimizer implemented by the Geoopt library [36]. (Libraries are named, but no version numbers are given.) |
| Experiment Setup | Yes | We train HMI, HLR and HMI+HLR using the Riemannian Adam [35] optimizer implemented by the Geoopt library [36] with a batch size of 4. We set the dropout rate to 0.6, as suggested by [14], to prevent the model from overfitting the small training sets. We employ an early-stopping strategy with patience 20 to save training time. The learning rate is searched from {1e-4, 5e-4, 1e-3, 5e-3, 1e-2}. The penalty weight of the violation term is searched from {1e-5, 5e-4, 1e-4, 5e-3, 1e-2}, and we also show its impact in an ablation. The best dimension per dataset is searched from {32, 64, 128, 256}. (A training-loop sketch appears after the table.) |
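
The constraint-sampling and early-stopping protocol quoted under *Dataset Splits* can be sketched as follows. This is a minimal illustration, not the authors' code: `sample_constraints` and `EarlyStopping` are hypothetical helpers, and only the 30% sampling fraction and the patience of 20 come from the quoted text.

```python
import random

def sample_constraints(implications, exclusions, frac=0.3, seed=0):
    """Sample a fraction of the implication/exclusion constraints for training."""
    rng = random.Random(seed)
    train_imp = rng.sample(implications, int(frac * len(implications)))
    train_exc = rng.sample(exclusions, int(frac * len(exclusions)))
    return train_imp, train_exc

class EarlyStopping:
    """Stop training once the validation score has not improved for `patience` epochs."""
    def __init__(self, patience=20):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, val_score):
        if val_score > self.best:
            self.best, self.bad_epochs = val_score, 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience  # True -> stop training
```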
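The training configuration quoted under *Experiment Setup* maps naturally onto Geoopt's `RiemannianAdam`. Below is a minimal sketch assuming a PyTorch model whose hyperbolic parameters are declared as `geoopt.ManifoldParameter`; `violation_loss` is a hypothetical placeholder for the paper's constraint-violation penalty, and only the batch size, dropout rate, and search grids come from the quoted text.

```python
import itertools
import torch
import torch.nn.functional as F
import geoopt

# Search grids quoted in the Experiment Setup row.
LEARNING_RATES  = [1e-4, 5e-4, 1e-3, 5e-3, 1e-2]
PENALTY_WEIGHTS = [1e-5, 5e-4, 1e-4, 5e-3, 1e-2]
DIMENSIONS      = [32, 64, 128, 256]

def violation_loss(model):
    # Hypothetical stand-in for the paper's constraint-violation penalty.
    return torch.tensor(0.0)

def train(model, loader, lr, penalty_weight, max_epochs=100):
    # RiemannianAdam applies Riemannian updates to geoopt.ManifoldParameter
    # tensors (e.g. points on a geoopt.PoincareBall) and behaves like
    # ordinary Adam for Euclidean parameters.
    optimizer = geoopt.optim.RiemannianAdam(model.parameters(), lr=lr)
    for _ in range(max_epochs):
        for x, y in loader:  # DataLoader with batch_size=4, as quoted
            optimizer.zero_grad()
            logits = model(x)  # dropout p=0.6 is assumed inside the model
            loss = F.binary_cross_entropy_with_logits(logits, y)
            loss = loss + penalty_weight * violation_loss(model)
            loss.backward()
            optimizer.step()

# Hyperparameter grid from the quoted setup:
grid = itertools.product(LEARNING_RATES, PENALTY_WEIGHTS, DIMENSIONS)
```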