Hyperbolic Neural Networks++
Authors: Ryohei Shimizu, Yusuke Mukuta, Tatsuya Harada
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show the superior parameter efficiency of our methods compared to conventional hyperbolic components, and stability and outperformance over their Euclidean counterparts. In this section, we evaluate our methods in comparisons with HNNs and Euclidean counterparts. |
| Researcher Affiliation | Academia | Ryohei Shimizu (1), Yusuke Mukuta (1,2), Tatsuya Harada (1,2); (1) The University of Tokyo, (2) RIKEN AIP |
| Pseudocode | No | The paper describes methods through mathematical formulations and textual descriptions but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/mil-tokyo/hyperbolic_nn_plusplus. |
| Open Datasets | Yes | We pre-trained the Poincaré embeddings of the same dimensions as the experimental settings in HNNs, i.e., two, three, five, and ten dimensions, using the open-source implementation (https://github.com/facebookresearch/poincare-embeddings) to extract several sub-trees whose root nodes are certain abstract hypernymies, e.g., animal. For each sub-tree, MLR layers learn the binary classification to predict whether each given node is included. All nodes are divided into 80% training nodes and 20% testing nodes. |
| Dataset Splits | No | All nodes are divided into 80% training nodes and 20% testing nodes. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory specifications, or cloud instances) used for running its experiments. |
| Software Dependencies | No | The implementation of hyperbolic architectures is based on Geoopt (Kochurov et al., 2020). We follow the open-source implementation of Fairseq (Ott et al., 2019). |
| Experiment Setup | Yes | We trained each model for 30 epochs using Riemannian Adam (Becigneul & Ganea, 2019) with a learning rate of 0.001 and a batch size of 16. For the scheduling of the learning rate η, we linearly increased the learning rate for the first 4000 iterations as a warm-up, and utilized the inverse square root decay with respect to the number of iterations t thereafter as η = (Dt)^{-1/2}. |
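The sub-tree classification protocol quoted in the Open Datasets row (label each node by whether it lies under a chosen root hypernym such as "animal", then divide nodes 80/20 into training and testing sets) can be sketched as below. The names `all_nodes` and `descendants`, the fixed seed, and the toy hierarchy are illustrative assumptions, not details taken from the paper.

```python
import random


def make_split(all_nodes, descendants, seed=0):
    """Label nodes by sub-tree membership and return an 80/20 train/test split."""
    labelled = [(node, 1 if node in descendants else 0) for node in all_nodes]
    rng = random.Random(seed)
    rng.shuffle(labelled)
    cut = int(0.8 * len(labelled))
    return labelled[:cut], labelled[cut:]


# Hypothetical usage with a toy hierarchy rooted at "animal".
nodes = ["animal", "dog", "cat", "plant", "tree", "rose"]
under_animal = {"animal", "dog", "cat"}
train_nodes, test_nodes = make_split(nodes, under_animal)
```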
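To make the schedule quoted in the Experiment Setup row concrete, the sketch below implements a linear warm-up over the first 4000 iterations followed by inverse square root decay η = (Dt)^{-1/2}, driving Geoopt's RiemannianAdam (the optimizer the paper reports). The placeholder model, the dimension D = 512, and the warm-up peak value are assumptions for illustration; the paper gives only the formula.

```python
import torch
import geoopt  # provides RiemannianAdam


def eta(t: int, dim: int = 512, warmup: int = 4000) -> float:
    """Learning rate at iteration t: linear warm-up, then eta = (dim * t) ** -0.5."""
    if t < warmup:
        # Linear increase up to the value the decay curve takes at t = warmup.
        return (dim * warmup) ** -0.5 * (t / warmup)
    return (dim * t) ** -0.5  # inverse square root decay thereafter


# Hypothetical usage; `model` stands in for any hyperbolic network.
model = torch.nn.Linear(8, 2)
optimizer = geoopt.optim.RiemannianAdam(model.parameters(), lr=eta(1))
for step in range(1, 8001):
    for group in optimizer.param_groups:
        group["lr"] = eta(step)
    # ... forward pass, loss.backward(), optimizer.step(), optimizer.zero_grad() ...
```

For the 30-epoch embedding experiments quoted in the same row, the same optimizer would simply be instantiated with the fixed learning rate of 0.001 and a batch size of 16 instead of this schedule.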