Hyperbolic Neural Networks++

Authors: Ryohei Shimizu, Yusuke Mukuta, Tatsuya Harada

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show the superior parameter efficiency of our methods compared to conventional hyperbolic components, and stability and outperformance over their Euclidean counterparts. In this section, we evaluate our methods in comparisons with HNNs and Euclidean counterparts.
Researcher Affiliation | Academia | Ryohei Shimizu (1), Yusuke Mukuta (1,2), Tatsuya Harada (1,2); (1) The University of Tokyo, (2) RIKEN AIP
Pseudocode | No | The paper describes methods through mathematical formulations and textual descriptions but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/mil-tokyo/hyperbolic_nn_plusplus (paper footnote 1).
Open Datasets | Yes | We pre-trained the Poincaré embeddings of the same dimensions as the experimental settings in HNNs, i.e., two, three, five, and ten dimensions, using the open-source implementation (footnote 2: https://github.com/facebookresearch/poincare-embeddings) to extract several sub-trees whose root nodes are certain abstract hypernymies, e.g., animal. For each sub-tree, MLR layers learn the binary classification to predict whether each given node is included. All nodes are divided into 80% training nodes and 20% testing nodes. (An illustrative node-split sketch follows the table.)
Dataset Splits | No | All nodes are divided into 80% training nodes and 20% testing nodes.
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory specifications, or cloud instances) used for running its experiments.
Software Dependencies | No | The implementation of hyperbolic architectures is based on Geoopt (Kochurov et al., 2020). We follow the open-source implementation of Fairseq (Ott et al., 2019). (A minimal Geoopt setup sketch follows the table.)
Experiment Setup | Yes | We trained each model for 30 epochs using Riemannian Adam (Becigneul & Ganea, 2019) with a learning rate of 0.001 and a batch size of 16. For the scheduling of the learning rate η, we linearly increased the learning rate for the first 4000 iterations as a warm-up, and utilized the inverse square root decay with respect to the number of iterations t thereafter, as η = (Dt)^{-1/2}. (A sketch of this schedule follows the table.)
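
The Open Datasets row describes extracting sub-trees of pre-trained Poincaré embeddings and training MLR classifiers on an 80/20 node split. Below is a minimal sketch of that split and labeling step, assuming the embeddings are already loaded into a dict keyed by node name; `split_nodes`, `embeddings`, and `subtree_nodes` are illustrative names, not taken from the released code.

```python
import random

def split_nodes(nodes, train_frac=0.8, seed=0):
    """Shuffle the nodes and split them into 80% training / 20% testing."""
    rng = random.Random(seed)
    nodes = list(nodes)
    rng.shuffle(nodes)
    cut = int(train_frac * len(nodes))
    return nodes[:cut], nodes[cut:]

# Illustrative usage: `embeddings` maps every embedded node to its Poincaré
# vector, and `subtree_nodes` is the set of nodes under one root (e.g. "animal").
# The binary MLR target is 1 if a node lies inside the sub-tree, else 0.
# train_nodes, test_nodes = split_nodes(embeddings.keys())
# labels = {n: int(n in subtree_nodes) for n in embeddings}
```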
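The Software Dependencies row notes that the hyperbolic architectures are built on Geoopt. The following is a small sketch, not the authors' implementation, of how a Poincaré ball, a manifold-aware parameter, and a Riemannian Adam optimizer are typically set up with Geoopt; the curvature value and tensor shape are placeholders.

```python
import torch
import geoopt  # Kochurov et al., 2020

# Poincaré ball model of hyperbolic space; the curvature is fixed here,
# which is a simplification made only for this sketch.
ball = geoopt.PoincareBall(c=1.0)

# A point on the ball stored as a manifold-aware parameter so that a
# Riemannian optimizer applies the proper exponential-map update.
z = geoopt.ManifoldParameter(ball.expmap0(torch.randn(10) * 1e-2),
                             manifold=ball)

# Riemannian Adam (Becigneul & Ganea, 2019) as provided by Geoopt.
optimizer = geoopt.optim.RiemannianAdam([z], lr=1e-3)
```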
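The Experiment Setup row specifies a linear warm-up over the first 4000 iterations followed by inverse square root decay, η = (Dt)^{-1/2}. A hedged sketch of that schedule is below; treating D as the model dimension (with 512 as a placeholder) is an assumption of this sketch, following the common Transformer-style schedule available in Fairseq.

```python
def learning_rate(t, D=512, warmup=4000):
    """Linear warm-up for the first `warmup` iterations, then inverse
    square root decay eta = (D * t) ** -0.5. Treating D as the model
    dimension (512) is an assumption of this sketch."""
    t = max(t, 1)
    peak = (D * warmup) ** -0.5      # value reached at the end of warm-up
    if t < warmup:
        return peak * t / warmup     # linear increase toward the peak
    return (D * t) ** -0.5           # inverse square root decay
```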