Hyperbolic Neural Networks

Authors: Octavian-Eugen Ganea, Gary Bécigneul, Thomas Hofmann

NeurIPS 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Empirically, we show that, even if hyperbolic optimization tools are limited, hyperbolic sentence embeddings either outperform or are on par with their Euclidean variants on textual entailment and noisy-prefix recognition tasks." "On a series of experiments and datasets we showcase the effectiveness of our hyperbolic neural network layers compared to their 'classic' Euclidean variants on textual entailment and noisy-prefix recognition tasks." "We evaluate our method on two tasks." |
| Researcher Affiliation | Academia | Octavian-Eugen Ganea, Dept. of Computer Science, ETH Zürich, Zurich, Switzerland; Gary Bécigneul, Dept. of Computer Science, ETH Zürich, Zurich, Switzerland; Thomas Hofmann, Dept. of Computer Science, ETH Zürich, Zurich, Switzerland. Equal contribution; correspondence at {octavian.ganea,gary.becigneul}@inf.ethz.ch |
| Pseudocode | No | The paper presents mathematical formulations for its generalized operations and neural network architectures but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks. |
| Open Source Code | Yes | "Our data and TensorFlow [1] code are publicly available." (https://github.com/dalab/hyperbolic_nn) |
| Open Datasets | Yes | "SNLI [7]. It consists of 570K training, 10K validation and 10K test sentence pairs." "We thus build synthetic datasets PREFIX-Z% (for Z being 10, 30 or 50) as follows: for each random first sentence... 500K training, 10K validation and 10K test pairs." |
| Dataset Splits | Yes | "SNLI [7]. It consists of 570K training, 10K validation and 10K test sentence pairs." "We thus build synthetic datasets PREFIX-Z% (for Z being 10, 30 or 50) as follows: for each random first sentence... 500K training, 10K validation and 10K test pairs." |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU/CPU models, memory, or cloud computing instance types used for running the experiments. |
| Software Dependencies | No | The paper mentions TensorFlow but does not provide a specific version number for it or for any other software dependency relevant to reproducibility. |
| Experiment Setup | Yes | "In our setting, we embed the two sentences using two distinct hyperbolic RNNs or GRUs. The sentence embeddings are then fed together with their squared distance (hyperbolic or Euclidean, depending on their geometry) to a FFNN (Euclidean or hyperbolic, see Sec. 3.2) which is further fed to an MLR (Euclidean or hyperbolic, see Sec. 3.1) that gives probabilities of the two classes (entailment vs neutral). We use cross-entropy loss on top. For the results shown in Tab. 1, we run each model (baseline or ours) exactly 3 times and report the test result corresponding to the best validation result from these 3 runs." |
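For context, the paper's generalized neural network layers are built from gyrovector-space operations on the Poincaré ball, chiefly Möbius addition and the curvature-dependent geodesic distance. The sketch below is an illustrative NumPy implementation of those two primitives, not the paper's released TensorFlow code; function names and the choice of `c = 1.0` are assumptions for the example.

```python
import numpy as np

def mobius_add(x, y, c=1.0):
    """Mobius addition x (+)_c y on the Poincare ball of curvature -c.

    Illustrative sketch (not the authors' code):
    ((1 + 2c<x,y> + c||y||^2) x + (1 - c||x||^2) y)
    / (1 + 2c<x,y> + c^2 ||x||^2 ||y||^2)
    """
    xy = np.dot(x, y)
    x2 = np.dot(x, x)
    y2 = np.dot(y, y)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    den = 1 + 2 * c * xy + c ** 2 * x2 * y2
    return num / den

def poincare_dist(x, y, c=1.0):
    """Geodesic distance d_c(x, y) = (2/sqrt(c)) * artanh(sqrt(c) ||(-x) (+)_c y||)."""
    diff = mobius_add(-x, y, c)
    return (2.0 / np.sqrt(c)) * np.arctanh(np.sqrt(c) * np.linalg.norm(diff))

# Sanity check: from the origin, the distance reduces to 2 * artanh(||y||) when c = 1.
y = np.array([0.3, 0.4])  # ||y|| = 0.5, strictly inside the unit ball
print(poincare_dist(np.zeros(2), y))  # ~ 2 * artanh(0.5) ≈ 1.0986
```

In the entailment pipeline quoted above, this squared distance between the two hyperbolic sentence embeddings is what gets concatenated with the embeddings before the FFNN and MLR layers.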