Hyperbolic Neural Networks
Authors: Octavian-Eugen Ganea, Gary Bécigneul, Thomas Hofmann
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we show that, even if hyperbolic optimization tools are limited, hyperbolic sentence embeddings either outperform or are on par with their Euclidean variants on textual entailment and noisy-prefix recognition tasks. We evaluate our method on two tasks. |
| Researcher Affiliation | Academia | Octavian-Eugen Ganea, Gary Bécigneul, Thomas Hofmann (all Dept. of Computer Science, ETH Zürich, Zurich, Switzerland); Equal contribution, correspondence at {octavian.ganea,gary.becigneul}@inf.ethz.ch |
| Pseudocode | No | The paper presents mathematical formulations for its generalized operations and neural network architectures but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks. (A hedged sketch of these core operations is given after the table.) |
| Open Source Code | Yes | Our data and TensorFlow [1] code are publicly available at https://github.com/dalab/hyperbolic_nn |
| Open Datasets | Yes | SNLI [7]. It consists of 570K training, 10K validation and 10K test sentence pairs. We thus build synthetic datasets PREFIX-Z% (for Z being 10, 30 or 50) as follows: for each random first sentence... 500K training, 10K validation and 10K test pairs. |
| Dataset Splits | Yes | SNLI [7]. It consists of 570K training, 10K validation and 10K test sentence pairs. We thus build synthetic datasets PREFIX-Z% (for Z being 10, 30 or 50) as follows: for each random first sentence... 500K training, 10K validation and 10K test pairs. |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU/CPU models, memory, or cloud computing instance types used for running the experiments. |
| Software Dependencies | No | The paper mentions "TensorFlow" but does not provide a specific version number for it or any other software dependency relevant to reproducibility. |
| Experiment Setup | Yes | In our setting, we embed the two sentences using two distinct hyperbolic RNNs or GRUs. The sentence embeddings are then fed together with their squared distance (hyperbolic or Euclidean, depending on their geometry) to a FFNN (Euclidean or hyperbolic, see Sec. 3.2), which is further fed to an MLR (Euclidean or hyperbolic, see Sec. 3.1) that gives probabilities of the two classes (entailment vs. neutral). We use cross-entropy loss on top. For the results shown in Tab. 1, we run each model (baseline or ours) exactly 3 times and report the test result corresponding to the best validation result from these 3 runs. (A toy sketch of this pipeline is given after the table.) |
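
The "Pseudocode" row above notes that the paper gives only mathematical formulations. For concreteness, here is a minimal NumPy sketch (ours, not the authors') of the generalized operations that the hyperbolic layers are built from: Möbius addition, the induced distance, the exponential and logarithmic maps at the origin, and the Möbius matrix-vector product. Function names are our own choices, zero-norm edge cases are omitted for brevity, and the authors' TensorFlow repository linked above remains the reference implementation.

```python
import numpy as np

def mobius_add(x, y, c=1.0):
    """Mobius addition x (+)_c y on the Poincare ball of curvature -c."""
    xy = np.dot(x, y)
    x2, y2 = np.dot(x, x), np.dot(y, y)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    den = 1 + 2 * c * xy + c ** 2 * x2 * y2
    return num / den

def poincare_dist(x, y, c=1.0):
    """Generalized distance d_c(x, y) = (2 / sqrt(c)) * artanh(sqrt(c) * ||(-x) (+)_c y||)."""
    sc = np.sqrt(c)
    return (2.0 / sc) * np.arctanh(sc * np.linalg.norm(mobius_add(-x, y, c)))

def exp0(v, c=1.0):
    """Exponential map at the origin: sends a tangent (Euclidean) vector into the ball."""
    sc, n = np.sqrt(c), np.linalg.norm(v)
    return np.tanh(sc * n) * v / (sc * n)  # assumes n > 0

def log0(y, c=1.0):
    """Logarithmic map at the origin: the inverse of exp0."""
    sc, n = np.sqrt(c), np.linalg.norm(y)
    return np.arctanh(sc * n) * y / (sc * n)  # assumes 0 < n < 1/sqrt(c)

def mobius_matvec(M, x, c=1.0):
    """Mobius matrix-vector product M (x)_c x = exp0(M @ log0(x)), the paper's
    building block for hyperbolic feed-forward and recurrent layers."""
    return exp0(M @ log0(x, c), c)
```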
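
Similarly, for the "Experiment Setup" row, a toy end-to-end sketch of the entailment pipeline under loud simplifications: a mean-pooling projection stands in for each of the paper's two distinct hyperbolic RNN/GRU encoders, the FFNN operates in the tangent space at the origin rather than being a fully hyperbolic layer, and a plain linear readout stands in for the hyperbolic MLR. It reuses exp0, log0, and poincare_dist from the sketch above; all weights, shapes, and names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_encode(token_vecs, W, c=1.0):
    """Toy stand-in for a sentence encoder: mean-pool the token vectors and
    project into the Poincare ball with exp0. The paper instead runs a
    hyperbolic RNN or GRU over each sentence."""
    return exp0(np.tanh(W @ token_vecs.mean(axis=0)), c)

def entailment_logits(u, v, W_ffnn, b_ffnn, W_mlr, b_mlr, c=1.0):
    """Pipeline from the 'Experiment Setup' row: the two sentence embeddings
    plus their squared distance go through an FFNN, then a 2-class readout
    (entailment vs. neutral); cross-entropy would be applied during training."""
    feat = np.concatenate([log0(u, c), log0(v, c), [poincare_dist(u, v, c) ** 2]])
    h = np.tanh(W_ffnn @ feat + b_ffnn)
    return W_mlr @ h + b_mlr

# Hypothetical usage: two "sentences" of 5 token embeddings each, dimension 8.
d, h_dim = 8, 16
s1, s2 = rng.normal(size=(5, d)), rng.normal(size=(5, d))
W1, W2 = rng.normal(scale=0.1, size=(d, d)), rng.normal(scale=0.1, size=(d, d))  # two distinct encoders
W_f, b_f = rng.normal(scale=0.1, size=(h_dim, 2 * d + 1)), np.zeros(h_dim)
W_m, b_m = rng.normal(scale=0.1, size=(2, h_dim)), np.zeros(2)
u, v = toy_encode(s1, W1), toy_encode(s2, W2)
print(entailment_logits(u, v, W_f, b_f, W_m, b_m))  # unnormalized class scores
```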