Poincaré GloVe: Hyperbolic Word Embeddings

Authors: Alexandru Țifrea*, Gary Bécigneul*, Octavian-Eugen Ganea*

ICLR 2019

Reproducibility assessment. Each variable below is listed with its assessed result, followed by the supporting LLM response (a quote from the paper where one exists).
Research Type: Experimental
"Empirically, based on extensive experiments, we prove that our embeddings, trained unsupervised, are the first to simultaneously outperform strong and popular baselines on the tasks of similarity, analogy and hypernymy detection."
Researcher Affiliation: Academia
"Alexandru Țifrea, Gary Bécigneul, Octavian-Eugen Ganea. Department of Computer Science, ETH Zürich, Switzerland. tifreaa@ethz.ch, {gary.becigneul,octavian.ganea}@inf.ethz.ch"
Pseudocode: Yes
"Algorithm 1: is-a(v, w) hypernymy score using Poincaré embeddings" (an illustrative sketch follows below)
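The summary above quotes only the algorithm's caption. For orientation, the sketch below shows what a norm-based hypernymy score over Poincaré embeddings can look like, following the generic score of Nickel & Kiela (2017) rather than the paper's actual Algorithm 1; `poincare_distance`, `is_a_score`, and the scaling constant `alpha` are all illustrative assumptions.

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between points u, v inside the unit (Poincare) ball."""
    sq_diff = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u * u)) * (1.0 - np.sum(v * v))
    return np.arccosh(1.0 + 2.0 * sq_diff / max(denom, eps))

def is_a_score(v, w, alpha=1000.0):
    """Illustrative score for 'v is-a w' (i.e., w is a hypernym of v).
    General terms tend to embed near the origin of the ball, so a
    hypernym w with a smaller norm than v yields a higher score.
    This follows Nickel & Kiela (2017), NOT the paper's Algorithm 1;
    alpha is an assumed scaling constant."""
    norm_gap = np.linalg.norm(w) - np.linalg.norm(v)
    return -(1.0 + alpha * norm_gap) * poincare_distance(v, w)
```

Predicting the is-a relation then amounts to thresholding such a score, which is where the cross-validated threshold t reported below comes in.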
Open Source Code: Yes
"Our code is publicly available: https://github.com/alex-tifrea/poincare_glove"
Open Datasets: Yes
"We trained all models on a corpus provided by Levy & Goldberg (2014); Levy et al. (2015) used in other word embeddings related work. Corpus preprocessing is explained in the above references. The dataset has been obtained from an English Wikipedia dump and contains 1.4 billion tokens."
Dataset Splits: Yes
"In order to select the best t without overfitting on the benchmark dataset, we used the same 2-fold cross-validation method used by Levy et al. (2015, section 5.1) (see our Table 15), which resulted in selecting t = 0.3. We report our main results in Table 4, and more extensive experiments in various settings (including in lower dimensions) in appendix A.2." (the selection procedure is sketched below)
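Concretely, the 2-fold procedure can be sketched as follows, assuming t acts as a decision threshold on precomputed is-a scores (the paper's t may instead parameterize the score itself; the 50/50 split, the grid, and the function names here are assumptions):

```python
import numpy as np

def two_fold_threshold_selection(scores, labels, t_grid, seed=0):
    """Pick a threshold t without overfitting the benchmark, in the
    spirit of Levy et al. (2015), section 5.1: tune t on one half of
    the labeled pairs, test on the other half, then swap and average."""
    scores = np.asarray(scores)
    labels = np.asarray(labels, dtype=bool)
    idx = np.random.default_rng(seed).permutation(len(scores))
    fold_a, fold_b = np.array_split(idx, 2)

    def accuracy(subset, t):
        return np.mean((scores[subset] > t) == labels[subset])

    chosen_ts, test_accs = [], []
    for train, test in [(fold_a, fold_b), (fold_b, fold_a)]:
        best_t = max(t_grid, key=lambda t: accuracy(train, t))
        chosen_ts.append(best_t)
        test_accs.append(accuracy(test, best_t))
    return chosen_ts, float(np.mean(test_accs))
```

For example, `two_fold_threshold_selection(scores, labels, np.linspace(0.0, 1.0, 101))` returns the per-fold thresholds and the averaged held-out accuracy.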
Hardware Specification: No
The paper does not specify the hardware used for training or experimentation, such as specific CPU/GPU models, memory, or cloud instance types.
Software Dependencies: No
The paper mentions optimizers like ADAGRAD and RADAGRAD but does not provide specific version numbers for any software dependencies (e.g., Python, TensorFlow, PyTorch, or specific library versions).
Experiment Setup: Yes
"All models were trained for 50 epochs, and unless stated otherwise, on the full corpus of 189,533 word types. ... For the Euclidean baseline as well as for models with h(x) = x^2 we used a learning rate of 0.05. For Poincaré models with h(x) = cosh^2(x) we used a learning rate of 0.01." (a sketch of a Riemannian Adagrad update follows below)
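ADAGRAD above refers to the standard Euclidean optimizer used for the baseline, and RADAGRAD (Riemannian Adagrad) to its Riemannian counterpart used for the hyperbolic models. A minimal sketch of one RADAGRAD-style step on the Poincaré ball of curvature -1 follows; the scalar accumulator and the function names are assumptions, not the authors' implementation.

```python
import numpy as np

def mobius_add(x, y):
    """Mobius addition, the ball's analogue of vector addition."""
    xy, xx, yy = np.dot(x, y), np.dot(x, x), np.dot(y, y)
    num = (1.0 + 2.0 * xy + yy) * x + (1.0 - xx) * y
    return num / (1.0 + 2.0 * xy + xx * yy)

def exp_map(x, v, eps=1e-12):
    """Exponential map at x: move along the geodesic with velocity v."""
    lam = 2.0 / (1.0 - np.dot(x, x))  # conformal factor lambda_x
    n = np.linalg.norm(v)
    if n < eps:
        return x
    return mobius_add(x, np.tanh(lam * n / 2.0) * v / n)

def radagrad_step(x, euc_grad, accum, lr=0.01, eps=1e-10):
    """One Riemannian Adagrad step (sketch). The Riemannian gradient
    rescales the Euclidean one by 1/lambda_x^2, the squared-norm
    accumulator adapts the step size as in Adagrad, and the update is
    applied with the exponential map so x stays inside the ball."""
    lam = 2.0 / (1.0 - np.dot(x, x))
    rgrad = euc_grad / lam ** 2
    accum = accum + np.dot(rgrad, rgrad)
    return exp_map(x, -lr * rgrad / (np.sqrt(accum) + eps)), accum
```

Under this reading, the quoted learning rates plug in directly: lr = 0.05 for the Euclidean baseline and the h(x) = x^2 models, and lr = 0.01 for the Poincaré models with h(x) = cosh^2(x).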