Poincaré Embeddings for Learning Hierarchical Representations

Authors: Maximilian Nickel, Douwe Kiela

NeurIPS 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, we show that our approach can provide high quality embeddings of large taxonomies both with and without missing data. Moreover, we show that embeddings trained on WORDNET provide state-of-the-art performance for lexical entailment. On collaboration networks, we also show that Poincaré embeddings are successful in predicting links in graphs where they outperform Euclidean embeddings, especially in low dimensions.
Researcher Affiliation | Industry | Maximilian Nickel, Facebook AI Research, maxn@fb.com; Douwe Kiela, Facebook AI Research, dkiela@fb.com
Pseudocode | No | The paper provides mathematical derivations and descriptions of the update rule but no explicitly labeled 'Pseudocode' or 'Algorithm' block. (A rough sketch of the Riemannian SGD update appears after this table.)
Open Source Code | No | The paper does not include a statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We conduct experiments on the transitive closure of the WORDNET noun hierarchy [21] [...] We test this property on HYPERLEX [37], which is a gold standard resource [...] We performed our experiments on four commonly used social networks, i.e., ASTROPH, CONDMAT, GRQC, and HEPPH. (A sketch of building the WORDNET transitive closure follows the table.)
Dataset Splits | Yes | To test generalization performance, we split the data into a train, validation and test set by randomly holding out observed links. [...] For evaluation, we split each dataset randomly into train, validation, and test set. The hyperparameters r and t were tuned for each method on the validation set. (See the edge-holdout sketch after the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | First, we initialize all embeddings randomly from the uniform distribution U(-0.001, 0.001). [...] Second, we found that a good initial angular layout can be helpful to find good embeddings. For this reason, we train during an initial 'burn-in' phase with a reduced learning rate η/c. In our experiments, we set c = 10 and the duration of the burn-in to 10 epochs. [...] For training, we randomly sample 10 negative examples per positive example. [...] where r, t > 0 are hyperparameters. Here, r corresponds to the radius around each point u such that points within this radius are likely to have an edge with u. The parameter t specifies the steepness of the logistic function and influences both average clustering as well as the degree distribution [19]. (The link-probability formula these hyperparameters enter is sketched after the table.)
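Since the paper describes the Riemannian SGD update only in prose and equations, here is a minimal NumPy sketch of that update, assuming the standard Poincaré-ball distance and the rescaled-gradient step with projection back into the open unit ball; the variable names and the projection epsilon are our own choices, not the authors'.

```python
import numpy as np

EPS = 1e-5  # keeps updated points strictly inside the open unit ball

def poincare_distance(u, v):
    # d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    sq_dist = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * sq_dist / denom)

def rsgd_step(theta, euclidean_grad, lr):
    # Riemannian SGD on the Poincaré ball: rescale the Euclidean gradient by the
    # inverse metric factor ((1 - ||theta||^2)^2 / 4), take a gradient step, then
    # project back into the ball if the update overshoots the boundary.
    scale = ((1.0 - np.sum(theta ** 2)) ** 2) / 4.0
    theta = theta - lr * scale * euclidean_grad
    norm = np.linalg.norm(theta)
    if norm >= 1.0:
        theta = theta / norm * (1.0 - EPS)
    return theta
```

During the burn-in phase quoted in the experiment-setup row, the same step would simply be called with the learning rate divided by c = 10.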
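The open-datasets row quotes the transitive closure of the WORDNET noun hierarchy as training data. A rough reconstruction of that closure with NLTK might look as follows; this is our own sketch under the assumption that NLTK's WordNet corpus matches the hierarchy used in the paper, not the authors' preprocessing script.

```python
# Requires: pip install nltk, then python -c "import nltk; nltk.download('wordnet')"
from nltk.corpus import wordnet as wn

edges = set()
for synset in wn.all_synsets(pos='n'):
    # closure() walks the hypernym relation transitively, so each noun synset is
    # paired with every ancestor, not just its direct hypernyms.
    for ancestor in synset.closure(lambda s: s.hypernyms()):
        edges.add((synset.name(), ancestor.name()))

print(f"{len(edges)} hypernymy pairs in the transitive closure")
```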
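The dataset-splits row describes randomly holding out observed links to form validation and test sets. A minimal sketch of such a split is below; the split fractions are placeholders, since the exact sizes are not quoted here.

```python
import random

def split_edges(edges, val_frac=0.05, test_frac=0.05, seed=0):
    # Randomly hold out observed links: the held-out edges form the validation
    # and test sets, and everything else is kept for training.
    edges = list(edges)
    random.Random(seed).shuffle(edges)
    n_val = int(len(edges) * val_frac)
    n_test = int(len(edges) * test_frac)
    val, test = edges[:n_val], edges[n_val:n_val + n_test]
    train = edges[n_val + n_test:]
    return train, val, test
```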
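The experiment-setup row quotes the hyperparameters r and t without the surrounding formula. In the paper they enter a Fermi-Dirac-style link probability for the network experiments, P((u, v) = 1) = 1 / (exp((d(u, v) - r) / t) + 1); the sketch below writes that out together with the quoted training constants. Variable names are ours.

```python
import math

def edge_probability(dist, r, t):
    # P((u, v) = 1) = 1 / (exp((d(u, v) - r) / t) + 1): points within radius r of
    # each other are likely to be linked; t controls the steepness of the decay.
    return 1.0 / (math.exp((dist - r) / t) + 1.0)

# Training constants quoted in the experiment-setup row.
INIT_RANGE = 1e-3            # embeddings drawn from U(-0.001, 0.001)
BURN_IN_EPOCHS = 10          # length of the initial burn-in phase
BURN_IN_LR_DIVISOR = 10      # c = 10, i.e. burn-in learning rate η/c
NEGATIVES_PER_POSITIVE = 10  # negative samples per observed edge
```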