Hierarchical Density Order Embeddings
Authors: Ben Athiwaratkun, Andrew Gordon Wilson
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our approach provides state-of-the-art performance on the WORDNET hypernym relationship prediction task and the challenging HYPERLEX lexical entailment dataset while retaining a rich and interpretable probabilistic representation. We show quantitative results on the WORDNET Hypernym prediction task in Section 4.2 and a graded entailment dataset HYPERLEX in Section 4.4. |
| Researcher Affiliation | Academia | Ben Athiwaratkun and Andrew Gordon Wilson, Cornell University, Ithaca, NY 14850, USA |
| Pseudocode | No | The paper describes methods and equations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We make our code publicly available: https://github.com/benathi/density-order-emb |
| Open Datasets | Yes | We have a similar data setup to the experiment by Vendrov et al. (2015), where we use the transitive closure of WORDNET noun hypernym relationships, which contains 82,115 synsets and 837,888 hypernym pairs from 84,427 direct hypernym edges. We obtain the data using the WORDNET API of NLTK version 3.2.1 (Loper & Bird, 2002). (A sketch of this extraction follows the table.) |
| Dataset Splits | Yes | The validation set contains 4000 true hypernym relationships as well as 4000 false hypernym relationships where the false hypernym relationships are constructed from the S1 negative sampling described in Section 3.5. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | Yes | We obtain the data using the WORDNET API of NLTK version 3.2.1 (Loper & Bird, 2002). We use the Adam optimizer (Kingma & Ba, 2014). |
| Experiment Setup | Yes | We use d = 50 as the default dimension... We initialize the mean vectors to have a unit norm and normalize the mean vectors in the training graph. We initialize the diagonal variance components to be all equal to β and optimize in the unconstrained space of log(Σ). We use a minibatch size of 500 true hypernym pairs... We use the Adam optimizer (Kingma & Ba, 2014) and train our model for at most 20 epochs. The hyperparameters are the loss margin m, the initial variance scale β, and the energy threshold γ. (A parameterization sketch follows the table.) |
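As a reading aid, here is a minimal sketch of the data extraction quoted in the Open Datasets row, using the NLTK WordNet API. This is our illustration, not the authors' released code; exact counts depend on the WordNet/NLTK version (the paper used NLTK 3.2.1).

```python
# Hedged sketch: transitive closure of WordNet noun hypernym relationships.
# Not the authors' code; counts vary with the WordNet/NLTK version.
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)  # fetch the WordNet data if missing

synsets = list(wn.all_synsets(pos="n"))  # all noun synsets
direct_edges = set()                     # direct hypernym edges
closure_pairs = set()                    # transitive-closure hypernym pairs

for s in synsets:
    for h in s.hypernyms():
        direct_edges.add((s.name(), h.name()))
    # closure() walks every ancestor reachable through the hypernym relation
    for h in s.closure(lambda x: x.hypernyms()):
        closure_pairs.add((s.name(), h.name()))

print(len(synsets), len(direct_edges), len(closure_pairs))
```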
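The Experiment Setup row describes the parameterization compactly; below is a hedged PyTorch sketch of just the initialization and optimizer it names (unit-norm means, diagonal covariance optimized as log(Σ), Adam). The energy function, the margin loss with margin m, the threshold γ, and the negative-sampling loop are omitted, and the variable names and the β value are ours.

```python
# Hedged sketch of the parameterization in the Experiment Setup row.
# Not the authors' implementation; names and the beta value are illustrative.
import math
import torch

d, n_synsets = 50, 82115  # default dimension; WordNet synset count
beta = 0.1                # initial variance scale (hyperparameter; value assumed)

# Mean vectors initialized to unit norm.
mu = torch.randn(n_synsets, d)
mu = mu / mu.norm(dim=1, keepdim=True)
mu.requires_grad_(True)

# Diagonal variances all equal to beta, optimized in the unconstrained
# space log(Sigma); exp() recovers a positive variance during training.
log_sigma = torch.full((n_synsets, d), math.log(beta), requires_grad=True)

optimizer = torch.optim.Adam([mu, log_sigma])
# Training (omitted): minibatches of 500 true hypernym pairs plus sampled
# negatives, a margin loss with margin m, for at most 20 epochs.
```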