Bernoulli Embeddings for Graphs

Authors: Vinith Misra, Sumit Bhatia

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comparisons performed on five different graph datasets are described in Section 4. Bernoulli embeddings are found to achieve significantly higher test-set mean average precision than a variety of alternative binary embedding options, including various quantizations of DeepWalk vectors (Perozzi, Al-Rfou, and Skiena 2014), Fiedler embeddings (Hendrickson 2007), and several other real-valued embeddings that we ourselves introduce (Table 2).
Researcher Affiliation | Industry | Vinith Misra (Netflix Inc., Los Gatos, CA, USA); Sumit Bhatia (IBM India Research Laboratory, New Delhi, India)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access information (e.g., a specific repository link or explicit statement of code release) for the methodology described.
Open Datasets | Yes | Results are evaluated on five datasets. ... KG (115K entities, 1.3M directed edges) is a knowledge graph extracted from the Wikipedia corpus... Wordnet (82K entities, 232K directed edges)... Slashdot (82K entities, 948K directed edges), Flickr (81K entities, 5.9M undirected edges), and BlogCatalog (10K entities, 334K undirected edges) are standard social graph datasets... The KG dataset is available at http://sumitbhatia.net/source/datasets.html
Dataset Splits | Yes | Five percent of the edges of each dataset are held out for the test set, and the remainder are used in training.
Hardware Specification | Yes | Training a 25-dimensional embedding on the KG dataset (the largest problem we consider) takes roughly 10s per epoch on a 2.2 GHz Intel i7 with 16 GB of RAM... Times reported are averaged over 50 runs on a system running Ubuntu 14.10, with 32 GB RAM and a 16-core 2.3 GHz Intel Xeon processor.
Software Dependencies | No | The paper mentions the operating system 'Ubuntu 14.10' for the experimental setup but does not provide specific version numbers for any other key software components, libraries, or solvers.
Experiment Setup | Yes | The training set loss... is minimized with stochastic gradient descent using the diagonalized AdaGrad update rule (Duchi, Hazan, and Singer 2011). ... validation loss is typically minimized between 30 and 60 epochs.
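The training setup in the last row — an edge/non-edge classification loss minimized by SGD with diagonalized AdaGrad — can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the toy graph, embedding dimension, learning rate, and the symmetric logistic dot-product edge model are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy graph: observed edges (positives) and non-edges (negatives).
n_nodes, dim = 6, 4
pos = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
neg = [(0, 3), (1, 4), (2, 5), (0, 4), (1, 5)]

E = 0.1 * rng.standard_normal((n_nodes, dim))  # one embedding vector per node
G = np.zeros_like(E)                           # AdaGrad squared-gradient sums
lr, eps = 0.5, 1e-8

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for epoch in range(60):
    for (i, j), y in [(e, 1.0) for e in pos] + [(e, 0.0) for e in neg]:
        p = sigmoid(E[i] @ E[j])                 # predicted edge probability
        gi, gj = (p - y) * E[j], (p - y) * E[i]  # logistic-loss gradients
        # Diagonalized AdaGrad: each coordinate gets its own adaptive step
        # size, scaled by the accumulated squared gradient for that parameter.
        G[i] += gi ** 2
        E[i] -= lr * gi / np.sqrt(G[i] + eps)
        G[j] += gj ** 2
        E[j] -= lr * gj / np.sqrt(G[j] + eps)

# After training, observed edges should score higher than non-edges.
pos_score = np.mean([sigmoid(E[i] @ E[j]) for i, j in pos])
neg_score = np.mean([sigmoid(E[i] @ E[j]) for i, j in neg])
print(pos_score > neg_score)
```

The paper's 5%-held-out edge split would simply move some of the positive pairs into a test list that is never touched by the update loop; the binary (Bernoulli) embedding would then be obtained from the trained real-valued vectors.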