Inductive Representation Learning on Large Graphs

Authors: Will Hamilton, Zhitao Ying, Jure Leskovec

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our algorithm outperforms strong baselines on three inductive node-classification benchmarks: we classify the category of unseen nodes in evolving information graphs based on citation and Reddit post data, and we show that our algorithm generalizes to completely unseen graphs using a multi-graph dataset of protein-protein interactions.
Researcher Affiliation | Academia | William L. Hamilton (wleif@stanford.edu), Rex Ying (rexying@stanford.edu), Jure Leskovec (jure@cs.stanford.edu); Department of Computer Science, Stanford University, Stanford, CA, 94305
Pseudocode | Yes | Algorithm 1: GraphSAGE embedding generation (i.e., forward propagation) algorithm
Open Source Code | Yes | Code and links to the datasets: http://snap.stanford.edu/graphsage/
Open Datasets | Yes | We use an undirected citation graph dataset derived from the Thomson Reuters Web of Science Core Collection, corresponding to all papers in six biology-related fields for the years 2000-2005. ... We constructed a graph dataset from Reddit posts made in the month of September, 2014. ... Code and links to the datasets: http://snap.stanford.edu/graphsage/
Dataset Splits | Yes | We train all the algorithms on the 2000-2004 data and use the 2005 data for testing (with 30% used for validation). ... We use the first 20 days for training and the remaining days for testing (with 30% used for validation).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. It only mentions that models were implemented in TensorFlow.
Software Dependencies | No | All models were implemented in TensorFlow [1] with the Adam optimizer [16] (except DeepWalk, which performed better with the vanilla gradient descent optimizer). We used GenSim word2vec implementation [30] and GloVe Common Crawl word vectors [27]. However, specific version numbers for these software components are not explicitly provided.
Experiment Setup | Yes | For all the GraphSAGE variants we used rectified linear units as the non-linearity and set K = 2 with neighborhood sample sizes S1 = 25 and S2 = 10 (see Section 4.4 for sensitivity analyses).
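
To make the reported experiment setup concrete, here is a minimal NumPy sketch of a two-layer forward pass with fixed-size neighbor sampling, a mean aggregator, and ReLU, using K = 2 and sample sizes 25 and 10 as quoted above. It is an illustration under stated assumptions, not the authors' TensorFlow implementation: the function names, the toy graph, the random weights, the choice of the mean aggregator, the self-fallback for isolated nodes, and the per-layer resampling for every node are all assumptions made for this example.

```python
import numpy as np


def sample_neighbors(adj, node, size, rng):
    """Uniformly sample `size` neighbors of `node`.

    Samples without replacement when the node has enough neighbors and
    with replacement otherwise. Isolated nodes fall back to themselves
    (an assumption made for this sketch).
    """
    neighbors = adj[node]
    if len(neighbors) == 0:
        return np.full(size, node)
    replace = len(neighbors) < size
    return rng.choice(neighbors, size=size, replace=replace)


def graphsage_forward(features, adj, weights, sample_sizes, rng):
    """Run one forward pass with len(weights) aggregation layers.

    features:     (num_nodes, d_in) input node features
    adj:          list of neighbor-id lists, one per node
    weights:      one matrix per layer, shape (2 * d_prev, d_next)
    sample_sizes: neighbors sampled per node at each layer
    """
    h = features
    for W, size in zip(weights, sample_sizes):
        new_h = np.empty((h.shape[0], W.shape[1]))
        for v in range(h.shape[0]):
            sampled = sample_neighbors(adj, v, size, rng)
            h_neigh = h[sampled].mean(axis=0)            # mean aggregator
            combined = np.concatenate([h[v], h_neigh])   # self + neighborhood
            new_h[v] = np.maximum(combined @ W, 0.0)     # ReLU non-linearity
        # L2-normalize each embedding, guarding against all-zero rows.
        norms = np.linalg.norm(new_h, axis=1, keepdims=True)
        h = new_h / np.maximum(norms, 1e-12)
    return h


# Toy example: 5 nodes on a small undirected graph (hypothetical data).
rng = np.random.default_rng(0)
adj = [[1, 2], [0, 2, 3], [0, 1], [1, 4], [3]]
features = rng.normal(size=(5, 4))
weights = [rng.normal(size=(8, 8)), rng.normal(size=(16, 3))]

# K = 2 layers with neighborhood sample sizes 25 and 10, as in the setup above.
embeddings = graphsage_forward(features, adj, weights, (25, 10), rng)
print(embeddings.shape)
```

Because no node in the toy graph has 25 or 10 neighbors, every sample here is drawn with replacement; on a large graph, high-degree nodes would be subsampled instead, which is what keeps the per-node cost fixed.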