GCN meets GPU: Decoupling “When to Sample” from “How to Sample”

Authors: Morteza Ramezani, Weilin Cong, Mehrdad Mahdavi, Anand Sivasubramaniam, Mahmut Kandemir

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We also conduct extensive numerical experiments on different large-scale graph datasets and different sampling methods to corroborate our theoretical findings, and demonstrate the practical efficacy of the proposed algorithm over competitive baselines. Overall, our empirical results demonstrate that LAZYGCN can significantly reduce the number of sampling steps and yield superior speedup without compromising the accuracy.
Researcher Affiliation | Academia | Morteza Ramezani, Pennsylvania State University, morteza@cse.psu.edu; Weilin Cong, Pennsylvania State University, wxc272@psu.edu; Mehrdad Mahdavi, Pennsylvania State University, mzm616@psu.edu; Anand Sivasubramaniam, Pennsylvania State University, anand@cse.psu.edu; Mahmut T. Kandemir, Pennsylvania State University, kandemir@cse.psu.edu
Pseudocode | Yes | Algorithm 1: LAZYGCN training algorithm (a hedged training-loop sketch is given after this table).
Open Source Code | No | The paper does not provide a specific link or explicit statement about the release of its source code.
Open Datasets | Yes | We evaluate the effectiveness of LAZYGCN under the inductive supervised setting on the following real-world datasets: Pubmed, PPI-Large, Flickr, Reddit, Yelp, and Amazon. Detailed information on these datasets is summarized in Table 1.
Dataset Splits | Yes | Table 1: Summary of dataset statistics (multi-label datasets are marked in the original table). Pubmed: 19,717 nodes, 44,338 edges, degree 3, 500 features, 3 classes, train/validation/test split 92%/3%/5%. PPI-Large: 56,944 nodes, 1,612,348 edges, degree 15, 50 features, 121 classes, split 66%/12%/22%. Flickr: 89,250 nodes, 899,756 edges, degree 10, 500 features, 7 classes, split 50%/25%/25%. Reddit: 232,965 nodes, 11,606,919 edges, degree 50, 602 features, 41 classes, split 66%/10%/24%. Yelp: 716,847 nodes, 13,954,819 edges, degree 19, 300 features, 100 classes, split 75%/15%/10%. Amazon: 1,598,960 nodes, 264,339,468 edges, degree 124, 200 features, 107 classes, split 78%/5%/15%.
Hardware Specification | Yes | For instance, the memory capacity on a very recent GPU card, such as NVIDIA Tesla V100, is at most 32 GB.
Software Dependencies | No | We implemented all these algorithms alongside LAZYGCN, using PyTorch [22] and PyTorch Geometric [10] for sparse matrix operations.
Experiment Setup | Yes | All our experiments are conducted using a 3-layer GCN with a hidden dimension of 512 and the Adam optimizer with a learning rate of 10^-3. Test and validation accuracies (F1 score) are obtained by running the full-batch GCN. For nodewise sampling we sample 5 neighbors, for layerwise we use a sample size of 512, and for subgraph sampling we use a sample size equal to the mini-batch size. For LAZYGCN training, we used fixed R = 2 and ρ = 1.1 unless otherwise stated. (A hedged configuration sketch also follows this table.)
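The Pseudocode row above only names Algorithm 1; the algorithm itself is not reproduced on this page. Below is a minimal sketch of the recycling idea as described by the rows above, assuming that each freshly sampled computation graph is reused for a number of inner gradient steps that starts at R and grows by the factor ρ per outer round. The sampler methods (`sample_computation_graph`, `subsample_minibatch`) and the loop structure are hypothetical placeholders, not the authors' code.

```python
# Hedged sketch of a LazyGCN-style training loop: sample periodically
# ("when to sample"), keep the sampler itself unchanged ("how to sample"),
# and recycle each sampled computation graph for several inner SGD steps.
import math

def lazy_train(model, graph, sampler, optimizer, loss_fn,
               num_outer_rounds, R=2, rho=1.1):
    for i in range(num_outer_rounds):
        # Fresh sampling happens only once per outer round.
        mega_batch = sampler.sample_computation_graph(graph)  # hypothetical API

        # Number of recycling steps, assumed to grow geometrically with rho.
        inner_steps = max(1, math.floor(R * rho ** i))
        for _ in range(inner_steps):
            # Reuse the cached computation graph instead of re-sampling the graph.
            x, adj, y = sampler.subsample_minibatch(mega_batch)  # hypothetical API
            optimizer.zero_grad()
            loss = loss_fn(model(x, adj), y)
            loss.backward()
            optimizer.step()
    return model
```

Likewise, a minimal sketch of the reported experiment setup (3-layer GCN, hidden dimension 512, Adam with learning rate 10^-3). The layer type (PyTorch Geometric's GCNConv), the ReLU activations, and the Pubmed dimensions in the usage example are assumptions drawn from the rows above, not stated implementation details.

```python
# Hedged sketch of the reported training configuration.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCN3(torch.nn.Module):
    """3-layer GCN with hidden dimension 512, as reported in the Experiment Setup row."""
    def __init__(self, in_dim, num_classes, hidden_dim=512):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)
        self.conv3 = GCNConv(hidden_dim, num_classes)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        x = F.relu(self.conv2(x, edge_index))
        return self.conv3(x, edge_index)

# Example dimensions taken from the Pubmed row of Table 1 (500 features, 3 classes).
model = GCN3(in_dim=500, num_classes=3)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```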
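The two sketches are meant to be read together: the configuration block builds the model and optimizer, and the training loop above it would consume them with whichever sampler (nodewise, layerwise, or subgraph) is being evaluated.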