Faster Graph Embeddings via Coarsening

Authors: Matthew Fahrbach, Gramoz Goranci, Richard Peng, Sushant Sachdeva, Chi Wang

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we investigate how the vertex sparsifiers SCHURCOMPLEMENT and RANDOMCONTRACTION affect the predictive performance of graph embeddings for two different learning tasks."
Researcher Affiliation | Collaboration | ¹Google Research, New York, NY, USA; ²Department of Computer Science, University of Toronto, Toronto, Canada; ³Department of Computer Science, Georgia Institute of Technology, Atlanta, GA, USA; ⁴Microsoft Research, Redmond, WA, USA.
Pseudocode | Yes | Algorithm 1: SCHURCOMPLEMENT
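As a rough illustration of the kind of operation Algorithm 1 performs, the sketch below eliminates a single vertex from a graph Laplacian via one Schur complement step, which replaces the vertex with a weighted clique on its neighbors. This is a minimal dense-matrix sketch, not the paper's implementation; the function name and setup are illustrative.

```python
import numpy as np

def eliminate_vertex(L, v):
    """One Schur complement step: remove vertex v from Laplacian L.

    Returns the Laplacian of the coarsened graph on the remaining
    vertices; v's neighbors become a weighted clique.
    """
    keep = [i for i in range(L.shape[0]) if i != v]
    L_kk = L[np.ix_(keep, keep)]
    L_kv = L[np.ix_(keep, [v])]
    L_vv = L[v, v]
    # Schur complement: L_kk - L_kv * L_vv^{-1} * L_vk
    return L_kk - (L_kv @ L_kv.T) / L_vv

# Path graph 0-1-2 with unit edge weights: eliminating the middle
# vertex leaves a single edge of weight 1/2 (series resistance).
L = np.array([[ 1., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  1.]])
print(eliminate_vertex(L, 1))  # [[ 0.5 -0.5] [-0.5  0.5]]
```

Eliminating low-degree vertices first keeps the resulting clique small, which is why degree-based elimination orderings are the natural choice here.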
Open Source Code | No | The paper does not include an explicit statement or link indicating the release of source code for the methodology described.
Open Datasets | Yes | "Datasets. The networks we consider and their statistics are listed in Table 1. BlogCatalog (Tang & Liu, 2009) models the social relationships of online bloggers... Flickr (Tang & Liu, 2009) is a network of user contacts... YouTube (Yang & Leskovec, 2015) is a social network..." and "Datasets. We build on the experimental framework in node2vec (Grover & Leskovec, 2016) and evaluate our vertex sparsification algorithms on the following datasets: Facebook (Leskovec & Krevl, 2014), arXiv ASTRO-PH (Leskovec & Krevl, 2014), and Protein-Protein Interaction (PPI) (Stark et al., 2006)."
Dataset Splits | Yes | "For the BlogCatalog experiments, we vary the training ratio from 10% to 90%, and for Flickr and YouTube we vary the training ratio from 1% to 10%. In all instances, we perform 10-fold cross validation and evaluate the prediction accuracy in terms of the mean Micro-F1 and Macro-F1 scores."
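The evaluation protocol described above (varying training ratio, averaging Micro-F1 and Macro-F1 over repeated splits) can be sketched with scikit-learn. The embeddings and labels below are synthetic placeholders, and repeated random splits stand in for the paper's exact cross-validation procedure; all names here are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import ShuffleSplit
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 128))           # placeholder embeddings, d = 128
y = (X[:, :5].sum(axis=1) > 0).astype(int)    # placeholder labels

def evaluate(X, y, train_ratio, n_splits=10):
    """Mean Micro-/Macro-F1 over n_splits random splits at a training ratio."""
    micro, macro = [], []
    splitter = ShuffleSplit(n_splits=n_splits, train_size=train_ratio,
                            random_state=0)
    for tr, te in splitter.split(X):
        clf = LogisticRegression(max_iter=1000).fit(X[tr], y[tr])
        pred = clf.predict(X[te])
        micro.append(f1_score(y[te], pred, average="micro"))
        macro.append(f1_score(y[te], pred, average="macro"))
    return np.mean(micro), np.mean(macro)

for ratio in (0.1, 0.5, 0.9):
    print(ratio, evaluate(X, y, ratio))
```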
Hardware Specification | Yes | "All of our experiments are performed on a Linux virtual machine with 64 Xeon E5-2673 v4 virtual CPUs (2.30GHz), 432GB of memory, and a 1TB hard disk drive."
Software Dependencies | No | The paper mentions software such as NetMF, LINE, DeepWalk, and scikit-learn, but does not provide specific version numbers for any of them.
Experiment Setup | Yes | "In all of our experiments, we use the minimum degree threshold = 30 for the SCHURCOMPLEMENT and RANDOMCONTRACTION algorithms. We choose the conventional target dimension of d = 128 for all graph embeddings. For LINE embeddings, we run NetMF with window size W = 1. For DeepWalk embeddings, we run NetMF with the window size W = 10 in the BlogCatalog and Flickr experiments, and we use the window size W = 2 for the YouTube network because of the density of the resulting random walk matrix."
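To make the role of the window size W and target dimension d concrete, the following is a dense toy sketch of the NetMF factorization (Qiu et al., 2018): sum the first W powers of the random-walk matrix, apply an element-wise truncated logarithm, and take a rank-d SVD. It assumes a small graph held as a dense numpy array and is not the paper's code; with W = 1 the factorized matrix corresponds to the LINE objective, as the setup above notes.

```python
import numpy as np

def netmf_embedding(A, window=10, dim=128, neg=1):
    """Toy NetMF: factorize a log-filtered random-walk proximity matrix.

    A: dense symmetric adjacency matrix with no isolated vertices.
    """
    deg = A.sum(axis=1)
    vol = deg.sum()
    D_inv = np.diag(1.0 / deg)
    P = D_inv @ A                       # random-walk transition matrix
    S = np.zeros_like(A)
    Pr = np.eye(A.shape[0])
    for _ in range(window):             # sum_{r=1}^{W} P^r
        Pr = Pr @ P
        S += Pr
    M = (vol / (neg * window)) * (S @ D_inv)
    logM = np.log(np.maximum(M, 1.0))   # element-wise truncated log
    U, s, _ = np.linalg.svd(logM)
    d = min(dim, A.shape[0])
    return U[:, :d] * np.sqrt(s[:d])    # rank-d embedding

# Tiny example: a 6-node cycle, with the small window W = 2
# used for the densest network in the setup above.
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
emb = netmf_embedding(A, window=2, dim=4)
print(emb.shape)  # (6, 4)
```

A larger W densifies the summed random-walk matrix, which is consistent with the paper's choice of a smaller window for the densest network.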