Faster Graph Embeddings via Coarsening
Authors: Matthew Fahrbach, Gramoz Goranci, Richard Peng, Sushant Sachdeva, Chi Wang
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we investigate how the vertex sparsifiers SCHURCOMPLEMENT and RANDOMCONTRACTION affect the predictive performance of graph embeddings for two different learning tasks. |
| Researcher Affiliation | Collaboration | (1) Google Research, New York, NY, USA; (2) Department of Computer Science, University of Toronto, Toronto, Canada; (3) Department of Computer Science, Georgia Institute of Technology, GA, USA; (4) Microsoft Research, Redmond, WA, USA. |
| Pseudocode | Yes | Algorithm 1: SCHURCOMPLEMENT |
| Open Source Code | No | The paper does not include an explicit statement or link indicating the release of source code for the methodology described. |
| Open Datasets | Yes | Datasets. The networks we consider and their statistics are listed in Table 1. BlogCatalog (Tang & Liu, 2009) models the social relationships of online bloggers... Flickr (Tang & Liu, 2009) is a network of user contacts... YouTube (Yang & Leskovec, 2015) is a social network... and Datasets. We build on the experimental framework in node2vec (Grover & Leskovec, 2016) and evaluate our vertex sparsification algorithms on the following datasets: Facebook (Leskovec & Krevl, 2014), arXiv ASTRO-PH (Leskovec & Krevl, 2014), and Protein-Protein Interaction (PPI) (Stark et al., 2006). |
| Dataset Splits | Yes | For the BlogCatalog experiments, we vary the training ratio from 10% to 90%, and for Flickr and YouTube we vary the training ratio from 1% to 10%. In all instances, we perform 10-fold cross validation and evaluate the prediction accuracy in terms of the mean Micro-F1 and Macro-F1 scores. |
| Hardware Specification | Yes | All of our experiments are performed on a Linux virtual machine with 64 Xeon E5-2673 v4 virtual CPUs (2.30GHz), 432GB of memory, and a 1TB hard disk drive. |
| Software Dependencies | No | The paper mentions software like NetMF, LINE, DeepWalk, and scikit-learn, but does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | In all of our experiments, we use the minimum degree threshold = 30 for the SCHURCOMPLEMENT and RANDOMCONTRACTION algorithms. We choose the conventional target dimension of d = 128 for all graph embeddings. For LINE embeddings, we run NetMF with window size W = 1. For DeepWalk embeddings, we run NetMF with the window size W = 10 in the BlogCatalog and Flickr experiments, and we use the window size W = 2 for the YouTube network because of the density of the resulting random walk matrix. |
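
The Pseudocode row above points to Algorithm 1 (SCHURCOMPLEMENT). As a rough illustration of the elimination step it is built on, the sketch below computes the exact Schur complement of a graph Laplacian onto a set of terminal vertices via Gaussian elimination. The paper's algorithm additionally sparsifies the clique created by each elimination via random edge sampling and eliminates vertices in minimum-degree order up to the degree threshold of 30; those steps are omitted here, and the function name and the use of numpy/networkx are assumptions rather than details from the paper.

```python
import numpy as np
import networkx as nx

def schur_complement_onto_terminals(G, terminals):
    """Exact Schur complement of the Laplacian of G onto `terminals`.

    Minimal sketch: eliminates every non-terminal vertex with a rank-one
    (Gaussian elimination) update. No clique sampling, so the result is a
    dense Laplacian on the terminals rather than a sparse one.
    """
    nodes = list(G.nodes())
    index = {v: i for i, v in enumerate(nodes)}
    L = nx.laplacian_matrix(G, nodelist=nodes).toarray().astype(float)

    eliminate = [index[v] for v in nodes if v not in set(terminals)]
    keep = sorted(set(range(len(nodes))) - set(eliminate))

    for v in eliminate:
        pivot = L[v, v]
        if pivot <= 1e-12:  # vertex already disconnected from the rest
            continue
        # L <- L - L[:, v] L[v, :] / L[v, v]; this zeroes row/column v.
        L -= np.outer(L[:, v], L[v, :]) / pivot

    # Principal submatrix on the terminals = Laplacian of the sparsified graph.
    return L[np.ix_(keep, keep)], [nodes[i] for i in keep]
```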
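
The Dataset Splits row describes multi-label node classification at a range of training ratios, reported as mean Micro-F1 and Macro-F1 over repeated splits. Below is a minimal sketch of that style of evaluation, assuming a one-vs-rest logistic regression classifier on the embedding vectors; the classifier choice and scikit-learn settings are assumptions, not taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import ShuffleSplit
from sklearn.multiclass import OneVsRestClassifier

def evaluate_embedding(X, Y, train_ratio=0.1, n_splits=10, seed=0):
    """Mean Micro-F1 / Macro-F1 over `n_splits` random train/test splits.

    X: (n_nodes, d) embedding matrix.
    Y: (n_nodes, n_labels) binary label-indicator matrix; assumes every
       label has at least one positive example in each training split.
    """
    splitter = ShuffleSplit(n_splits=n_splits, train_size=train_ratio,
                            random_state=seed)
    micro, macro = [], []
    for train_idx, test_idx in splitter.split(X):
        clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
        clf.fit(X[train_idx], Y[train_idx])
        pred = clf.predict(X[test_idx])
        micro.append(f1_score(Y[test_idx], pred, average="micro"))
        macro.append(f1_score(Y[test_idx], pred, average="macro"))
    return float(np.mean(micro)), float(np.mean(macro))
```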
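
The Experiment Setup row fixes the embedding dimension d = 128 and runs NetMF with window sizes W = 1 (LINE-style) and W = 10 or W = 2 (DeepWalk-style). For context, here is a minimal sketch of the exact, dense NetMF factorization (Qiu et al., 2018) under those parameters; it is only practical for small graphs, the negative-sampling parameter `neg` is an assumption, and the NetMF implementation the authors used may differ in its details.

```python
import numpy as np
import networkx as nx

def netmf_embedding(G, dim=128, window=10, neg=1):
    """Exact NetMF-style embedding: rank-`dim` SVD of the truncated log of
    the random-walk co-occurrence matrix. window=1 corresponds to the
    LINE-style matrix; window=10 (or 2) to the DeepWalk-style matrix."""
    nodes = list(G.nodes())
    A = nx.to_numpy_array(G, nodelist=nodes)
    deg = A.sum(axis=1)
    vol = deg.sum()
    D_inv = np.diag(1.0 / np.maximum(deg, 1e-12))
    P = D_inv @ A                        # random-walk transition matrix

    # M = vol / (neg * window) * (sum_{r=1..window} P^r) D^{-1}
    S = np.zeros_like(A)
    P_r = np.eye(len(nodes))
    for _ in range(window):
        P_r = P_r @ P
        S += P_r
    M = (vol / (neg * window)) * (S @ D_inv)

    # Element-wise truncated logarithm, then a rank-`dim` SVD.
    log_M = np.log(np.maximum(M, 1.0))
    U, sigma, _ = np.linalg.svd(log_M, full_matrices=False)
    return U[:, :dim] * np.sqrt(sigma[:dim]), nodes
```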