GraphZoom: A Multi-level Spectral Approach for Accurate and Scalable Graph Embedding
Authors: Chenhui Deng, Zhiqiang Zhao, Yongyu Wang, Zhiru Zhang, Zhuo Feng
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We have evaluated our approach on a number of popular graph datasets for both transductive and inductive tasks. Our experiments show that GraphZoom can substantially increase the classification accuracy and significantly accelerate the entire graph embedding process by up to 40.8×, when compared to the state-of-the-art unsupervised embedding methods. |
| Researcher Affiliation | Academia | Chenhui Deng, Cornell University, Ithaca, USA (cd574@cornell.edu); Zhiqiang Zhao, Michigan Technological University, Houghton, USA (qzzhao@mtu.edu); Yongyu Wang, Michigan Technological University, Houghton, USA (yongyuw@mtu.edu); Zhiru Zhang, Cornell University, Ithaca, USA (zhiruz@cornell.edu); Zhuo Feng, Stevens Institute of Technology, Hoboken, USA (zfeng12@stevens.edu) |
| Pseudocode | Yes | Algorithm 1: GraphZoom algorithm (a hedged sketch of the multi-level flow is given after this table) |
| Open Source Code | Yes | Source code of GraphZoom is freely available at: github.com/cornell-zhang/GraphZoom |
| Open Datasets | Yes | We include Cora, Citeseer, Pubmed, and Friendster for evaluation on transductive learning tasks, and PPI as well as Reddit for inductive learning. We follow the experiment setup in Yang et al. (2016) for three standard citation network benchmark datasets: Cora, Citeseer, and Pubmed. |
| Dataset Splits | No | We split the training and testing data in the same way as suggested in Kipf & Welling (2016); Hamilton et al. (2017). We allow only 20 labels per class for training and 1,000 labeled nodes for testing. We use 60% of nodes for training and 40% for testing on PPI, and 65% for training and 35% for testing on Reddit. |
| Hardware Specification | Yes | We run all the experiments on a Linux machine with an Intel Xeon Gold 6242 CPU (32 cores @ 2.40GHz) and 384 GB of RAM. |
| Software Dependencies | No | The paper mentions various models and frameworks (e.g., DeepWalk, node2vec, DGI, GraphSAGE; the code release link implies PyTorch is used), but does not specify version numbers for any software libraries or dependencies used for the experiments. |
| Experiment Setup | Yes | In regard to hyperparameters, we use 10 walks with a walk length of 80, a window size of 10, and an embedding dimension of 128 for both DeepWalk and node2vec; we further set the return parameter p and the in-out parameter q in node2vec to 1.0 and 0.5, respectively. Moreover, we choose an early stopping strategy for DGI with a learning rate of 0.001 and an embedding dimension of 512. Apropos the configuration of GraphSAGE, we train a two-layer model for one epoch, with a learning rate of 0.00001, an embedding dimension of 128, and a batch size of 256. (These settings are collected into a configuration sketch after this table.) |
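
For the pseudocode row above, the following is a minimal sketch of the multi-level flow that Algorithm 1 describes: coarsen the graph over several levels, run a base embedding on the coarsest graph, then map the embeddings back level by level while smoothing them with a low-pass graph filter. The greedy edge-matching coarsener, the toy spectral `base_embed`, and all function names below are illustrative placeholders under those assumptions, not the authors' released implementation.

```python
# Minimal sketch of a GraphZoom-style multi-level embedding pipeline.
# coarsen() and base_embed() are toy placeholders, not the authors' code.
import numpy as np
import scipy.sparse as sp

def coarsen(A):
    """Greedily pair each node with one unmatched neighbor; return (A_coarse, H)."""
    n = A.shape[0]
    A_csr = A.tocsr()
    cluster = -np.ones(n, dtype=int)
    next_id = 0
    for u in range(n):
        if cluster[u] >= 0:
            continue
        cluster[u] = next_id
        for v in A_csr.indices[A_csr.indptr[u]:A_csr.indptr[u + 1]]:
            if cluster[v] < 0:          # merge u with one unmatched neighbor
                cluster[v] = next_id
                break
        next_id += 1
    # H maps coarse nodes to fine nodes: H[c, u] = 1 if fine node u is in cluster c
    H = sp.csr_matrix((np.ones(n), (cluster, np.arange(n))), shape=(next_id, n))
    return H @ A @ H.T, H

def base_embed(A, dim=16):
    """Toy spectral embedding of the (small) coarsest graph (placeholder)."""
    d = np.asarray(A.sum(axis=1)).ravel() + 1e-12
    L = sp.diags(d) - A                 # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L.toarray())
    return vecs[:, 1:dim + 1]

def graphzoom_sketch(A, num_levels=2, filter_power=2, dim=16):
    graphs, mappings = [A], []
    for _ in range(num_levels):                     # 1) multi-level coarsening
        A_c, H = coarsen(graphs[-1])
        graphs.append(A_c)
        mappings.append(H)
    E = base_embed(graphs[-1], dim)                 # 2) embed the coarsest graph
    for A_f, H in zip(reversed(graphs[:-1]), reversed(mappings)):
        E = H.T @ E                                 # 3) map back to the finer level
        d = np.asarray(A_f.sum(axis=1)).ravel() + 1e-12
        P = sp.diags(1.0 / d) @ A_f                 # row-normalized adjacency
        for _ in range(filter_power):               # 4) low-pass smoothing (refinement)
            E = P @ E
    return np.asarray(E)                            # embeddings for all original nodes
```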
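
For the experiment-setup row, the quoted baseline hyperparameters can be gathered into a single configuration block when re-running the comparison. The dictionary layout and key names below are our own; only the values are taken from the quoted text.

```python
# Baseline hyperparameters as reported in the experiment-setup row.
BASELINE_CONFIGS = {
    "deepwalk": {"num_walks": 10, "walk_length": 80, "window_size": 10, "dim": 128},
    "node2vec": {"num_walks": 10, "walk_length": 80, "window_size": 10, "dim": 128,
                 "p": 1.0, "q": 0.5},
    "dgi": {"lr": 1e-3, "dim": 512, "early_stopping": True},
    "graphsage": {"num_layers": 2, "epochs": 1, "lr": 1e-5, "dim": 128, "batch_size": 256},
}
```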