Exphormer: Sparse Transformers for Graphs
Authors: Hamed Shirzad, Ameya Velingker, Balaji Venkatachalam, Danica J. Sutherland, Ali Kemal Sinop
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate the empirical performance of graph transformer models based on EXPHORMER on a wide variety of graph datasets with graph prediction and node prediction tasks (...). We present experiments on fifteen benchmark datasets (...). In summary, our experiments show that: (a) EXPHORMER achieves SOTA performance on a variety of datasets, (b) EXPHORMER consistently outperforms other sparse attention mechanisms while often surpassing dense transformers using fewer parameters, and (c) EXPHORMER successfully allows GraphGPS to overcome memory bottlenecks and scale to larger graphs (on >10,000 nodes) while still providing competitive performance. |
| Researcher Affiliation | Collaboration | 1Department of Computer Science, University of British Columbia, Vancouver, BC, Canada 2Google Research, Mountain View, California, USA 3Google, Mountain View, California, USA 4Alberta Machine Intelligence Institute, Edmonton, Alberta, Canada. Correspondence to: Hamed Shirzad <shirzad@cs.ubc.ca>, Ameya Velingker <ameyav@google.com>, Balaji Venkatachalam <bave@google.com>. |
| Pseudocode | No | No structured pseudocode or algorithm blocks are present. Algorithms are described in natural language within the text, for example, "The process we use to construct a random expander graph is described below. (...) Pick d/2 permutations π_1, π_2, ..., π_{d/2} on V, each π_i chosen independently and uniformly among all n! permutations. Then, choose E = {(i, π_j(i)), (i, π_j^{-1}(i)) : j ∈ [d/2], i ∈ [n]}". A minimal code sketch of this construction is given after the table. |
| Open Source Code | Yes | Codes can be found at https://github.com/hamed1375/Exphormer. |
| Open Datasets | Yes | In this section, we evaluate the empirical performance of graph transformer models based on EXPHORMER on a wide variety of graph datasets with graph prediction and node prediction tasks (Dwivedi et al., 2020; Hu et al., 2020; Freitas et al., 2021; Shchur et al., 2018; Namata et al., 2012). In particular, we present experiments on fifteen benchmark datasets, including image-based graph datasets (CIFAR10, MNIST, PascalVOC-SP, COCO-SP), synthetic SBM datasets (PATTERN, CLUSTER), code graph datasets (MalNet-Tiny), and molecular datasets (Peptides-Func, Peptides-Struct, PCQM-Contact). Moreover, we demonstrate the ability of EXPHORMER to allow graph transformers to scale to larger graphs (with more than 5,000 nodes) by including results on five transductive graph datasets consisting of citation networks (CS, Physics, ogbn-arxiv) and co-purchasing networks (Computer, Photo). |
| Dataset Splits | Yes | Moreover, the nodes of the citation graph are split into 90K training nodes, 30K validation nodes, and 48K test nodes. |
| Hardware Specification | Yes | For instance, on the MalNet-Tiny dataset, GraphGPS with a full transformer runs out of memory (on a 40GB NVIDIA A100 GPU) with a batch size of just 16; as we shall see, our techniques allow us to extend the batch size to 256 without any memory issues. |
| Software Dependencies | No | No specific version numbers for software dependencies are provided. The paper mentions frameworks like "GraphGPS", "PyTorch", "Performer", and "Big Bird", but without accompanying version details. |
| Experiment Setup | Yes | Our hyperparameter choices, including the optimizer, positional encodings, and structural encodings, were guided by the instructions in GraphGPS (Rampášek et al., 2022). (...) To make fair comparisons, we used a similar parameter-budget to GraphGPS. For PATTERN and CLUSTER, we used a parameter-budget of 500K, and for CIFAR and MNIST, we used a parameter-budget of around 100K. See details in Tables 7 to 9. (Tables 7, 8, and 9 provide specific values for Num Layers, Hidden Dim, Num Heads, Dropout, Batch Size, Learning Rate, Num Epochs, Expander Degree, Num Virtual Nodes, etc.) |
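
The expander construction quoted in the Pseudocode row lends itself to a short sketch. The snippet below is a minimal, illustrative implementation of that permutation-based construction, not code from the Exphormer repository; the function name `random_expander_edges` and its signature are assumptions. It samples d/2 uniformly random permutations of the node set and, for each node i and permutation π_j, adds the edges (i, π_j(i)) and (i, π_j^{-1}(i)), giving an (approximately) d-regular random expander.

```python
import numpy as np


def random_expander_edges(n: int, d: int, seed: int | None = None) -> set[tuple[int, int]]:
    """Sketch of the permutation-based expander construction described in the paper.

    Samples d/2 uniformly random permutations pi_1, ..., pi_{d/2} of [n] and
    returns the edge set {(i, pi_j(i)), (i, pi_j^{-1}(i)) : j in [d/2], i in [n]}.
    Name and signature are illustrative, not taken from the Exphormer codebase.
    """
    assert d % 2 == 0, "degree d must be even"
    rng = np.random.default_rng(seed)
    edges: set[tuple[int, int]] = set()
    for _ in range(d // 2):
        perm = rng.permutation(n)      # pi_j, a uniformly random permutation of [n]
        inv = np.argsort(perm)         # pi_j^{-1}
        for i in range(n):
            edges.add((i, int(perm[i])))  # edge (i, pi_j(i))
            edges.add((i, int(inv[i])))   # edge (i, pi_j^{-1}(i))
    return edges


# Example: an approximately 6-regular expander on 1,000 nodes
edges = random_expander_edges(n=1000, d=6, seed=0)
```

As in the quoted description, the construction may occasionally produce self-loops or repeated pairs, so the resulting graph is only approximately d-regular; the returned edge set can then be attached as the sparse attention pattern on top of the input graph's own edges.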